Commit Graph

39 Commits

Author SHA1 Message Date
Eric Engestrom
5a5b00cfca ci: drop unneeded printing of pass/fail alongside the exit_code
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35214>
2025-05-29 07:29:25 +00:00
Guilherme Gallo
e9e98d997d ci/lava: Parametrize message burst length on unit tests
We can have jobs with lower job timeout values, given by the
CI_JOB_TIMEOUT environment variable, such as the pytest ones.

The previously hardcoded burst length of 1000 messages at a simulated
rate of 1 msg/sec caused tests to exceed these timeouts and fail
unexpectedly on specific job timeouts.

Reported-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34907>
2025-05-19 22:44:21 +00:00
Guilherme Gallo
f4301626cd ci/lava: Uprev freezegun
The former version had some bugs when running fixtures in parallel
testing.
With the new version, we need to change the side effects on mocks a
little bit and pin a start time to make the tests reproducible.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32163>
2024-11-21 04:10:52 +00:00
Guilherme Gallo
b2c2f0d187 ci/lava: Set default exit code to 1 for failed jobs
Sets the default exit code to 1 to ensure the GitLab job fails when the
LAVA job fails or is interrupted. Adds tests to verify the exit code is
correctly set based on the logs or the lack of them (unexpected
finishing: timeouts and canceling).

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32163>
2024-11-21 04:10:52 +00:00
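Note on b2c2f0d187: the "fail unless the logs say otherwise" idea can be pictured with a minimal sketch. The `exit_code: N` line format and the function name are assumptions made for illustration; the submitter's real pattern is not shown in this log.

```python
import re
from typing import Iterable

# Hypothetical pattern: the real result line format is not shown in this log.
EXIT_CODE_RE = re.compile(r"exit_code: (\d+)")


def job_exit_code(log_lines: Iterable[str], default: int = 1) -> int:
    """Return the last exit code reported in the logs, or `default` if none."""
    exit_code = default
    for line in log_lines:
        match = EXIT_CODE_RE.search(line)
        if match:
            exit_code = int(match.group(1))
    return exit_code
```

With no log lines at all (a timeout or a canceled job), the function falls back to 1, so the calling job fails instead of passing silently.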
Daniel Stone
f44970173d ci/lava: Provide list of overlays to submitter
Instead of providing a hardcoded set of arguments, allow overlays to be
added to the submitter script. Passing Python dicts as a string
representation and relying on coercion from strings is far from great,
but fire doesn't give anything else, so.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Co-authored-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31882>
2024-10-31 18:00:27 +00:00
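Note on f44970173d: the coercion from string to dict that the message complains about can be sketched with the standard library. This does not reproduce fire's own argument parsing; `parse_overlays` and the dict keys used in the usage note are hypothetical.

```python
import ast


def parse_overlays(raw_values):
    """Coerce string-encoded dicts into real dicts, rejecting anything else."""
    overlays = []
    for raw in raw_values:
        # literal_eval only accepts Python literals, so no code is executed.
        value = ast.literal_eval(raw) if isinstance(raw, str) else raw
        if not isinstance(value, dict):
            raise TypeError(f"overlay must be a dict, got {type(value).__name__}")
        overlays.append(value)
    return overlays
```

For example, `parse_overlays(["{'name': 'kernel', 'path': '/'}"])` yields a list with one dict, while a string such as `"[1, 2]"` is rejected with a `TypeError`.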
Daniel Stone
9be46b29f0 ci/lava: Print relative timestamps in sections
Follow what the shell executor does and print the time since the job
started in the section header.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
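Note on 9be46b29f0: a relative timestamp in a section header boils down to formatting the time elapsed since the job started. The `[mm:ss]` layout and the function name below are assumptions; the exact format used by the submitter is not shown in this log.

```python
from datetime import datetime


def section_header(title: str, job_start: datetime, now: datetime) -> str:
    """Prefix a section title with the time elapsed since the job started."""
    elapsed_seconds = int((now - job_start).total_seconds())
    minutes, seconds = divmod(elapsed_seconds, 60)
    return f"[{minutes:02d}:{seconds:02d}] {title}"
```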
Daniel Stone
8ee6241a8c ci/hw: Wrap pre-test setup in collapsed section
Most people don't care about environment variables and starting Weston.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Vignesh Raman
9b762a3caf ci/lava: update unit tests
Update unit tests to handle exit code in HWCI result output.

Co-developed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31189>
2024-09-20 10:29:39 +00:00
Daniel Stone
6608b5ee46 ci/lava: Fix pytest not passing farm value
This was throwing an exception as a required argument was missing.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30978>
2024-09-09 16:27:07 +00:00
Guilherme Gallo
e96e25f323 ci/lava: Don't run jobs if the remaining execution time is too short
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
2024-04-22 21:20:07 +00:00
Guilherme Gallo
5363874676 ci/lava: A few formatting cleanups
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
2024-04-22 21:20:07 +00:00
Guilherme Gallo
41cd32d10e ci/lava: Broader R8152 error handling
The r8152 error detection is now considering any order of the known
patterns to detect variations of the r8152 issues during the test phase.
This includes a small refactoring for eventual new issues.

Additionally, adjusted the timing for setting the `start_time` in
`test_lava_job_submitter.py` to ensure consistency and reliability in
test execution, aligning the start time closer to the job submission
process.

With this fix, the bad state shown in the following job will be
detected:
https://gitlab.freedesktop.org/drm/msm/-/jobs/55033953

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27688>
2024-02-20 00:48:24 +00:00
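Note on 41cd32d10e: order-independent detection can be sketched as a small state holder that reports a hit once every known pattern has been seen. The patterns below are placeholders, since the real r8152 messages are not listed in this log, and `AnyOrderDetector` is a hypothetical name.

```python
import re

# Placeholder patterns: the real r8152 messages are not listed in this log.
KNOWN_PATTERNS = (
    re.compile(r"hypothetical r8152 message A"),
    re.compile(r"hypothetical r8152 message B"),
)


class AnyOrderDetector:
    """Report a hit once every known pattern has been seen, in any order."""

    def __init__(self, patterns=KNOWN_PATTERNS):
        self.patterns = tuple(patterns)
        self.seen = set()

    def feed(self, line: str) -> bool:
        """Record which patterns match `line`; return True when all were seen."""
        for index, pattern in enumerate(self.patterns):
            if pattern.search(line):
                self.seen.add(index)
        return len(self.seen) == len(self.patterns)
```

Adding a new issue variant then only requires another entry in the pattern tuple.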
Guilherme Gallo
f3850c97d1 ci/lava: Fix the integration test
During the development of this fix, I utilized the `test_full_yaml_log`
test, which is marked as slow. This test is excellent for reproducing
past job submissions. It can be executed using the following commands:

```
lavacli jobs logs --raw 12496073 > /tmp/log.yaml
pytest .gitlab-ci/tests/test_lava_job_submitter.py -m slow -s
```

Here, `12496073` is the LAVA job ID from this specific job:
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/53546660

The logs were not functioning as expected due to a few mistakes I made
with generators, such as:
- Not reusing the `time_traveller_gen` generator to call `next` more
  than once.
- Forgetting to parse the YAML inside `time_travel_from_log_chunk`.

Additionally:
- Added some statistics at the end of the test to aid in diagnosing
  whether the test was reproduced accurately.
- Renamed some variables for clarity.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26996>
2024-01-24 18:39:17 +00:00
Guilherme Gallo
26564b8515 bin/ci: Don't submit jobs on integration test
`test_full_yaml_log` doesn't need to submit a job, because it would need to
replicate/mock more stuff, like the first stage init, which is not
necessary to reproduce issues from the raw YAML log file.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26995>
2024-01-23 22:47:24 +00:00
Guilherme Gallo
d6b30d42b0 ci/lava: Skip regression test if LAVA log file is not present
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
11a97b644c ci/lava: Refactor LAVAJobSubmitter and add tests
Some refactoring was needed to make LAVAJobSubmitter class testable via
pytest.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
0ac3824922 ci/lava: Add a simple Structural Logger into submitter
Refactor some pieces of the submitter to improve the clarity of the
functions and create a simple dictionary with aggregated data from the
submitter execution which will be dumped to a file when the script
exits.

Add support for the AutoSaveDict based structured logger as well, which
will come in a follow-up commit.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
c03f7233ca ci/lava: Extract LAVA proxy and LAVAJob abstractions
Let's make lava_job_submitter.py cleaner with only parsing and retry
mechanism capabilities.

Moved out from the submitter script:

1. proxy functions
  - moved to lava.utils.lava_proxy.py
2. LAVAJob class definition
  - moved to lava.utils.lava_job.py
  - added structural logging capabilities into LAVAJob
  - Implemented properties for job_id, is_finished, and status, with
    corresponding setter methods that update the log dictionary.
  - Added new methods show, get_lava_time, and refresh_log for improved
    log handling and data retrieval.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
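Note on c03f7233ca: properties whose setters mirror each change into a log dictionary can be sketched as follows. The class name and the dictionary keys are hypothetical; only the property names `job_id` and `status` come from the commit message.

```python
class LAVAJobSketch:
    """Hypothetical stand-in showing setters that update a log dictionary."""

    def __init__(self, log=None):
        self.log = {} if log is None else log
        self._job_id = None
        self._status = None

    @property
    def job_id(self):
        return self._job_id

    @job_id.setter
    def job_id(self, value):
        self._job_id = value
        self.log["lava_job_id"] = value  # key name is an assumption

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, value):
        self._status = value
        self.log["status"] = value  # key name is an assumption
```

Callers keep writing `job.status = "running"`, and the structured log stays in sync without a separate logging call.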
Guilherme Gallo
bbdbf0862c ci/lava: Update lavacli version
- Use new YAML loader derived from ruamel.yaml
- Remove PyYAML dependency from LAVA job submitter package

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20596>
2023-01-10 20:10:49 +00:00
Guilherme Gallo
f040122bed ci/lava: Feed yaml.load with raw bytes data
LAVA uses XMLRPC to send jobs information and control, more specifically
it sends device logs via YAML dumps encoded in UTF-8 bytes.

In Python, the xmlrpc.client.Binary class acts as the serializer, so we
get the logs wrapped by this class, which holds the data as UTF-8
encoded bytes.

We were converting the encoded data to a string via the `str` function,
but this led the loaded YAML data to use single quotes instead of double
quotes for string values, which caused special characters such as `\x1b`
to be escaped as `\\x1b`.

With this fix, we can now drop one of the hacks that fixed the bash
colors.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20051>
2022-12-21 12:44:49 +00:00
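Note on f040122bed: the difference between stringifying bytes and decoding them can be seen with the standard library alone. This illustrates general Python behavior under an assumed payload; it is not the submitter's exact code path.

```python
from xmlrpc.client import Binary

# Assumed payload: a colored log line, UTF-8 encoded as it would travel over XML-RPC.
payload = Binary("\x1b[31mred\x1b[0m".encode("utf-8"))

# str() on a bytes object returns its repr, so ESC is spelled out
# as the four characters backslash, x, 1, b.
as_repr = str(payload.data)

# Decoding keeps the real ESC control character.
as_text = payload.data.decode("utf-8")

has_esc_in_repr = "\x1b" in as_repr
has_esc_in_text = "\x1b" in as_text
```

Passing the raw bytes (or the decoded text) to the YAML loader avoids the spelled-out escapes that a later step would otherwise have to undo.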
Guilherme Gallo
2cb71ac530 ci/lava: Only parse result within testcase section
This commit fixes an issue related to leftover output between jobs on the
same device under test in LAVA.

There is a possibility of having the resulting output being dumped just
after the boot, such as this job:
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/25674303#L155

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17752>
2022-08-01 23:08:37 +00:00
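Note on 2cb71ac530: restricting result parsing to the test case section can be sketched with a simple in/out flag. The `hwci: mesa: (pass|fail)` pattern is quoted elsewhere in this log; the exact spelling of the LAVA start and end signals below is an assumption.

```python
import re

# Signal spellings are assumptions; the result pattern is quoted in this log.
START_RE = re.compile(r"<LAVA_SIGNAL_STARTTC (\S+)>")
END_RE = re.compile(r"<LAVA_SIGNAL_ENDTC (\S+)>")
RESULT_RE = re.compile(r"hwci: mesa: (pass|fail)")


def parse_result(lines):
    """Return the last result seen inside a test case section, else None."""
    inside = False
    result = None
    for line in lines:
        if START_RE.search(line):
            inside = True
        elif END_RE.search(line):
            inside = False
        elif inside:
            match = RESULT_RE.search(line)
            if match:
                result = match.group(1)
    return result
```

A stale `hwci: mesa: fail` printed right after boot is ignored because no test case has started yet.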
Guilherme Gallo
4783e55039 ci/lava: Add slow pytest marker
Mark test_full_yaml_log with this new marker to be easily run by the
developers.
Make `debian-testing` skip this test with `not slow` marker hint.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>
2022-07-08 12:26:05 +00:00
Guilherme Gallo
45a4b01427 ci/lava: Split lava_log into modules
This script is getting too big, and it has been hard to extend.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>
2022-07-08 12:26:05 +00:00
Guilherme Gallo
20827dfa9b ci/lava: Update license header
Use SPDX to indicate the license.
Update authors of lava_job_submitter.py

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>
2022-07-07 00:28:53 +00:00
Guilherme Gallo
24f368d652 ci/lava: Stop printing after the result line
There are some leftovers in the jobs logs after the result log line.
Only print until the init-stage2.sh output, so that the test script
results are more likely to be visible at first glance in the Gitlab logs.

Extra changes:
- Add `hung` status for jobs considered hanging in the Gitlab
- print them after the retry loop

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>
2022-07-07 00:28:53 +00:00
Guilherme Gallo
466917ea4c ci/lava: Add an integration test for LAVA jobs
test_full_yaml_log is a test that will look for a LAVA log YAML file at
`/tmp/log.yaml` and consume it as if it were a realtime CI job.
It is useful for debugging issues related to LAVA.

Let's keep it skipped by default, to avoid introducing entire logs into
the codebase.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>
2022-07-07 00:28:53 +00:00
Guilherme Gallo
aa26a6ab72 ci/lava: Follow job execution via LogFollower
Now LogFollower is used to deal with the LAVA logs.

Moreover, this commit adds timeouts per Gitlab section: if a section
takes longer than expected, cancel the job and retry it.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>
2022-07-07 00:28:53 +00:00
Guilherme Gallo
2569d7d7df ci/lava: Create LogFollower and move logging methods
- Create LogFollower to capture LAVA log and process it adding some
  GitlabSection and color treatment to it
- Break logs further, make new gitlab sections between testcases
- Implement LogFollower as ContextManager to deal with incomplete LAVA
  jobs.
- Use template method to simplify gitlab log sections management
- Fix sections timestamps

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>
2022-07-07 00:28:53 +00:00
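Note on 2569d7d7df: making the follower a context manager means an open section is always closed, even when the LAVA job ends abruptly. The sketch below uses GitLab's documented `section_start`/`section_end` markers; `SectionFollower` and its methods are hypothetical and much simpler than a real log follower.

```python
import time


class SectionFollower:
    """Hypothetical follower that owns at most one open GitLab section."""

    def __init__(self, out):
        self.out = out  # list collecting the emitted log lines
        self.current = None

    def start(self, name, header):
        """Close any open section, then open a new one."""
        self.close()
        self.current = name
        self.out.append(
            f"\x1b[0Ksection_start:{int(time.time())}:{name}\r\x1b[0K{header}"
        )

    def close(self):
        """Emit the end marker for the open section, if there is one."""
        if self.current is not None:
            self.out.append(
                f"\x1b[0Ksection_end:{int(time.time())}:{self.current}\r\x1b[0K"
            )
            self.current = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow the exception
```

Used as `with SectionFollower(lines) as follower: ...`, an exception raised mid-section still produces a matching `section_end`, so the GitLab log view does not end up with a dangling section.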
Guilherme Gallo
3b8d10d270 ci/lava: Improve result parsing regex
LAVA job logs have an ongoing problem of message interleaving with kmsg.
So any kernel dumps and LAVA signals (which are being printed in kmsg)
will have a chance to clutter the pattern matching for `hwci: mesa:
(pass|fail)` line.

v2:
- Add a 1 second sleep before exiting the test script, to give enough
  time to print the result message without conflicting with LAVA ENDTC
  signal from kmsg

Closes: #6714

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17175>
2022-06-28 22:51:45 +00:00
Guilherme Gallo
cee1c4fc7f ci/lava: Filter out undesired messages
Some LAVA jobs emit lots of messages "Listened to connection for
namespace 'common' for up to 1s" in a row at the end of the logs, making
it difficult to see the result of the test script.

This commit removes those lines until a proper solution is deployed on
the LAVA side.

Closes: #6116

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17151>
2022-06-22 01:48:16 +00:00
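Note on cee1c4fc7f: dropping the noisy lines is a plain substring filter. The message text is quoted from the commit; the function name is hypothetical.

```python
NOISE = "Listened to connection for namespace 'common' for up to 1s"


def drop_noise(lines):
    """Return the log lines without the repeated LAVA listener message."""
    return [line for line in lines if NOISE not in line]
```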
Guilherme Gallo
75973e3a1c ci/lava: Add support for more complex color codes
Currently, the LAVA job submitter is employing a temporary solution for
the bash escape code mangling in the LAVA jobs. Until the issue is
fixed on the LAVA side, the submitter will replace the wrong characters
with the fixed ones.

This commit improves the regex pattern to comprehend the scenarios of
color codes with font formatting and background color information, such
as: `echo -e "\e[1;41;39mRed background with white bold text color\e[0m"`

Fixes: #5503

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17046>
2022-06-15 19:10:09 +00:00
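Note on 75973e3a1c: a pattern that accepts color codes with several `;`-separated parameters might look like the sketch below. It matches well-formed sequences only; the mangled form produced by LAVA is not shown in this log, so it is not modeled here.

```python
import re

# ESC [ followed by zero or more ';'-separated numbers, terminated by 'm'.
SGR_RE = re.compile(r"\x1b\[(\d+(?:;\d+)*)?m")


def sgr_params(text):
    """Return the parameter string of every color sequence found in `text`."""
    return [match.group(1) or "" for match in SGR_RE.finditer(text)]
```

For the example from the commit message, the bold, red-background, default-foreground code is captured as one parameter string, followed by the reset code.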
Guilherme Gallo
ee2278de65 ci/lava: Fix Gitlab Section markers
LAVA is mangling the ANSI escape codes during log fetching from the
target device, which breaks the gitlab section markers from deqp, for
example, and adds noise to the log.

This commit makes the simplest fix, which is to replace the mangled
characters with the fixed ones.

This approach is error-prone, since it may unwittingly replace a genuine
log line that resembles the mangled escape code. But this solution should
suffice until we get a proper fix from the LAVA team itself.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16520>
2022-05-23 16:51:47 +00:00
Guilherme Gallo
e00281f6da ci/lava: Fix colored LAVA outputs
LAVA is mangling the ANSI escape codes during log fetching from the
target device, which breaks the colored lines from deqp, for example,
and adds noise to the log.

This commit makes the most straightforward fix, which is to replace the
mangled characters with the fixed ones.

This approach is error-prone since it may unwittingly replace a genuine
log line that resembles the mangled escape code. But this solution should
suffice until we get a proper fix from the LAVA developers themselves.

Fixes: #5503

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16520>
2022-05-23 16:51:47 +00:00
Guilherme Gallo
0ff3517fb7 ci/lava: Make job submitter parse the job result
Currently, the LAVA job submitter fetches the job results from the LAVA
XMLRPC call, but that is not necessary, as the job result is easily
found in the logs. E.g. the bare-metal and poe jobs use that log to set
the final job status of their runs.

Another reason for the change is that the LAVA signals are not reliable
in some devices with one serial port, causing some troubles in a618
recently. So, if one signal fails to be sent/received, the job will
ultimately fail even when the hwci script has been successful.

Fixes: #6435

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16425>
2022-05-13 02:17:32 +00:00
Guilherme Gallo
201b0b6d29 ci/lava: Retry when data fetching log RPC call is corrupted
Rarely, the jobs.logs RPC call can return corrupted data, such as
malformed YAML data. As this is expected and very rare to occur, let's
retry this RPC call several times to give it a chance to fix itself.

Retrying would not swallow the log lines since we keep track of how many
log lines each job has.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15938>
2022-04-28 06:33:46 +00:00
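Note on 201b0b6d29: retrying is safe because the caller remembers how many lines it has already consumed and asks again from the same offset. A minimal sketch, where `fetch` and `parse` are hypothetical callables standing in for the RPC call and the YAML loader:

```python
def fetch_new_lines(fetch, parse, start_line, attempts=3):
    """Fetch and parse log lines from `start_line`, retrying on parse errors.

    Returns the parsed lines and the offset to use for the next fetch.
    """
    last_error = None
    for _ in range(attempts):
        raw = fetch(start_line)  # same offset on every retry
        try:
            lines = parse(raw)
        except ValueError as error:  # stand-in for a YAML parsing error
            last_error = error
            continue
        return lines, start_line + len(lines)
    raise RuntimeError(f"log fetch failed after {attempts} attempts") from last_error
```

Because the offset only advances after a successful parse, a corrupted response never causes log lines to be skipped.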
Guilherme Gallo
4ffd21ca70 ci/lava: Improve exception handling
Move exceptions to their own file.
Create MesaCITimeoutError and MesaCIRetryError with specific exception
data for better exception classification.
Avoid the use of `fatal_err` in favor of raising a proper exception.
Make _call_proxy exception handling exhaustive, add missing
ResponseError treatment.

Also, detect JobError during job result parsing. When a LAVA timeout error
happens, it is probably caused by some boot/network issue with a
specific device, so we can retry the same job on another device with the
same device_type.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15938>
2022-04-28 06:33:46 +00:00
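Note on 4ffd21ca70: exceptions that carry their own data let the caller classify failures without parsing messages. The class names `MesaCITimeoutError` and `MesaCIRetryError` come from the commit; the base class and the attribute names are assumptions made for this sketch.

```python
class MesaCIException(Exception):
    """Assumed common base class for submitter errors."""


class MesaCITimeoutError(MesaCIException):
    def __init__(self, *args, timeout_duration):
        super().__init__(*args)
        self.timeout_duration = timeout_duration  # attribute name is an assumption


class MesaCIRetryError(MesaCIException):
    def __init__(self, *args, retry_count):
        super().__init__(*args)
        self.retry_count = retry_count  # attribute name is an assumption
```

A caller can then catch the base class once and still inspect `retry_count` or `timeout_duration` when deciding whether to resubmit.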
Guilherme Gallo
18d80f25ee ci/lava: Parse all test cases from 0_mesa suite
LAVA can filter which test suite to show the results from, let's list
all testcases possible in the mesa test suite, to be able to divide more
complex jobs into test_cases.
Another advantage is that the test case can vary its name.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15938>
2022-04-28 06:33:46 +00:00
Guilherme Gallo
84a5ea4228 ci/lava: Encapsulate job data in a class
Less free-form passing stuff around, and also makes it easier to
implement log-based following in future.

The new class has:
- job log polling: This allows us to get rid of some more function-local
  state; the job now contains where we are, and the timeout etc is
  localised within the thing polling it.
- has-started detection into job class
- heartbeat logic to update the job instance state with the start time
  when the submitter begins to track the logs from the LAVA device

Besides:

- Split LAVA jobs and Mesa CI policy
- Update unit tests with LAVAJob class

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15938>
2022-04-28 06:33:46 +00:00
Guilherme Gallo
794009c9ee ci: Add unit tests for lava_job_submitter
These tests will explore some scenarios involving LAVA delays to submit
the job to the device, some device delays outputting data to LAVA
logs, and sensitive data protection.

For example, the subtests from test_retriable_follow_job, "timed out
more times than retry attempts" and "very long silence" caught a bug
where a job exhausted its retry attempts and the CI job still
succeeded. https://gitlab.freedesktop.org/mesa/mesa/-/jobs/18325174

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14876>
2022-02-16 23:32:39 +00:00