Commit Graph

105 Commits

Author SHA1 Message Date
Guilherme Gallo
a33c0e1867 ci/lava: Split boot action into deploy and boot
The boot action was wrapping the deploy action, which could cause
timeout misalignment. For example, the boot `GitlabSection` timeout was
shorter than the deploy timeout in LAVA, leading to cases where LAVA
jobs were canceled during their own retry mechanism.

By splitting these actions, we can align the timeouts properly,
preventing interference and unnecessary job cancellations.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33906>
2025-03-10 05:44:25 +00:00
Guilherme Gallo
93ce733425 ci/lava: Improve exception handling for job failures
Include detailed error messages when raising exceptions on LAVA job
failures to enhance debugging and error tracking.

Also, handle additional error types by extracting error messages from
metadata and retrying accordingly.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32163>
2024-11-21 04:10:52 +00:00
Guilherme Gallo
bc86b73bbe ci/lava: Fix lava-tags parsing
python-fire auto-converts `item1,item2` into a tuple, but if there is a
dash `-` inside the argument, it treats it as a string.

Let's validate the data, when it comes as a `str` or `tuple`.

For more details, here are the tested scenarios:

| --lava-tags= | Type  | Value               |
|--------------|-------|---------------------|
| None         | bool  | True                |
| ''           | str   | ''                  |
| tag1         | str   | "tag1"              |
| tag1,        | tuple | ("tag1",)           |
| tag-1,tag-2  | str   | 'tag-1,tag-2'       |
| tag1,tag2    | tuple | ("tag1", "tag2")    |
| ','          | str   | ','                 |
| ',,'         | str   | ',,'                |
| 'tag1,,'     | str   | 'tag1,,'            |

See also:
https://google.github.io/python-fire/guide/#argument-parsing

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31882>
2024-10-31 18:00:27 +00:00
Daniel Stone
f44970173d ci/lava: Provide list of overlays to submitter
Instead of providing a hardcoded set of arguments, allow overlays to be
added to the submitter script. Passing Python dicts as a string
representation and relying on coercion from strings is far from great,
but fire doesn't give anything else, so.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Co-authored-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31882>
2024-10-31 18:00:27 +00:00
Daniel Stone
f32a2de26d ci/lava: Provide LAVA rootfs URL directly
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31882>
2024-10-31 18:00:27 +00:00
Daniel Stone
d171f47f44 ci/lava: Coalesce post-processed job information
We can combine a section _and_ a print.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
9be46b29f0 ci/lava: Print relative timestamps in sections
Follow what the shell executor does and print the time since the job
started in the section header.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
8ee6241a8c ci/hw: Wrap pre-test setup in collapsed section
Most people don't care about environment variables and starting Weston.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
3c7b53e27c ci/lava: Be a little less enthusiastic with bold
Some things can just fade away into the background.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
65f05f2231 ci/lava: Explicitly pass UTC timezone
Rather than putting it into the environment, just pass it explicitly
every time we need a timestamp.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
2b4d468421 ci/lava: Hide more boot details into sections
Make sure we keep as much of the boot as we can behind sections, so
by default people only see the actual test run.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
837004fa26 ci/lava: Rename lava_boot section
'LAVA boot' is less meaningful to normal people than explaining that
we're booting the hardware device under test.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
964b979131 ci/lava: Add section for device wait
This way it's easier to see how long it took.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
46c8423489 ci/lava: Remove pointless messages
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Daniel Stone
cff886d808 ci: Don't print structured log data URL
People who want to get it will know where to find it. Most people will
not care.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
2024-10-20 11:32:42 +01:00
Vignesh Raman
b9cee06f9e ci/lava: handle non-zero exit codes
The LAVA job submitter always exits with code 1, regardless
of the HWCI_TEST_SCRIPT's exit code. This commit fixes the
LAVA job submitter to exit with the actual code returned by
the test.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31189>
2024-09-20 10:29:39 +00:00
Daniel Stone
eae6e122ab ci/lava: Fix path to structured logger
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30978>
2024-09-09 16:27:07 +00:00
Vignesh Raman
5d9e650ed6 ci/lava: add farm in structured log files
Farm variable is added for devices in collabora farm.
So add support in LAVA job submitter to use this in
structured log files.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29583>
2024-06-07 14:32:49 +00:00
Guilherme Gallo
4b81ee6418 ci/lava: Fix how exception entry in structured log
Improves the error logging in the LAVA job submitter by capturing and
logging the exception message rather than just the exception type when a
job fails to run.

Additionally, introduces a clearer script interruption
message to aid in debugging and immediate understanding of job
submission failures.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
2024-04-22 21:20:07 +00:00
Guilherme Gallo
e96e25f323 ci/lava: Don't run jobs if the remaining execution time is too short
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
2024-04-22 21:20:07 +00:00
Guilherme Gallo
3e33171471 ci/lava: Introduce unretriable exception handling
This commit refactors the exception hierarchy to differentiate between
retriable and fatal errors in the CI pipeline, specifically within the
LAVA job submission process.

A new base class, `MesaCIRetriableException`, is introduced for
exceptions that should trigger a retry of the CI job, while
`MesaCIFatalException` is added for non-recoverable errors that halt the
process immediately.

Additionally, the logic for deciding whether a job should be retried or
not is updated to check for instances of `MesaCIRetriableException`,
improving the robustness and reliability of the CI job execution
strategy.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
2024-04-22 21:20:07 +00:00
Guilherme Gallo
5363874676 ci/lava: A few formatting cleanups
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
2024-04-22 21:20:07 +00:00
Eric Engestrom
e46702f7ae ci: deduplicate constructing the ARTIFACTS_BASE_URL
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26670>
2023-12-13 21:12:22 +00:00
David Heidelberg
5e44cee47d ci: inject gfx-ci/linux S3 artifacts without rebuilding containers
We need update kernel often. We need test kernel changes often.

Introduced `KERNEL_EXTERNAL_TAG` to differ between `KERNEL_TAG` which is
also used to rebuild the containers. We don't need rebuild containers
for the external kernel, so this way we don't have to.

Updating kernel goes wruuuuuum.

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23563>
2023-11-07 12:22:09 +00:00
Guilherme Gallo
76922f8404 ci/lava: Create LAVAJobDefinition
To absorb complexity from the building blocks to generate job
definitions for each mode:
- fastboot-uart
- uboot-uart
- uboot-ssh

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
2023-11-02 03:31:50 +00:00
Guilherme Gallo
af9273eb4f ci/lava: Fix imports formatting
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
2023-11-02 03:31:50 +00:00
Guilherme Gallo
f7f2d26e3b ci/lava: Use project_name instead of hardcoded mesa
The LAVA job submitter is being used by other fd.o projects, such as
`drm/ci`, so let's make it generate more generic job definitions and
test cases.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
2023-11-02 03:31:49 +00:00
David Heidelberg
3a4bdf26e6 ci: remove LAVA prefix from variables which can be used also elsewhere
At least these two can be easily used in bare-metal or Labgrid setups.

Currently I already have MR for implementing these for Labgrid.

Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24665>
2023-08-17 13:25:46 +00:00
Guilherme Gallo
a5ccb4dafb ci/lava: Use an alpine image for SSH client container
Use a lightweight container for ping, ssh, curl and bash support.
Also use an image located at fd.o infrastructure, since we are having
some issues with Collabora's gitlab one lately.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23534>
2023-07-07 14:34:40 +00:00
Guilherme Gallo
80290bcddd ci/lava: Raise the post test metadata gathering retry count
In some devices, it takes a few dozens of seconds to LAVA post process
the job and give final metadata related to the job.
It is worth to wait a little more (up to 30 sec) to make structured log
data more accurate.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22870>
2023-05-19 14:45:17 +00:00
Guilherme Gallo
8626a52637 ci/lava: Add bridge function for job definition
To use the supported job definition depending on some Mesa CI job
characteristics.

The strategy here, is to use LAVA with a containerized SSH session to
follow the job output, escaping from dumping data to the UART, which
proves to be error prone in some devices.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22870>
2023-05-19 14:45:17 +00:00
Guilherme Gallo
6bb7add829 ci/lava: Fix last section in job submitter
It only happens after the LogFollower cleanup (__exit__ method)

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22870>
2023-05-19 14:45:17 +00:00
Guilherme Gallo
11a97b644c ci/lava: Refactor LAVAJobSubmitter and add tests
Some refactoring was needed to make LAVAJobSubmitter class testable via
pytest.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
710b568dcd ci/lava: Force use of UTC timezones
LAVA farm is giving datetime in UTC timezone, let's standardize it
locally for the script run, so datetimes coming from LAVA proxy calls
will be at the same timezone as the ones we use in structural logging
and traces.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
5c5aec15b1 ci/lava: Integrate StructuralLogger with AutoSaveDict
Let's use the AutoSaveDict as structural logger abstraction to enable
real-time monitoring of LAVA jobs. Mainly used for local runs and
debugging of Mesa CI LAVA jobs.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
0ac3824922 ci/lava: Add a simple Structural Logger into submitter
Refactor some pieces of the submitter to improve the clarity of the
functions and create a simple dictionary with aggregated data from the
submitter execution which will be dumped to a file when the script
exits.

Add support for the AutoSaveDict based structured logger as well, which
will come in a follow-up commit.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
41f29c5333 ci/lava: Update LogFollower for better section handling and history
Update the LogFollower class to improve section handling and provide a
history of sections encountered during log processing:

1. Add section_history attribute to store the history of encountered
   GitlabSections.
2. Make LAVA job submitter use the section history feature to improve
   structural logging.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
cfe644a9e5 ci/lava: Use python-fire in job submitter
Cleanup argparse to use dataclasses+python-fire to give easier
maintenance to job submitter.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
c03f7233ca ci/lava: Extract LAVA proxy and LAVAJob abstractions
Let's make lava_job_submitter.py cleaner with only parsing and retry
mechanism capabilities.

Moved out from the submitter script:

1. proxy functions
  - moved to lava.utils.lava_proxy.py
2. LAVAJob class definition
  - moved to lava.utils.lava_job.py
  - added structural logging capabilities into LAVAJob
  - Implemented properties for job_id, is_finished, and status, with
    corresponding setter methods that update the log dictionary.
  - Added new methods show, get_lava_time, and refresh_log for improved
    log handling and data retrieval.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
Guilherme Gallo
6f6b892dca ci/lava: Move job definition stuff to another file
The LAVA job submitter is too big, let's reorganize it a little.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
2023-04-19 14:36:37 +00:00
David Heidelberg
675f757ffb ci/lava: implement the priority
Before: kernelci 38; Mesa3D 75

Priority now:
 - 38 ‒ kernelci
 - 40 ‒ after merge and performance
 - 50 ‒ user runs
 - 75 ‒ marge-bot (MUST be prioritized)

Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21812>
2023-03-09 20:28:07 +00:00
David Heidelberg
796686af1b ci: migrate from wget to curl
Better error handling is more reliable.

Options:
 -L, follow location
 --retry, number of retries
 --retry-all-errors, does not fail on ALL errors, that's why there is -f
 -f, fail fast with no output at all on server errors
 --retry-delay, make curl sleep this amount of time before each retry

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20788>
2023-01-19 23:46:44 +00:00
Guilherme Gallo
8bc51d78a5 ci/lava: Tweak LAVA jobs timeouts
The Mesa CI LAVA job submitter was suffering from a bug in the LAVA
software that made their timeouts related to sub-actions unreliable,
such as waiting for the user login prompt automatic response.

The following MR
https://git.lavasoftware.org/lava/lava/-/merge_requests/1900 fixed this
issue. So we can now better control job timeouts granularity, failing
the job faster when there is something weird hanging the boot stage.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20596>
2023-01-10 20:10:49 +00:00
Guilherme Gallo
bbdbf0862c ci/lava: Update lavacli version
- Use new YAML loader derived from ruamel.yaml
- Remove PyYAML dependency from LAVA job submitter package

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20596>
2023-01-10 20:10:49 +00:00
Guilherme Gallo
590f74084d ci/lava: Show LAVA job info during fails
Currently, LAVA jobs only show metadata when successful, let's show this
info in all retries to make it easier to debug or report issues.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20051>
2022-12-21 12:44:49 +00:00
Guilherme Gallo
2e723cdc10 ci/lava: Anticipate overlayfs download
To debug the LAVA jobs locally, we have an option in the
lava_job_submitter script to ignore the JWT token to make it possible to
retry jobs without the need to get an unexpired token.

But this trick needs to modify the overlayed directory so that we would
need to download and extract it earlier in the run.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20051>
2022-12-21 12:44:49 +00:00
Guilherme Gallo
f040122bed ci/lava: Feed yaml.load with raw bytes data
LAVA uses XMLRPC to send jobs information and control, more specifically
it sends device logs via YAML dumps encoded in UTF-8 bytes.

In Python, we have xmlrpc.client.Binary class as the serializer
protocol, we get the logs wrapped by this class, which encodes the data
as UTF-8 bytes data.

We were converting the encoded data to a string via the `str` function,
but this led the loaded YAML data to use single quotes instead of double
quotes for string values that made special characters, such as `\x1b` to
be escaped as `\\x1b`.

With this fix, we can now drop one of the hacks that fixed the bash
colors.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20051>
2022-12-21 12:44:49 +00:00
David Heidelberg
95a7b65c14 ci: replace gzip usage with zstd where posible
v2: added missing zstd to arm_build.sh

Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17776>
2022-08-04 11:21:45 +00:00
David Heidelberg
784642f773 ci: compress LAVA rootfs with zstd instead of gzip
Visible size reduction and minor performance improvement at no cost.

rootfs measure:

```
2022-07-27 11:15:16.235268: 475MB downloaded in 111.59s (4.26MB/s)
2022-07-27 15:07:40.984857: 425MB downloaded in 85.57s (4.97MB/s)
```

So let say approx. 95s vs 85s if we assume 5MB/s, which can bring us
10s speedup...

Acked-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17776>
2022-08-04 11:21:45 +00:00
Guilherme Gallo
69400a0762 ci/lava: Customise sections timeouts via envvars
Refer to environment variables before falling back to the default
timeouts for each Gitlab section.

This makes more explicit in the job definition that there is a
particular case where the job may obey different timeouts.

Closes: #6908

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17703>
2022-08-03 21:50:34 +00:00