Break it to smaller pieces with variable size (fastboot has 3 deploy
actions and uboot only one) to build the base definition nicely in the
end.
Extract kernel/dtb attachment and init_stage1 extraction into functions
to be later reused by SSH job definition.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25912>
Our LAVA farm is currently experiencing issues with running and pulling
docker. LAVA has been detecting (with a low rate) timeouts during these
commands, causing some jobs to fail with infrastructure errors.
Increasing the failure_retry will make the job retry run the container
when LAVA detects the failure without losing its place in the job queue.
We are currently investigating why docker times out. But, when LAVA
fails to detect it, we cancel the job on our side and resubmit it to the
job queue. For more information, please refer to following dashboard:
https://ci-stats-grafana.freedesktop.org/goto/VjZvaA_4z?orgId=1
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23534>
To ensure proper SSH functioning, the device IP should be added to the
LAVA device dictionary by setting device_ip. LAVA will then map the
value to lava-target-ip.
meson-g12b-a311d-khadas-vim3-cbg-4 has an IP in the dictionary, while
sun50i-h6-pine-h64-cbg-1 and meson-g12b-a311d-khadas-vim3-cbg-2 do not.
Since some devices are not yet properly configured, and device tag
fixing is not an option here, let's temporarily switch to a job
definition based on UART, until it gets fixed.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22870>
To use the supported job definition depending on some Mesa CI job
characteristics.
The strategy here, is to use LAVA with a containerized SSH session to
follow the job output, escaping from dumping data to the UART, which
proves to be error prone in some devices.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22870>
Create a separate job definition that runs the job via SSH session.
The DUT test only sets up the SSH server via dropbear, and another
deployed docker runner in LAVA dispatcher access the DUT via SSH with
pseudo-terminal to propagate the logs in real time.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22870>
Update the LogFollower class to improve section handling and provide a
history of sections encountered during log processing:
1. Add section_history attribute to store the history of encountered
GitlabSections.
2. Make LAVA job submitter use the section history feature to improve
structural logging.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
Let's make lava_job_submitter.py cleaner with only parsing and retry
mechanism capabilities.
Moved out from the submitter script:
1. proxy functions
- moved to lava.utils.lava_proxy.py
2. LAVAJob class definition
- moved to lava.utils.lava_job.py
- added structural logging capabilities into LAVAJob
- Implemented properties for job_id, is_finished, and status, with
corresponding setter methods that update the log dictionary.
- Added new methods show, get_lava_time, and refresh_log for improved
log handling and data retrieval.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22500>
This commit updates LogFollower class to handle carriage return
characters in LAVA logs. LAVA treats carriage return characters as a
line break, so each carriage return in an output console is mapped to a
console line in LAVA.
The updated LogFollower class now merges lines that end with a carriage
return character into a single line, making the Gitlab sections work
correctly. In addition, the `remove_trailing_whitespace` method has been
updated to remove trailing `\r\n` characters from log lines.
The `test_lava_log_merge_carriage_return_lines` test function has also
been updated to test for carriage returns at the end of the previous
line.
Closes: #8242
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21614>
Since the Collabora LAVA update related to the downtime from
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21119, the
LAVA logs from Collabora continued to use the hack for older versions
which digested some control characters, such as carriage returns acting
as newlines, which made it necessary to recover from split lines to make
Gitlab sections work in job logs as expected.
Collabora's LAVA instance now gives a more raw log output. It is
necessary to pay attention to newlines at the end of each log message,
which may cause double newlines when printed with Python built-in
`print` function. I decided to remove the repeating `\n` from the
received log messages to make them transparent to LogFollower users.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8242
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21325>
LavaFarm is a class created to handle the different types of LAVA farms
and their tags in Mesa CI. Since specific jobs may require different
types of LAVA farms to run on, it is essential to determine which farm
the runner is running on to configure the job correctly.
LavaFarm provides an easy-to-use interface for checking the runner tag
and returning the corresponding LAVA farm, making it simple for Mesa CI
to configure jobs appropriately. By adding tests for LavaFarm, the team
can ensure that this class is functioning as expected, allowing for the
smooth execution of Mesa CI jobs on the correct LAVA farm.
The tests ensure that get_lava_farm returns the correct LavaFarm value
when given invalid or valid tags and that it returns LavaFarm.UNKNOWN
when no tag is provided. The tests use Hypothesis strategies to generate
various labels and farms for testing.
Example of use:
```
from lava.utils.lava_farm import LavaFarm, get_lava_farm
lava_farm = get_lava_farm()
if lava_farm == LavaFarm.DUMMY:
# Configure the job for the DUMMY farm
...
elif lava_farm == LavaFarm.COLLABORA:
# Configure the job for the COLLABORA farm
...
elif lava_farm == LavaFarm.KERNELCI:
# Configure the job for the KERNELCI farm
...
else:
# Handle the case where the LAVA farm is unknown
...
```
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21325>
LAVA uses XMLRPC to send jobs information and control, more specifically
it sends device logs via YAML dumps encoded in UTF-8 bytes.
In Python, we have xmlrpc.client.Binary class as the serializer
protocol, we get the logs wrapped by this class, which encodes the data
as UTF-8 bytes data.
We were converting the encoded data to a string via the `str` function,
but this led the loaded YAML data to use single quotes instead of double
quotes for string values that made special characters, such as `\x1b` to
be escaped as `\\x1b`.
With this fix, we can now drop one of the hacks that fixed the bash
colors.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20051>
Empirically, a successful LAVA boot time should take less than 3
minutes.
LAVA itself is configured to attempt thrice to boot the device,
summing up to 9 minutes.
It is better to retry the boot than cancel the job and re-submit to
avoid the enqueue delay.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17646>
We should be explicit that we are cancelling jobs once the script finds
some log messages that are linked with known issues. That means the
script preemptively retried the job without giving chances to recover.
Adds magenta color to cancelled jobs.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>
Implement a log-based retry hint for R8152 issue described in #6681,
which is based on detecting these two consecutive lines:
```
r8152 <USB> eth0: Tx status -71
nfs: server <IP> not responding, still trying
```
Where <IP> and <USB> could be any IP and USB addresses, respectfully.
This commit is a temporary fix since it requires a section-aware log
follower, implemented in !16323. When the cited MR is merged, one will
make a proper fix on top of that.
Closes: #6681
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>
In some jobs, such as
https://gitlab.freedesktop.org/gallo/mesa/-/jobs/24904100, the kmsg is
interleaved with stderr/stdout in serial console, making it difficult to
confidently find the log messages to detect when the DUT is booting,
when the DUT is running etc.
Luckily, LAVA sends redundant messages about their signals. We can use
them to mitigate the chance of missing an interleaved message by being
more open to different messages, using the regex on both `debug` and
`target` LAVA log levels.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>
Currently, the submitter consider that every new log that comes from the
DUT console is a signal that the device is healthy, but maybe that is
not the case, since in some kernel hangs/failures, no output is
presented except from some kernel messages.
This commit bypass the heartbeat when the LogFollower detect a kernel
message. Any log line that does follow the kmsg pattern will make the
job labeled as healthy again.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>
- Create LogFollower to capture LAVA log and process it adding some
- GitlabSection and color treatment to it
- Break logs further, make new gitlab sections between testcases
- Implement LogFollower as ContextManager to deal with incomplete LAVA
jobs.
- Use template method to simplify gitlab log sections management
- Fix sections timestamps
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16323>