The job definitions for lava-related jobs are encapsulated in a directory,
while the other two farm managers were in the generic test directory. Having a
directory for ci-tron places it side by side with other farm managers. For
bare-metal, it has another advantage, as this encapsulates elements related to
this farm manager in a single place.
To maintain simplicity and consistency in file naming, the gitlab-ci file in
the lava directory is renamed as it has a prefix that corresponds with the
directory hosting it. The other farm equivalent files don't include this
duplication as a prefix, and there isn't such a prefix in any other case of
the CI.
Signed-off-by: Sergi Blanch-Torne <sergi.blanch.torne@collabora.com>
Acked-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35427>
With the rest of the Qualcomm devices moving to LAVA, we can remove the
original (!) bare-metal infrastructure, leaving only the Igalia RPi
devices still using bare-metal. When those are converted to b2c, we can
remove the rest of bare-metal.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35148>
The EXTERNAL_KERNEL_TAG variable is no longer needed.
For LAVA and bare-metal, we can override the KERNEL_TAG variable to fetch
both the kernel image and modules from a different tag than the default
mainline gfx-ci/linux kernel.
For LAVA, this also avoids the issue where jobs using EXTERNAL_KERNEL_TAG
would still have mainline kernel modules downloaded by the LAVA overlay.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34873>
This unifies the behaviour between the LAVA, baremetal, and CI-Tron
farms by ensuring every job has access and runs the same scripts.
The init-* scripts are however still sourced from outside the build
artifact, hopefully not for too long.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33026>
FORCE_KERNEL_TAG allows testing kernel uprevs without rebuilding
containers by supplying an external kernel directly for booting on
hardware devices. Renaming it to EXTERNAL_KERNEL_TAG clarifies its
purpose, distinguishing it from KERNEL_TAG which rebuilds containers.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31795>
The date isn't interesting, since you can tell if it wraps around
anyway. Just print the time down to the millisecond, which should be
plenty enough. We also don't need to print 'R SERIAL_CPU> ' when almost
every line is reading from the CPU serial buffer, so make that the
default.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31602>
Contrary to what the original commit said, this is actually still used
(see .gitlab-ci/bare-metal/poe-powered.sh:205), and the boot retry logic
has been broken ever since, exacerbating the rpi farm boot problems.
Fixes: 97b2afa16a ("ci/bare-metal: Drop the 2 vs 1 exit code from poe_run.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30340>
In order to turn on/off through SNMP DuT under PoE switch, the SNMP key
in some vendors don't directly use the interface number, but a number
shifted a base number.
Define this base number as BM_POE_BASE environment in the runner.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29306>
Use the CustomLogger class and CLI tool to create strutured logs
for poe scripts which are used by broadcom and nouveau jobs.
Renamed stage lint to code-validation and added python-test job
which runs the tests for structured and customer logger to ci.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25179>
We need update kernel often. We need test kernel changes often.
Introduced `KERNEL_EXTERNAL_TAG` to differ between `KERNEL_TAG` which is
also used to rebuild the containers. We don't need rebuild containers
for the external kernel, so this way we don't have to.
Updating kernel goes wruuuuuum.
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23563>
We have job-level retry on failure now, and will continue to need to in
order to work around fd.o infrastructure flakes. If we stop doing retry
inside the job, then we can crank down the gitlab-level timeouts on test
jobs to be closer to our CI guidelines and avoid blocking a runner for an
hour when things go wrong (for example, cheza #16 failing to boot in a
recognized way and continuously looping due to the intra-job retry).
Plus, the job logs will be more readable when you don't have two boots in
one job, and we'll get the flakes surfaced in our monitoring dashboards.
If internal retries were really doing useful work we may see an increase
in flakes as a result of this. I'm committing to turning off boards or
reducing coverage as necessary to handle this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25790>
1. Move block used for detecting R8152 problems to the bootloader
phase where it belongs. Also remove requirement to 100 failures and just
retry immediatelly.
2. Consider job failed after 10 errors, not 100. From the logs on
cheza-14, ~ 30 errors is enough to fail.
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25285>
Modify the build process for the images to include the build to have ci-kdl
available in the Mesa jobs. Modify also the init-stage2 to launch in the
background the process that will collect data and store a json file with the
relative changes on the recorded data.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24177>
Better error handling is more reliable.
Options:
-L, follow location
--retry, number of retries
--retry-all-errors, does not fail on ALL errors, that's why there is -f
-f, fail fast with no output at all on server errors
--retry-delay, make curl sleep this amount of time before each retry
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20788>
This is a farm of 5 (6, but one fails) TK1 boards for nouveau testing,
hosted and maintained by me. Currently it runs GLES dEQP.
I've been using ./.gitlab-ci/bin/ci_run_n_monitor.py --stress --target
gk20a to test it and am pretty confident of the skips/flakes list. Last
night it ran 318 jobs without fail, and prior to that there were two sets
of runs in the 100-200 range where only the one failing runner failed any
jobs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18497>
If we got a "Reached the end of the CPU serial log without finding a
result" because the test phase timed out, then the CPU serial would have
been closed as part of the timeout process, so we need to close the rest
and re-instantiate the servo run class.
fastboot and poe already re-instantiate the class on retry.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17689>