Better error handling is more reliable.
Options:
-L, follow location
--retry, number of retries
--retry-all-errors, does not fail on ALL errors, that's why there is -f
-f, fail fast with no output at all on server errors
--retry-delay, make curl sleep this amount of time before each retry
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20788>
This is a farm of 5 (6, but one fails) TK1 boards for nouveau testing,
hosted and maintained by me. Currently it runs GLES dEQP.
I've been using ./.gitlab-ci/bin/ci_run_n_monitor.py --stress --target
gk20a to test it and am pretty confident of the skips/flakes list. Last
night it ran 318 jobs without fail, and prior to that there were two sets
of runs in the 100-200 range where only the one failing runner failed any
jobs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18497>
If we got a "Reached the end of the CPU serial log without finding a
result" because the test phase timed out, then the CPU serial would have
been closed as part of the timeout process, so we need to close the rest
and re-instantiate the servo run class.
fastboot and poe already re-instantiate the class on retry.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17689>
This should help with "marge got stuck for an hour and all I got was this
failed job with no results/" when a system intermittently wedges.
This replaces the BM_POE_TIMEOUT ("did we get something on serial in the
last 3 minutes?") that rpi had, in favor of checking that the whole test
job gets through in 20 minutes.
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
If the SerialBuffers can just feed the same line queue, then we don't need
the extra threads reading line queues into a new merged line queue.
Less python threading code is always better. Plus, now we can pass args
to SerialBuffer.lines() for timeout/phase.
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
The test suite is full of flakes around transform feedback, atomics, and
tess. But, I hope it can be useful for regression testing core Mesa
reworks.
This required updating the kernel to 5.16.12 to get a more stable boot
process. That kernel rebuild caused an update of the container with
piglit which that was missed in a previous MR, so we got new xfails in x86
swrast.
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> (nouveau)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
This required updating the kernel to 5.16.12 to get a more stable boot
process. That kernel rebuild caused an update of the container with
piglit which that was missed in a previous MR, so we got new xfails in x86
swrast. Also, including modules on arm64 exposed a bug in v3d's
poe-powered.sh rsyncing of modules.
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
For every CI job, put JWT content into a file and unset CI_JOB_JWT
environment var
=======
* virgl jobs:
- Share JWT token file to crosvm instance
- Keep using `export -p` due to high complexity in the scripts
of these jobs. At least, the CI_JOB_JWT will not be leaked,
since it is being unset at the `before_script` phase of each
Mesa CI job.
* iris jobs: Update lava_job_submitter to take token file as argument
- generate-env with CI_JOB_JWT_TOKEN_FILE
- create token file during baremetal init stage
* baremetal jobs: Copy token file to bare-metal NFS
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14004>
Using suites makes load-balancing our jobs much easier, keeps the CPU busy
handling the a630_gles_others.sh test sets (and improves the output and
baseline handling for them), and makes it trivial to add in more short
test sets.
a306: still 5 jobs, and we add KHR-GLES2 (KHR-GLES3 is unstable)
a530: still 5 jobs, added KHR-GLES*
a630_gl: 5 jobs becomes 4, and we add KHR-GLES*
a630_vk: still 3 jobs, now 1/3 of all VK instead of 1/4.
a630_vk_full: still 2 jobs, now includes full bypass testing, partial
no-force testing, and testing of pre-merge-skipped tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12256>
Whilst we want to reuse the same init and job environment for LAVA and
bare-metal, LAVA needs to additionally inject wget and tar jobs, so we
can actually get our per-job environment, as the rootfs we run in is
just the container-generated base rootfs.
Split the init script into two stages, with the first stage doing very
base bringup of devices and networking, and the second stage setting the
job environment and running the jobs.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>
The screensaver kicks in at 10 minutes and obscures the screen,
independent of dpms. This causes piglit tests to get flaky (swaps start
taking a whole second, and swapbuffersmsc-divisor-zero times out at
exactly the wrong time) and slow if the run takes longer than 10 minutes.
Hopefully with this we'll see some piglit glx flakes go away forever, it
did seem to for this test locally.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11334>
The a530s will occasionally fail to make it to the fastboot prompt,
with no other deltas between a working log and a log stalled waiting
for that line to show up.
So, add a serial timeout (like the rpi boards do for similar reasons),
and on timeout restart the run. We actually restart the whole serial
watching process, because the SerialBuffer finishes itself on timeout.
This should also help with the intermittent issue we've had where a
power cycle causes the python serial module to throw an exception.
Tested with the gitlab-disabled db820c that never makes it to the
fastboot prompt (I think it's one where we need a longer micro cable
to connect it!) and saw successful boot looping to retry.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11308>