a630 has been completing jobs, and then corrupting the very last line of
UART output - the one where we pass the overall result back from the DUT
to the job. The bare-metal monitor will wait for this line to appear,
never see it, and then the job times out.
Since this line is the most critical one of all to get out, just spam
the prints to try to make sure they get through.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26032>
Modify the build process for the images to include the build to have ci-kdl
available in the Mesa jobs. Modify also the init-stage2 to launch in the
background the process that will collect data and store a json file with the
relative changes on the recorded data.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24177>
-iN, --idle-time=N
Set the idle timeout to N seconds. The default timeout is
300 seconds. When there has not been any user input for the idle
timeout, Weston enters an inactive mode. The screen fades to black,
monitors may switch off, and the shell may lock the session.
A value of 0 effectively disables the timeout.
We don't want the session to get locked and monitors to switch off while tests
are running, as many of them depend on swapping buffers.
Cc: mesa-stable
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21877>
Better error handling is more reliable.
Options:
-L, follow location
--retry, number of retries
--retry-all-errors, does not fail on ALL errors, that's why there is -f
-f, fail fast with no output at all on server errors
--retry-delay, make curl sleep this amount of time before each retry
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20788>
This can be used instead of HWCI_START_XORG to provide X in CI.
It will only be actually used if HWCI_START_XORG is not set in the same
job.
It is particularly useful as weston has the explicit headless backend
which is more straightforward to use in the headless systems in CI.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20393>
Review feedback requested a change that was incorrect, causing Xorg to
start to fail, but I forgot to retest the manual -full jobs that relied on
it.
Fixes: 99a6f2a186 ("ci: Set the path to the VK drivers during HWCI_START_XORG/WESTON.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20523>
LAVA job logs have an ongoing problem of message interleaving with kmsg.
So any kernel dumps and LAVA signals (which are being printed in kmsg)
will have a chance to clutter the pattern matching for `hwci: mesa:
(pass|fail)` line.
v2:
- Add an 1 second sleep before exiting the test script, to give enough
time to print the result message without conflicting with LAVA ENDTC
signal from kmsg
Closes: #6714
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17175>
Every few weeks we'd get a flake where the script noticed the devcore
right as we were wrapping up, and then tar would exit because the file it
was tarring changed underneath it. Just kill devcores before we do that
-- even if we kill it while it's working, it's so rare that it probably
won't bother anyone.
Fixes: #5867
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16820>
After a LAVA job submitter rework, the init-stage2.sh was changed to be
compatible with new LAVA job definitions, but the result from the script
represented by `HWCI_TEST_SCRIPT` variable is obfuscated by the `set -e`
command. So when the test script fails, `set` will override the exit
code and the jobs will pass when they should fail.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16325>
Any daemon executed in init-stage2.sh may interfere with LAVA signals,
since any threaded output to console may clutter the signals, which are
based on the log output.
E.g: This job
https://gitlab.freedesktop.org/gallo/mesa/-/jobs/20779120#L2102
has failed because capture-devcoredump.sh was alive and emitting kernel
messages to the console during the LAVA signal handling, mangling the
output and making the LAVA to fail to check the results of the job.
Another problem is that CONFIG_DEBUG_STACK_USAGE Kconfig is enabled.
This causes process exit to dump a `RESULT=[ 246.756067] lava-test-case
(156) used greatest stack depth: ... bytes left` kernel message to the
logs corrupting LAVA signal message. Empirically, it happens one in
every 280 jobs. To solve that, compose the lava-test-case custom script
with a short sleep to give time for kernel to dump the message clearly
and a exit command to keep the return code from init-stage2.sh script.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15938>
In order to ensure consistent results when running performance tests,
lock the frequency for Intel GPUs to ~70% of the maximum allowed by
hardware.
This seems to offer a good balance between execution speed and results
consistency.
An increase of the frequency will also increase the rate of throttling
events, with a negative impact on consistency. Such events are logged,
as in the following example:
GPU throttling detected: act=200 min=850 cur=850 RPn=100
This shows the actual GPU frequency (200 MHz) dropped below the minimum
requested (850 MHz).
For more details about the various frequency information sources, please
see the script header comments in ".gitlab-ci/common/intel-gpu-freq.sh".
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15662>
There is an out-of-sync approach regarding the location of the results
folder: some scripts refer to it via $CI_PROJECT_DIR/results, while
others just assume it is located in the current working directory.
Usually $PWD points to $CI_PROJECT_DIR, but in some cases this is not
the case, hence let's ensure the 'results' folder can always be found
in the current working directory.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15208>
In order to run a VM (e.g. crosvm) through HWCI_TEST_SCRIPT on a LAVA
target, it's necessary to download a kernel image on the target device.
When HWCI_KVM is set to 'true', we can safely assume HWCI_TEST_SCRIPT
contains a command or the path to a script which expects the kernel
image to be available under /lava-files/${KERNEL_IMAGE_NAME}.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15208>
We get a lot of useful coverage from running graphicsfuzz with spilling
enabled, but it's also pretty slow and can cause intermittent hangcheck
failures. I thought I'd categorized them when merging !14839 (device loss
on reset), but it looks like not all of them and we're now more likely to
have flakes take out the whole test run when a single flake makes the rest
of the caselist a flake.
This is a little unfortunate in that it means our test environment is not
the same as a stock system you would want to run deqp on to submit
conformance, but I think it's an improvement in the test maintenance work
vs needing to fix things up later.
We have some other tests besides turnip that can trigger hangchecks which
we might also like this increase for (some disabled traces, for example).
However, freedreno GL has a 5-second timeout waiting for idle when
mapping, and a couple of 2-second timeouts in a row can result in spurious
failures in other tests!
Fixes: #6163
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15435>
For every CI job, put JWT content into a file and unset CI_JOB_JWT
environment var
=======
* virgl jobs:
- Share JWT token file to crosvm instance
- Keep using `export -p` due to high complexity in the scripts
of these jobs. At least, the CI_JOB_JWT will not be leaked,
since it is being unset at the `before_script` phase of each
Mesa CI job.
* iris jobs: Update lava_job_submitter to take token file as argument
- generate-env with CI_JOB_JWT_TOKEN_FILE
- create token file during baremetal init stage
* baremetal jobs: Copy token file to bare-metal NFS
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14004>
Whilst we want to reuse the same init and job environment for LAVA and
bare-metal, LAVA needs to additionally inject wget and tar jobs, so we
can actually get our per-job environment, as the rootfs we run in is
just the container-generated base rootfs.
Split the init script into two stages, with the first stage doing very
base bringup of devices and networking, and the second stage setting the
job environment and running the jobs.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>