hang-detection is a lightweight Vulkan-based wrapper from
parallel-deqp-runner that periodically submits empty command buffers
and waits for them to complete. If a completion never arrives, the
GPU is considered hung, the wrapped script is killed, and the job
should then abort.
This should have no negative impact on the runtime of dEQP/traces/...,
but it saves time when the GPU hangs, as we can abort the job
immediately rather than waiting for the timeout.
In the case of B2C, we use this tool's error message to trigger a
reboot of the test machine and start again.
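The watchdog pattern described above can be sketched in Python as follows. This is only an illustration, not the parallel-deqp-runner implementation: `probe` is a hypothetical stand-in for "submit an empty command buffer and wait for it", and the function names, interval, and timeout values are assumptions.

```python
import subprocess
import threading


def _run_probe(probe, done):
    # Runs in a helper thread so a probe that never returns cannot block us.
    # A probe that raises never sets the event and is treated as a failure.
    probe()
    done.set()


def run_with_hang_detection(cmd, probe, interval=1.0, probe_timeout=5.0):
    """Run cmd and kill it if probe() does not finish within probe_timeout.

    Returns (returncode, hung).
    """
    proc = subprocess.Popen(cmd)
    hung = False
    while proc.poll() is None:
        done = threading.Event()
        threading.Thread(
            target=_run_probe, args=(probe, done), daemon=True
        ).start()
        if not done.wait(probe_timeout):
            # The probe never completed: consider the GPU hung and
            # kill the wrapped command right away.
            hung = True
            proc.kill()
            proc.wait()
            break
        try:
            # Wait until the next probe is due, or stop early if cmd exits.
            proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            pass
    return proc.returncode, hung
```

A caller would treat `hung == True` as the signal to abort the job (or, in a B2C-like setup, to reboot the machine) instead of waiting for the job timeout.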
v2:
- Already use hang-detection with some jobs (Martin).
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11087>
When we remove the contents of the results directory, we `cd` into it.
The script expects $PWD to be /piglit and $OLDPWD to be the Mesa build
directory. However, the cd into the results directory leaves $OLDPWD
set to $BUILDDIR/results.
This means that Piglit emits into results/results/ which looks weird,
but more importantly also fails OpenCL Piglit execution, because we
can't find our baseline result expectations.
Fix it by using an explicit variable rather than relying on the
shell's directory history.
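The problem can be illustrated with a toy Python model of how a POSIX shell's `cd` updates PWD and OLDPWD. The `Shell` class, the `/builds/mesa` path, and the exact order of the `cd` calls are assumptions made for the illustration; they are not taken from the actual script.

```python
class Shell:
    """Toy model of the PWD/OLDPWD bookkeeping done by `cd`."""

    def __init__(self, pwd):
        self.pwd = pwd
        self.oldpwd = None

    def cd(self, path):
        # Every cd overwrites OLDPWD with the directory we just left.
        self.oldpwd, self.pwd = self.pwd, path


BUILDDIR = "/builds/mesa"  # hypothetical build directory

sh = Shell(BUILDDIR)
sh.cd(BUILDDIR + "/results")  # enter results/ to delete its contents
sh.cd("/piglit")              # then move to where Piglit runs

# Relying on history: OLDPWD is now the results directory, not BUILDDIR.
fragile_output = sh.oldpwd + "/results"

# Relying on an explicit variable: unaffected by intermediate cd calls.
robust_output = BUILDDIR + "/results"
```

Here `fragile_output` ends up as `/builds/mesa/results/results`, while `robust_output` stays `/builds/mesa/results`.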
Fixes: 683ddf19dc ("ci: remove results directory content only with piglit runners")
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10856
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11126>
Convert the job submission process to a Python script for more
robustness and control, allowing easier manipulation of job data.
This also adds retry logic to deal with Infrastructure Errors in LAVA.
_call_proxy() is equipped with robust retry logic, which I have
already been using over the past few weeks in stress tests running
hundreds of jobs.
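A minimal sketch of this kind of retry logic is shown below. It is not the actual _call_proxy() code: the function name, the default retry count, and the linear back-off are assumptions chosen for illustration.

```python
import time


def call_with_retries(func, *args, retries=3, delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(*args), retrying on the given exceptions.

    Waits delay * attempt seconds between attempts and re-raises the
    last exception once all attempts are used up.
    """
    for attempt in range(1, retries + 1):
        try:
            return func(*args)
        except retry_on:
            if attempt == retries:
                raise
            sleep(delay * attempt)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays; a caller talking to a remote service would typically narrow `retry_on` to the transport errors it expects.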
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11079>
Use Piglit's replay profile to measure and store the time that frames
take to render on the GPU.
This job won't run automatically in regular pipelines, but will be
triggered automatically by a script for every successful pre-merge
pipeline.
This is because we want to generate performance data for every relevant
commit merged in main, but we don't want to keep a device busy during
the pre-merge run.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7987>
To reduce the amount of building work and network traffic, we use
docker caching. To that end, the MESA_IMAGE_TAG and MESA_BASE_TAG
environment variables are used to build the MESA_IMAGE variable, which
identifies the different containers.
Through the DISTRIBUTION_TAG environment variable, we also use these
tags to identify the cached artifacts produced by other containers
when those artifacts are part of the underlying OS that runs directly
on DUTs.
The undesirable side effect is that a test job using one container
cannot make use of cached artifacts created by another container. In
other words, a job cannot combine a DISTRIBUTION_TAG and a MESA_IMAGE
that rely on different MESA_[IMAGE|BASE]_TAG values.
Now, split the two uses by defining MESA_ARTIFACTS_TAG and
MESA_ARTIFACTS_BASE_TAG for use in DISTRIBUTION_TAG.
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10977>
ci-fairy minio ls tries to list the files in the given path, which is
generally forbidden for trace buckets. We don't really need to do any
listing in this case, so use wget instead to check whether the
reference image already exists.
Before this patch, trace jobs would re-upload all reference images to
minio every time, because they could not verify that a reference image
was already there. Jobs would often spend up to 4 minutes needlessly
re-uploading these files.
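The idea of probing one object directly instead of listing the bucket can be sketched in Python as below. The patch itself uses wget; this is only an analogous illustration, and the function name, the HEAD request, the 30-second timeout, and the choice to treat every HTTP error as "not there" are assumptions.

```python
import urllib.error
import urllib.request


def reference_exists(url, opener=urllib.request.urlopen):
    """Return True if a direct request for url succeeds.

    Only the object itself is requested, so no permission to list the
    bucket is needed. Any HTTP error is treated as "not there", which
    at worst causes a redundant upload.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=30) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```

A job would then upload a reference image only when `reference_exists()` returns False for its URL.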
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10953>
Panfrost has two compilers, one for Midgard GPUs and one for Bifrost
GPUs, which live in src/panfrost/midgard and src/panfrost/bifrost
respectively. Changes internal to just one compiler (or
disassembler) cannot affect the other hardware, so there's no need to
run extra jobs in these cases.
Also split out common vs Gallium panfrost so we can do the right thing
for panvk builds in the imminent future.
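The path-based selection described above can be modelled with a small Python function. The real selection is made in the CI configuration, not in Python; the function name and the "midgard"/"bifrost" labels are hypothetical, and only the two directory names come from this message.

```python
MIDGARD_ONLY = "src/panfrost/midgard/"
BIFROST_ONLY = "src/panfrost/bifrost/"
SHARED = "src/panfrost/"


def hardware_to_test(changed_paths):
    """Return which GPU families need test jobs for the changed paths."""
    targets = set()
    for path in changed_paths:
        if path.startswith(MIDGARD_ONLY):
            # Internal to the Midgard compiler: Bifrost is unaffected.
            targets.add("midgard")
        elif path.startswith(BIFROST_ONLY):
            # Internal to the Bifrost compiler: Midgard is unaffected.
            targets.add("bifrost")
        elif path.startswith(SHARED):
            # Any other Panfrost code may affect both families.
            targets.update({"midgard", "bifrost"})
    return targets
```

The order of the checks matters: the two compiler directories are tested before the broader `src/panfrost/` prefix, so a compiler-only change never falls through to the "both" case.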
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10924>
Removing the directory itself can be problematic with certain runner
strategies (B2C).
v2:
- Use a better pattern for the deletion, since the previous one was
problematic and /bin/sh did not flag it, as noticed by Emma.
v3:
- Check that the results directory exists before attempting to
delete its content.
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10856>
Only the AMD video drivers for Xorg are added, since there are no
other expected users for now.
v2:
- Remove the start/stop logic from the x.sh script. We don't care
about stopping since that's already managed by gitlab-ci (Emma).
v3:
- Remove mistakenly added ".gitlab-ci/common/start-x.sh"
script (Martin).
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10856>
When I last fixed this, I missed that continuing here would make us
leak pointers in the translate state, which is what made this avoid a
crash in the first place.
That's not great; we need to set *some* pointer in this case. The
obvious option would be NULL, but that means the translate code would
also need to support NULL pointers here.
Instead, let's point to a small, static buffer that contains enough
zero-data for the largest possible vertex attribute. This avoids having
to add more NULL-checks.
Fixes: a8e8204b18 ("gallium/u_vbuf: support NULL-resources")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7773>