The image used for test jobs is only about 1/6 as big as before, which
may help avoid some issues with some of the test boards.
Inspired by https://gitlab.freedesktop.org/mesa/mesa/issues/2046 .
v2:
* Leave LIBDRM_VERSION at 2.4.99 (Daniel Stone)
* Delete more build artifacts from dEQP tree (Daniel Stone)
v3:
* Set LD_LIBRARY_PATH for ldd
Acked-by: Daniel Stone <daniels@collabora.com> # v2
Reviewed-by: Eric Anholt <eric@anholt.net> # Except for the ldd line
$PWD doesn't work in variables:; it ended up as "/ccache", so we
always started with an empty cache.
v2:
* Use relative path and realpath
v3:
* Use $CI_PROJECT_DIR (Eric Anholt)
* Clear ccache stats in before_script if the cache is in $CI_PROJECT_DIR
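Roughly what the v3 setup looks like in .gitlab-ci.yml (a sketch; the
variable name and the exact before_script line are assumptions, not
the literal diff):

    variables:
      CCACHE_DIR: "$CI_PROJECT_DIR/ccache"
    before_script:
      # only useful when the cache really lives inside the project dir
      - ccache --zero-stats || true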
Fixes: c9df92bf79 "ci: Switch over to an autoscaling GKE cluster for builds."
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
The GKE pool we're using is 1-3 32-core VMs, preemptible (to keep
costs down), with 8 jobs concurrent per system. We have plenty of
memory (4G/core), so we run make -j8 to try to keep the cores busy even
when one job is in a single-threaded step (docker image download, git
clone, artifacts processing, etc.). When all jobs are generating work
for all the cores, they'll be scheduled fairly.
The nodes in the pool have 300GB boot disks (over-provisioned in space
to provide enough iops and throughput) mounted to /ccache, and
CACHE_DIR set pointing to them. This means that once a new
autoscaled-up node has run some jobs, it should have a hot ccache from
then on (instead of having to rely on the docker container cache
having our ccache lying around and not getting wiped out by some
other fd.o job). Local SSDs would provide higher performance, but
unfortunately are not supported with the cluster autoscaler.
For now, the softpipe/llvmpipe test runs are still on the shared
runners, until I can get them ported onto Bas's runner so they can be
parallelized in a single job.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Documentation-only changes, for example, cannot affect the outcome of
the pipeline, so don't waste resources on running it.
The thing we need to be careful about here is that the container stage
jobs must always run if any later stage jobs using the corresponding
docker images run. We're currently using the same .ci-run-policy
template for all jobs, so this is trivially true.
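The shape of the policy, sketched (the path list here is illustrative
and much shorter than the real one):

    .ci-run-policy:
      only:
        changes:
          - .gitlab-ci.yml
          - .gitlab-ci/**/*
          - include/**/*
          - meson.build
          - src/**/*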
v2:
* Add bin/ and common.py (Eric Engestrom)
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> # v1
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Commit 9edcce2a32 bumped the required libdrm-amdgpu version to
2.4.100. Update the version we use in our CI scripts to avoid CI
build failures.
Also bump the debian image name for this change to take effect.
Note that amdgpu is only built with the debian-buster image,
so only this image requires an update.
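In .gitlab-ci.yml terms the change amounts to something like the
following (variable names and the tag value are illustrative, not the
literal diff):

    variables:
      DEBIAN_BUSTER_TAG: "2019-10-23-libdrm-2.4.100"  # bumping this rebuilds the image
      LIBDRM_VERSION: "2.4.100"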
Fixes: 9edcce2a ("ac: get tcc_harvested from the kernel")
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
When resolving a merge conflict, I accidentally updated only the
ARM64 tag. Let's correct this.
Fixes: 3d529c1739 ("gitlab-ci: also build Zink on CI")
This adds a new CI job that runs on Windows with MSVC. It currently
builds softpipe and osmesa, and runs the related unit tests. It does
rely on meson's wraps for zlib, but I've set up caching of the wrap
dependencies, so hopefully that won't be a problem.
I really wanted to use PowerShell for this, but there just isn't an
easy way to do that; it's much easier to use batch scripts, so that's
what I used.
The leading `/` for .gitlab-ci/lava... must be removed because Windows
doesn't understand it, and the job errors out when it reads the file.
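A rough sketch of the kind of job this adds (job name, tags, and
script path are illustrative; the wrap cache path is meson's default
download location):

    windows-msvc:
      stage: build
      tags:
        - windows
      cache:
        key: windows-wraps
        paths:
          - subprojects/packagecache/
      script:
        - .gitlab-ci\windows\mesa_build.bat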
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
At this point meson should be able to handle all of the non-Windows
platforms just fine; we'd like to be able to stop maintaining scons for
those platforms sooner rather than later.
Reviewed-by: Eric Anholt <eric@anholt.net>
v2:
* Use LLVM 8 from buster-backports
v3:
* Use LLVM 7 again for armhf, as llvmpipe is still broken there with LLVM 8
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
This allows running the regression tests.
One downside is that we can't easily build the Vulkan overlay layer,
because only x86 binaries of the glslang validator are available. If
that's important, we could either use those binaries via qemu, or build
it from source.
v2:
* Add :amd64 suffix to existing debian-9/10 job names (Eric Engestrom)
Acked-by: Eric Engestrom <eric.engestrom@intel.com> # v1
Apparently needs: in a definition overwrites inherited ones. So
.deqp-test effectively didn't declare needs: for debian-10, which means
any jobs based on .deqp-test could spuriously run after the debian-10
job failed or was cancelled.
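Sketch of the fix (job names illustrative; the real needs: lists are
longer): since needs: is replaced rather than merged when inheriting,
the container job has to be repeated explicitly:

    .deqp-test:
      needs:
        - meson-arm64
        - debian-10    # must be listed again, inheritance won't add it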
The one debian provides is broken in buster+, so I've just written my
own. This allows meson to find the installed zlib and prevents it from
falling back to wraps.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
It simplifies the definitions of jobs using the Debian 10 image.
The needs: was previously missing from the llvmpipe/softpipe test jobs,
so they could spuriously run if the debian-10 job failed or was
cancelled.
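One way to picture the template this refers to (template and variable
names are assumptions):

    .use-debian-10:
      image: "$DEBIAN_10_IMAGE"
      needs:
        - debian-10

Jobs using the Debian 10 image then just extend this template instead
of repeating the image and needs: themselves.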
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
In preparation for testing drivers other than Panfrost in LAVA labs.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Include Panfrost's gitlab-ci.yml file from Mesa's main .gitlab-ci.yml so
we test on devices with Panfrost.
This uses LAVA to schedule jobs on the devices and will be the base for
testing Etnaviv, Lima, etc.
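In .gitlab-ci.yml this is just an include (path illustrative):

    include:
      - local: 'src/gallium/drivers/panfrost/ci/gitlab-ci.yml'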
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Use a separate ccache cache per job instead of a single cache shared
between all jobs, but reduce the maximum cache size to 1.5G (from 5G).
Rationale for smaller cache:
Pulling & pushing a 5G cache could take a long time. Consider
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/684010 (click the "Show
complete raw" button to see timestamps): Pulling the cache took
1569927241-1569927194 = 47 seconds, pushing it 1569927671-1569927519
= 152 seconds, for a total of 199 seconds. The actual build took a
comparable 1569927518-1569927243 = 275 seconds, despite no cache hits
from ccache.
In other words, the cache transfers almost doubled the job duration,
and they would have negated any build time benefits from ccache even
with a high cache hit rate.
Also, the smaller caches avoid blowing up storage requirements for them
too much.
Rationale for per-job caches:
Making a single cache significantly smaller might result in cached
build products from one job getting evicted by another job, reducing
the likelihood of cache hits from previous pipelines.
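Sketch of the resulting cache setup (the per-job key naming is an
assumption; the size limit is the one discussed above):

    cache:
      key: "${CI_JOB_NAME}"
      paths:
        - ccache/
    before_script:
      - ccache --max-size=1500M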
v2:
* Move up "ccache --max-size=1500M" call (Eric Engestrom)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Without this, it was theoretically possible for the jobs to run before
the docker image was ready.
v2:
* Use - list syntax instead of [] (Eric Engestrom)
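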
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This allows most build jobs to run before the stretch or arm64 docker
images are ready.
v2:
* Use - list syntax instead of [] (Eric Engestrom)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This allows the *-old-llvm jobs to run before the buster docker images
are ready.
v2:
* Use - list syntax instead of [] (Eric Engestrom)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Pros:
* Less fragile due to not mixing packages from stretch and buster
* No longer need to use third-party LLVM packages
* The buster image now uses GCC 8 for C++ as well (previously 6 for C++,
8 for C), allowing us to drop some hacks
Cons:
* The stretch image now only uses GCC 6 for C as well as C++
* Need separate jobs for testing old LLVM versions
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Yes, some tests fail, but we can turn those into XFAILs at meson time.
Better to keep the things that work working than not cover them at all.
Unfortunately XPASS results will not cause the build to fail until we
update CI to meson 0.51 or newer.
Reviewed-by: Daniel Stone <daniels@collabora.com>
This might allow the arm64 tests to start running earlier.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
v2:
* Preserve setting NIR_VALIDATE=0 for all arm64_* jobs
* Preserve setting DEQP_SKIPS=deqp-default-skips.txt for
arm64_a306_gles2 jobs
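Roughly where the preserved settings end up (template/job layout is
illustrative):

    # shared by all arm64_* jobs (template name assumed)
    .arm64-test:
      variables:
        NIR_VALIDATE: 0

    arm64_a306_gles2:
      variables:
        DEQP_SKIPS: deqp-default-skips.txt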
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1
Reviewed-by: Eric Anholt <eric@anholt.net>
Support for multiple inheritance was added to GitLab recently.
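For example, a job can now extend both the run policy and a test
template in one go (the combination shown is illustrative):

    arm64_a306_gles2:
      extends:
        - .ci-run-policy
        - .deqp-test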
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This allows the arm64_a306_gles2 jobs to run as soon as the meson-arm64
job has finished.
Fixes: 6f0dc087b7 "freedreno: Introduce gitlab-based CI."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Since freedreno's kernel and GPU reset seem to be totally solid, we
don't need to have the complexity of the LAVA setup that panfrost has.
Instead, we can register some boards as shared gitlab runners and have
the jobs run out of a docker container just like we do for llvmpipe.
Just make sure that the DRI device node is passed through to the
containers in the gitlab config ('devices = ["/dev/dri"]' under
runners.docker).
If a runner fails (networking dies, kernel panic, etc.), it'll take out
one build but the rest can keep going, since gitlab-runner is what
pulls jobs. Because the runner pulls jobs, the boards can also live
behind firewalls instead of needing some public address to be
accessed by gitlab.fd.o.
For now, enable it just on db410c (A307) and cheza (A630), as that's
the hardware I have plenty of. A307 is only testing GLES2 since
running all of GLES3 takes too long for the number of boards I've
brought up.
Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
[ Michel Dänzer: Dropped jessie line from debian-install.sh again ]
The GLES2 CTS takes about 8 minutes of total runtime (at 4-way
parallelism, ~2 minutes in the test stage if runners are free), while
GLES3 takes
about 25. Since the GLES3 run is pretty expensive, just do a cheap
touch test of 1 out of every 10 tests in the test list on MRs, until
we can get the runtime down.
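One cheap way to express that 1-in-10 sampling (a sketch; file names
are made up and the runner script may do it differently):

    script:
      # keep every 10th test from the full caselist for MR pipelines
      - sed -n '1~10p' gles3-full-caselist.txt > gles3-mr-caselist.txt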
v2: Drop the full run for now until we can bring runtime down or bring
up a dedicated mesa runner.
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1)
Reviewed-by: Gert Wollny <gert.wollny@collabora.com> (v1)