Compare commits

693 Commits

Author SHA1 Message Date
Eric Engestrom
35721f1986 docs: add sha sum for 25.0.7 2025-05-28 17:35:48 +02:00
Eric Engestrom
742a20f48c VERSION: bump for 25.0.7 2025-05-28 17:20:23 +02:00
Eric Engestrom
e0614f32a3 docs: add release notes for 25.0.7 2025-05-28 17:20:23 +02:00
Marek Olšák
6692869151 glsl: fix sampler and image type checking in lower_precision
Use the param type, not the referenced variable. The referenced variable
can be a structure, which wouldn't be recognized as a sampler or image.

Fixes: 733bee57eb - glsl: lower samplers with highp coordinates correctly

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> on gfx8 (Polaris 20)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34959>
(cherry picked from commit bd5d623674)
2025-05-28 15:43:52 +02:00
Marek Olšák
4a43d723b5 winsys/amdgpu: fix running out of 32bit address space with high FPS
Reproduced with gfxbench5 gl_tess_off.

Fixes: 4d486888ee - winsys/amdgpu: rewrite BO fence tracking by adding a new queue fence system

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34983>
(cherry picked from commit 4bf2a28334)
2025-05-28 15:23:15 +02:00
Samuel Pitoiset
c83871ccfa radv: add radv_disable_hiz_his_gfx12 and enable for Mafia Definitive Edition
This is a workaround for random GPU hangs with HiZ/HiS on GFX12
because the correct fix is complex and it will take time to be
implemented properly.

Mafia Definitive Edition is the first known game affected by this.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13222
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35182>
(cherry picked from commit 2ebfa64be7)
2025-05-28 15:23:05 +02:00
Samuel Pitoiset
3a98f0e86f radv: fix capture/replay with sparse images and descriptor buffer
The sparse image VA needs to be returned to the application for replay.

Reported by Baldur.

VKCTS has coverage but it doesn't verify this yet.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35162>
(cherry picked from commit 63758bc093)
2025-05-28 15:22:59 +02:00
Erik Faye-Lund
726cfb8a41 mesa/main: remove non-existing function prototype
This function was removed about a decade ago, let's get rid of the
prototype as well!

Fixes: a347a0f53f ("mesa: Completely remove QuerySamplesForFormat from driver func table")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
(cherry picked from commit 439b88c619)
2025-05-28 15:22:59 +02:00
Adam Jackson
2942d3714e vtn/opencl: Handle OpenCLstd_F{Min,Max}_common
Normal fmin doesn't make any promises about NaN, common additionally
doesn't make any promises about infinities. Would be nice to hook that
up to codegen but lowering them to normal works for now.

Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34941>
(cherry picked from commit 4b1c824b67)
2025-05-28 15:22:59 +02:00
Adam Jackson
a23171c02b vtn: (Silently) handle FunctionParameterAttributeNo{Capture,Write}
Silences a few thousand warnings in sycl/test-e2e

Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34941>
(cherry picked from commit 92f07860a4)
2025-05-28 15:22:59 +02:00
Faith Ekstrand
3ccf7682e2 nouveau/mme: Don't install the HW tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35163>
(cherry picked from commit 26ba29f75b)
2025-05-28 15:22:59 +02:00
Faith Ekstrand
5c56ee02a8 nvk: Allocate the correct VAB size on Kepler
We were allocating 128 KiB but claimed 256 KiB.  Allocate the right size
and assert that the size matches.

Fixes: 970bd70584 ("nvk: allocate VAB memory area")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35172>
(cherry picked from commit 9fe2a21e93)
2025-05-28 15:22:59 +02:00
Patrick Lerda
a110ef1391 r600: fix pop-free clipping
This update is aimed at fixing pop-free clipping and follows
the advice from Vitaliy Kuzmin: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12440

This functionality requires calculating the value of the following two
registers: PA_CL_GB_HORZ_DISC_ADJ and PA_CL_GB_VERT_DISC_ADJ. These two
registers are available on all the gpus of the r600 family.

This code is built on the backport of radeonsi updates which are relevant
to this very functionality:
57e658d041 "radeonsi: rework how guardband registers are updated to decrease overhead"
146c2b7c28 "radeonsi: adjust clip discard based on line width / point size"
4d74432dd3 "radeonsi: don't discard points and lines"
63680471f9 "radeonsi: remove si_context::{scissor_enabled,clip_halfz}"

This change was tested on rv770, barts and cayman:
deqp-gles[2-3]/functional/clipping/line/wide_line_clip_viewport_center: fail -> pass
deqp-gles[2-3]/functional/clipping/line/wide_line_clip_viewport_corner: fail -> pass
deqp-gles[2-3]/functional/clipping/point/wide_point_clip: fail -> pass
deqp-gles[2-3]/functional/clipping/point/wide_point_clip_viewport_center: fail -> pass
deqp-gles[2-3]/functional/clipping/point/wide_point_clip_viewport_corner: fail -> pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35052>
(cherry picked from commit df2c774a83)
2025-05-28 15:22:59 +02:00
Qiang Yu
e07cea0be5 nir/opt_varyings: fix mesh shader miss promote varying to flat
We still allow mesh shaders to promote constant outputs to flat, but
a mesh shader, like a geometry shader, may store the varyings of
multiple vertices in a single thread. A mesh shader may therefore
store different constant values to different vertices in a single
thread, and we should not promote this case to flat.

I'm not using shader_info.mesh.ms_cross_invocation_output_access
because OpenGL does not require IO to have an explicit location, so
when nir_shader_gather_info is called in the OpenGL GLSL compiler to
compute ms_cross_invocation_output_access, some implicit outputs
have a location of -1, which leaves
ms_cross_invocation_output_access unset for them.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13134
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35081>
(cherry picked from commit 6f2a1e19da)
2025-05-28 15:22:59 +02:00
Timothy Arceri
042736a4d4 util: add workaround for the game Foundation
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12882
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35107>
(cherry picked from commit bf24d56862)
2025-05-28 15:22:59 +02:00
Timothy Arceri
1b7e6d305b mesa: extend linear_as_nearest work around
Here we allow packed stencils to skip the completeness check also.
Will be used in the following patch for a bug in the game Foundation.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35107>
(cherry picked from commit 27945bbd8a)
2025-05-28 15:22:59 +02:00
Mike Blumenkrantz
b00d4807ac lavapipe: handle counterOffset in vkCmdDrawIndirectByteCountEXT
fixes dEQP-VK.transform_feedback.simple.draw_indirect*counter_offset*

cc: mesa-stable

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35076>
(cherry picked from commit 42b303c7b0)
2025-05-28 15:22:59 +02:00
Mike Blumenkrantz
3f251664fd llvmpipe: disable conditional rendering mem for blits
u_blitter doesn't support this, and changing u_blitter to support a niche
lavapipe feature seems like overkill

fixes dEQP-VK.conditional_rendering.conditional_ignore.resolve_image*

cc: mesa-stable

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35076>
(cherry picked from commit 753d3e71d3)
2025-05-28 15:22:58 +02:00
Lionel Landwerlin
3f68db8d8c anv: don't use pipeline layout at descriptor bind
An application is allowed to bind an empty descriptor set in a place
where a pipeline layout has no descriptor set layout. For example:

  pipeline_layout_A :
     set0 : NULL
     set1 : descriptor_set_layout_A

  vkCmdBindDescriptor :
     set0 : descriptor_set_B (with layout bindingCount=0)
     set1 : descriptor_set_C (compatible with descriptor_set_layout_A)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13227
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35125>
(cherry picked from commit 39f55541a3)
2025-05-28 15:22:58 +02:00
Paulo Zanoni
32c297e991 anv/trtt: don't avoid the TR-TT submission when there is stuff to signal
When an application issues a sparse binding operation, it may be the
case that the state the app is setting is the state that is already
there. In that case, both n_l3l2_binds and n_l1_binds are zero, so the
batch doesn't contain anything and, since 0802bbd486, we just skip
the batch submission and return.

The problem is that skipping the batch submission and returning
ignores the synchronization: there may be syncobjs that we have to
wait on and, more importantly, there may be syncobjs that we have to
signal.

This case is exercised by vkd3d-proton's test suite, but I'm not aware
of any other workload that triggers it. This commit only affects
Meteor Lake and older, as TR-TT is only the default behavior for the
platforms running i915.ko.

Testcase: vkd3d-proton/d3d12/test_sparse_buffer_memory_lifetime
Fixes: 0802bbd486 ("anv/trtt: don't submit empty batches when there are no binds to do")
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35078>
(cherry picked from commit d77b49eb0a)
2025-05-28 15:22:58 +02:00
Calder Young
25316916b1 iris: set dependency between SF_CL and CC states
Applied the fix from commit 3a54e9f6 to the Iris Gallium driver

Fixes: bc42bbff4c ("iris: Wa_14016820455 for GFX_VERx10 == 12.5")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35082>
(cherry picked from commit b0eb715b50)
2025-05-28 15:22:58 +02:00
Calder Young
40fcde7e88 iris: Fix accidental writes to global dirty bit instead of local
Fixes: 0e9a26372b ("iris: implement Wa_14018912822")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35102>
(cherry picked from commit 8547f8b557)
2025-05-28 15:22:58 +02:00
Lionel Landwerlin
ce9993749e hasvk/elk: stop turning load_push_constants into load_uniform
Those intrinsics have different semantics in particular with regards
to divergence. Turning one into the other without invalidating the
divergence information breaks NIR validation. But also the conversion
means we get artificially less convergent values in the shaders.

So just handle load_push_constants in the backend and stop changing
things in Hasvk.

Fixes a bunch of tests in
   dEQP-VK.descriptor_indexing.*
   dEQP-VK.pipeline.*.push_constant.graphics_pipeline.dynamic_index_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34546>
(cherry picked from commit b036d2ded2)
2025-05-28 15:22:58 +02:00
Lionel Landwerlin
db2163188f anv/brw: stop turning load_push_constants into load_uniform
Those intrinsics have different semantics in particular with regards
to divergence. Turning one into the other without invalidating the
divergence information breaks NIR validation. But also the conversion
means we get artificially less convergent values in the shaders.

So just handle load_push_constants in the backend and stop changing
things in Anv.

Fixes a bunch of tests in
   dEQP-VK.descriptor_indexing.*
   dEQP-VK.pipeline.*.push_constant.graphics_pipeline.dynamic_index_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34546>
(cherry picked from commit df15968813)
2025-05-28 15:22:58 +02:00
Samuel Pitoiset
418569627a radv: fix missing texel scale for unaligned linear SDMA copies
texel_scale was 0 which caused GPU hangs for unaligned linear copies.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13195
Fixes: 4b73d7e817 ("radv: fix SDMA copies for linear 96-bits formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35047>
(cherry picked from commit c22d86e844)
2025-05-28 15:22:58 +02:00
Rob Clark
4d3fd27189 ci: Disable fd-farm
Take the google farm offline in preparation for shipping.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35093>
(cherry picked from commit 45a2f02876)
2025-05-28 15:22:58 +02:00
Timothy Arceri
6bbd009aae mesa: update validation when draw buffer changes
Otherwise validation that depends on the _IntegerDrawBuffers and
_FP32DrawBuffers bitfield can end up stale.

Fixes: d04d9da98c ("st/mesa: fix _IntegerBuffers bitfield use")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35063>
(cherry picked from commit 3ec68e8382)
2025-05-28 15:22:58 +02:00
Karol Herbst
5a78231290 vtn: fix use-after-free on function parameter names
Fixes: 5d7a230324 ("vtn: gather function parameter names")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35098>
(cherry picked from commit abbb0c0125)
2025-05-28 15:22:57 +02:00
Karol Herbst
b456049782 nir: fix use-after-free on function parameter names
Fixes: 3da8444be5 ("nir: add names to function parameters")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35098>
(cherry picked from commit bc444f6d26)
2025-05-28 15:22:57 +02:00
Christian Gmeiner
f1672e0be2 zink: Fix NIR validation error in cubemap-to-array lowering
The cubemap-to-array pass was changing variable types from samplerCubeArray
to sampler2DArray but leaving the corresponding deref instruction types
unchanged. This caused NIR validation to fail with "instr->type ==
instr->var->type" assertion.

Fix by updating both the variable type and the deref instruction type
to maintain consistency required by NIR validation.

Cc: mesa-stable
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35117>
(cherry picked from commit 86f7ce06be)
2025-05-28 15:22:57 +02:00
Gurchetan Singh
3b265663aa gfxstream: get rid of logspam in virtualized case
In the case of running a Linux VM using some other capability
set than gfxstream, some logspam may be triggered.  Fix this.

CC: mesa-stable

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35109>
(cherry picked from commit 126af1feb9)
2025-05-28 15:22:57 +02:00
David Rosca
f9b723f0b6 radv/video: Limit 10bit H265 decode support to stoney and newer
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12132
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35105>
(cherry picked from commit 1608bc20b5)
2025-05-28 15:22:57 +02:00
LingMan
4642f2035f etnaviv/isa: Silence warnings about non snake case names
These are benign style warnings. The code is generated by bindgen and it's a bug there that these
names get generated at all.

Silences these warnings since we can't do anything about them:

```
warning: method `use__raw` should have a snake case name
   --> src/etnaviv/isa/isa_bindings.rs:358:19
    |
358 |     pub unsafe fn use__raw(this: *const Self) -> ::std::os::raw::c_uint {
    |                   ^^^^^^^^ help: convert the identifier to snake case: `use_raw`
    |
    = note: `#[warn(non_snake_case)]` on by default

warning: method `use__raw` should have a snake case name
    --> src/etnaviv/isa/isa_bindings.rs:1023:19
     |
1023 |     pub unsafe fn use__raw(this: *const Self) -> ::std::os::raw::c_uint {
     |                   ^^^^^^^^ help: convert the identifier to snake case: `use_raw`
```

Fixes: 15a784689e ("etnaviv: isa: Generate Rust FFI bindings for asm.h")
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34969>
(cherry picked from commit 040ef8f5c9)
2025-05-28 15:22:57 +02:00
Georg Lehmann
601387e0d2 aco: assume sram ecc is enabled on Vega20
There are D16 load issues on Vega20 that are expected if sram ecc is enabled.
It's a professional class chip and I found mentions of it supporting ecc,
so assume it's enabled.

Maybe this could be improved by querying ecc info from the kernel, but
I'm not sure which query should be used.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13189
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12393
Cc: mesa-stable

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35045>
(cherry picked from commit 0257644130)
2025-05-28 15:22:57 +02:00
Rhys Perry
9109dd063b aco/gfx115: consider point sample acceleration
Like 15428e0d786939a5c7629a9978947c8a9112ce96 in LLVM.

fossil-db (gfx1150):
Totals from 909 (1.14% of 79653) affected shaders:
Instrs: 5840489 -> 5840705 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31133460 -> 31134296 (+0.00%); split: -0.00%, +0.00%
Latency: 52982280 -> 53438577 (+0.86%); split: -0.00%, +0.86%
InvThroughput: 10841454 -> 10942682 (+0.93%); split: -0.00%, +0.93%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34935>
(cherry picked from commit 171920ceed)
2025-05-28 15:22:57 +02:00
Matt Turner
791c1ce754 gallivm: Use llvm.roundeven in lp_build_round()
`lp_build_round` intends to implement round with ties-to-even behavior,
as can be seen by its test's use of `nearbyint` to generate reference
values and by its use in implementing `nir_op_fround_even`.

Fixes: 0d3b285360 ("gallivm: use llvm intrinsics for 16-bit round/trunc/roundeven")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34937>
(cherry picked from commit eea3ed6a37)
2025-05-28 15:22:57 +02:00
Timothy Arceri
6865ce622d mesa: fix _FP32Buffers bitfield use
Previously we were assuming that all color attachments were active.

Fixes: 070a5e5d92 ("mesa: add explicit enable for EXT_float_blend, and error condition")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35014>
(cherry picked from commit c7c4905981)
2025-05-28 15:22:57 +02:00
Timothy Arceri
51abc314f2 mesa/st: fix _IsRGBDraw bitfield use
Previously we were assuming that all color attachments were active.

Fixes: 5b51d754d0 ("st/mesa: Optionally override RGB/RGBX dst alpha blend factors")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35014>
(cherry picked from commit b7d8c195a2)
2025-05-28 15:22:56 +02:00
Timothy Arceri
c8da6675ff mesa/st: fix _BlendForceAlphaToOneDraw bitfield use
Previously we were assuming that all color attachments were active.

Fixes: 4f28e2827c ("mesa: fix blending when using luminance/intensity emulation")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35014>
(cherry picked from commit c1d00c9a1a)
2025-05-28 15:22:56 +02:00
Timothy Arceri
443a4a6d15 st/mesa: fix _IntegerBuffers bitfield use
Previously we were assuming that all color attachments were active.

Fixes: 8fb966688b ("st/mesa: Disable blending for integer formats.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13168
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35014>
(cherry picked from commit d04d9da98c)
2025-05-28 15:22:56 +02:00
Lionel Landwerlin
35d535c762 anv: enable preemption setting on command/batch correctly
The 2 helpers we're using for doing internal operations (copies,
command generation, etc...) can work on command buffers or lower level
batches.

When working with command buffers, the helpers should set the
preemption using genX(cmd_buffer_set_preemption) so that whatever
operation comes after toggles the state back to what it needs and we
minimize the toggles.

When working with batches, the helpers should disable preemption using
genX(batch_set_preemption) and turn it back on when done.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35030>
(cherry picked from commit c570740272)
2025-05-28 15:22:56 +02:00
Ella Stanforth
c8f0e53a90 v3d/compiler: Fix ub when using memcmp for texture comparisons.
We need to zero out all memory in the struct otherwise memcmp ends up comparing
padding bytes.
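
The general pattern can be sketched as follows. This is not the v3d compiler code: `struct tex_key` and both helpers are hypothetical, and the sketch relies on the common practical behaviour that padding bytes keep the zero written by `memset` when individual members are assigned afterwards.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key struct. On typical ABIs the compiler inserts three
 * padding bytes between 'kind' and 'index'. */
struct tex_key {
    uint8_t  kind;
    uint32_t index;
};

/* Zero the whole object first, padding included, then fill in members. */
static void tex_key_init(struct tex_key *k, uint8_t kind, uint32_t index)
{
    memset(k, 0, sizeof(*k));
    k->kind = kind;
    k->index = index;
}

/* Byte-wise comparison; only meaningful if both keys went through
 * tex_key_init(), so that their padding bytes are known to be zero. */
static bool tex_key_equal(const struct tex_key *a, const struct tex_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the `memset`, two keys with identical members could still compare unequal because the padding holds whatever was previously in that memory.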

Cc: mesa-stable
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34945>
(cherry picked from commit be3ce07f58)
2025-05-28 15:22:56 +02:00
Olivia Lee
ad4fa97597 util/u_printf: fix memory leak in u_printf_singleton_add_serialized
info->arg_sizes and info->strings were leaked because they were
allocated in the global context.

Fixes: 007f60c8b8 ("util/u_printf: add singleton implementation")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34987>
(cherry picked from commit 22fb7eaa8c)
2025-05-28 15:22:56 +02:00
David Rosca
2e6ea1aaab radeonsi/vce: Fix output quality and performance in speed preset
Fixes: 544a180320 ("radeonsi/vce: Support quality presets")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34894>
(cherry picked from commit bade93c447)
2025-05-28 15:22:56 +02:00
David Rosca
59e73a78bb radeonsi/vce: Only send one task per IB
There is no need to use a second task for config when creating the
session, also it doesn't work now as we don't set the next task
offset in task info anymore.

Fixes: 9ca1cda2be ("radeonsi/vce: Cleanup")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34894>
(cherry picked from commit ad96031ec6)
2025-05-28 15:22:56 +02:00
David Rosca
eb020edc6b radeonsi/vce: Fix bitstream buffer size
On old VCE this was being rejected by kernel because the size here
was the buffer size, but the bitstream buffer address includes the
offset.
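
The arithmetic behind this can be sketched as below. The helper name and parameters are hypothetical and not taken from the radeonsi code; the sketch only assumes that the address handed to the hardware is the buffer base plus an offset, so the size that goes with that address is the remainder of the buffer rather than the full buffer size.

```c
#include <stdint.h>

/* Hypothetical helper: number of bytes available starting at
 * (buffer base + offset). Returns 0 if the offset lies at or past the
 * end of the buffer, so the result can never exceed the allocation. */
static uint64_t bitstream_size_from_offset(uint64_t buffer_size, uint64_t offset)
{
    return offset < buffer_size ? buffer_size - offset : 0;
}
```

For a 4096-byte buffer with a 256-byte offset this yields 3840 bytes, whereas passing 4096 together with the offset address would describe a range that runs past the end of the buffer.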

Fixes: 901aafb030 ("radeonsi/vce: Support raw packed headers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13128
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34894>
(cherry picked from commit fd1480c3df)
2025-05-28 15:22:56 +02:00
Mel Henning
9ad71e37a5 nouveau/headers: Ignore PermissionError in rustfmt
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13172
Fixes: 591b5da4 ("nouveau/headers: Run rustfmt on generated files")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35008>
(cherry picked from commit bfe8340296)
2025-05-28 15:22:56 +02:00
Mel Henning
c8f01c326b nouveau/headers: Run rustfmt after file is closed
If we run a subprocess while the file is still open, we may not have
flushed the file contents to disk.
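
The same ordering issue exists with any buffered file API. The following is a minimal C sketch, not the generator script itself: the helper names are hypothetical, and it assumes a POSIX system where `stat()` reports the size another process would see.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file as another process would currently see it, or -1. */
static long size_on_disk(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Hypothetical demo: write 'text' to 'path' through a buffered stream and
 * return the on-disk size observed either after fclose() (close_first != 0)
 * or while the stream is still open (close_first == 0). */
static long write_then_measure(const char *path, const char *text, int close_first)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    fputs(text, f);

    long seen;
    if (close_first) {
        fclose(f);                  /* flushes the stdio buffer to the file */
        seen = size_on_disk(path);
    } else {
        seen = size_on_disk(path);  /* data may still sit in the buffer */
        fclose(f);
    }
    return seen;
}
```

Measuring after `fclose()` always reports the full length of the text. Measuring while the stream is open may report less, which is the situation a formatter launched as a subprocess would run into.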

Fixes: 591b5da4 ("nouveau/headers: Run rustfmt on generated files")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35008>
(cherry picked from commit da22094593)
2025-05-28 15:22:55 +02:00
José Roberto de Souza
a70e8c5315 anv: Enable preemption due 3DPRIMITIVE in GFX 12
The issues that prevented it from being enabled have been fixed, so we can
now enable it, but we also need to re-enable workaround 16013994831.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
(cherry picked from commit 3cd972a2d3)
2025-05-28 15:22:55 +02:00
José Roberto de Souza
014c3193e1 anv: Implement missing part of Wa_1604061319
The description of this workaround is not clear, but looking at the Iris
implementation, we need to emit all 3DSTATE_PUSH_CONSTANT_ALLOC_XS if
any 3DSTATE_PUSH_CONSTANT_ALLOC_XS is emitted.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
(cherry picked from commit 2432d6677e)
2025-05-28 15:22:55 +02:00
Hans-Kristian Arntzen
2f12a5c531 radv: Consider that DGC might need shader reads of predicated data.
Similar to the indirect draw barrier, conditional rendering access
needs similar fixups.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34956>
(cherry picked from commit e674823d55)
2025-05-28 15:22:55 +02:00
Samuel Pitoiset
0e296d76a1 radv: fix conditional rendering with DGC and non native 32-bit predicate
When the hardware doesn't natively support 32-bit predication, the
driver has a fallback which allocates a 64-bit predicate to the upload
BO in order to copy the original value.

But consider the case where conditional rendering is enabled in the
stateCommandBuffer which is used by preprocess(), and the execute() is
also recorded in the stateCommandBuffer. If the preprocess() is recorded
in a different cmdbuf which is submitted before the cmdbuf that contains
execute(), the fallback (i.e. alloc + COPY_DATA) will be performed
afterwards. This would cause the predicate value to always be 0.

To fix that, keep track of the user predication VA which is the only
VA that needs to be used by DGC because it reads 32-bit from the shader.

This fixes a very weird corner case with vkd3d-proton.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
(cherry picked from commit 3ca2f71f3d)
2025-05-28 15:22:55 +02:00
Samuel Pitoiset
5ac5572bc6 radv: fix fetching conditional rendering state for DGC preprocess
This state must be fetched from the stateCommandBuffer, not from the
current cmdbuf which executes the preprocess().

Partial fix for https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
(cherry picked from commit e2625fa9ca)
2025-05-28 15:22:55 +02:00
Dave Airlie
bf645654b9 nvk: Fix compute class comparison in dispatch indirect
This works by coincidence rather than design.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
(cherry picked from commit bd7777aee6)
2025-05-28 15:22:55 +02:00
Lars-Ivar Hesselberg Simonsen
404cfdd93f panvk/v9+: Set up limited texture descs for storage use
Storage access to images using LEA_TEX[_IMM] has limitations on some
fields in the texture descriptors, making them incompatible with the
descriptors required for texture access, specifically in the case of
non-zero levels.

This change sets up two sets of texture descriptors for image views of
storage images, then picks the correct one when writing the image view
descriptors.

Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
(cherry picked from commit 7451bc3bef)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35025>
2025-05-28 15:22:55 +02:00
Eric Engestrom
3ccab9e7ae .pick_status.json: Update to 8965e60118 2025-05-28 15:22:52 +02:00
Eric Engestrom
5286ddbca1 .pick_status.json: Mark 29d7b90cfc as denominated 2025-05-25 21:27:01 +02:00
Georg Lehmann
2d525d5e34 radeonsi: always lower alu bit sizes
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13072

load_vs_input_from_vertex_buffer can create unsupported 16bit shifts on GFX6/7.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Cc: mesa-stable
(cherry picked from commit 33b5d8b2ec)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35150>
2025-05-25 21:27:01 +02:00
Lars-Ivar Hesselberg Simonsen
f1810cd9e7 panvk/v9+: Set up limited texture descs for storage use
Storage access to images using LEA_TEX[_IMM] has limitations on some
fields in the texture descriptors, making them incompatible with the
descriptors required for texture access, specifically in the case of
non-zero levels.

This change sets up two sets of texture descriptors for image views of
storage images, then picks the correct one when writing the image view
descriptors.

Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
(cherry picked from commit 7451bc3bef)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35025>
2025-05-18 02:49:30 +00:00
Natalie Vock
5248c792ce driconf: Fix DOOM: The Dark Ages workaround name in 25.0.x
Before 25.1, it's radv_legacy_sparse_binding, not
radv_disable_dedicated_sparse_queue.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34998>
2025-05-15 14:41:56 +02:00
Eric Engestrom
5a92849e1d docs: add sha sum for 25.0.6 2025-05-14 19:14:15 +02:00
Eric Engestrom
d0b545b3ef VERSION: bump for 25.0.6 2025-05-14 19:05:26 +02:00
Eric Engestrom
64ef24064b docs: add release notes for 25.0.6 2025-05-14 19:05:26 +02:00
Timothy Arceri
ab1edf76ed mesa: relax EXT_texture_integer validation
This updates mesa to avoid throwing an error if an attached fbo
won't actually be drawn into.

Fixes: 705978e283 ("mesa: do integer FB / shader validation check in _mesa_valid_to_render()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13144
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34949>
(cherry picked from commit 1d4ebe79b5)
2025-05-14 19:04:31 +02:00
Thomas H.P. Andersen
349353dde6 driconf: update X4 Foundations executable name
'X4.exe' is the executable. But there is also a script 'X4' that is used to
launch the game. This script is what steam uses.
This updates driconf to match that.
This also brings the executable in line with other configs for the game.

Fixes: 5532f13566 ("driconf: override vendor id for X4 Foundations on NVK")
Fixes: 8654a7727f ("driconf: set vk_zero_vram driconf for X4 Foundations")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34168>
(cherry picked from commit a87c9bc49e)
2025-05-14 19:04:31 +02:00
Samuel Pitoiset
7b3e432d90 radv: remove the optimization for equal immutable samplers
This optimization used to optimize the allocated space for descriptors
when immutable samplers are equal. However, this was basically broken:

- descriptor copies were broken for combined image sampler (or sampler)
  with equal immutable samplers because 96 bytes were copied instead of
  64 bytes (cf. the linked ticket). This could be fixed but it's not
  worth it.
- the value returned by vkGetDescriptorLayoutSupport() was broken, it
  should have been 96 with no immutable samplers (or when they aren't
  equal)

This optimization was also not applied for descriptor buffers which is
the default for vkd3d-proton and Zink. DXVK doesn't use db but it
doesn't use immutable samplers, so basically only native vulkan games
would be concerned.

Note that immutable samplers are still inlined in shaders when there is
no indirect access, which should cover 99.9% of use cases.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11165
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34928>
(cherry picked from commit 69ff204422)
2025-05-14 19:04:31 +02:00
Samuel Pitoiset
db4b914bd5 radv: fix emitting dynamic viewports/scissors when the count is static
In a scenario where the viewports/scissors are a dynamic state but the
count is static (i.e. updated when a graphics pipeline is bound), the
driver wasn't considering that and it was re-emitting the previous
number of viewports/scissors.

This fixes a rendering issue with Blender.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13127
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34921>
(cherry picked from commit 9a07ccbc89)
2025-05-14 19:04:31 +02:00
David Rosca
63b0a527ed radv/video: Use ac_uvd_alloc_stream_handle
ac_uvd_alloc_stream_handle tries to avoid collisions in the case
when PID is not unique (eg. in sandboxes like Flatpak).

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12607
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34807>
(cherry picked from commit 5fee04bcae)
2025-05-14 19:04:31 +02:00
David Rosca
30367ce279 ac/uvd: Add ac_uvd_alloc_stream_handle
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34807>
(cherry picked from commit 69455e8208)
2025-05-14 19:04:31 +02:00
Natalie Vock
45235bf73c driconf: Add workarounds for DOOM: The Dark Ages
Like other idTech games, it needs radv_zero_vram and
radv_disable_dedicated_sparse_queue. It also needs
radv_force_64k_sparse_alignment.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34944>
(cherry picked from commit 4339cf0aff)
2025-05-14 19:04:31 +02:00
Natalie Vock
4c95ff61ca radv,driconf: Add radv_force_64k_sparse_alignment config
Needed by DOOM: The Dark Ages.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34944>
(cherry picked from commit e32a90b57c)
2025-05-14 19:04:31 +02:00
Eric Engestrom
63e9748a4f .pick_status.json: Mark 4b76d04f7f as denominated 2025-05-14 19:04:30 +02:00
Tapani Pälli
139c068957 mesa: add missing stencil formats to _mesa_is_stencil_format
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13070
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34931>
(cherry picked from commit 720dae85f2)
2025-05-14 19:04:30 +02:00
Samuel Pitoiset
7605ff03d6 radv: fix SDMA copies for linear 96-bits formats
The hardware requires a power of two bpe. To do that, the driver
needs to adjust the pitch/offset/extent based on a texel scale factor
which only applies to 96-bits formats.
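
A minimal sketch of the texel scale idea, assuming a 96-bit texel (12
bytes) is treated as three 32-bit texels. The struct, its fields, and
adjust_for_96bit() are hypothetical names for illustration and are not
taken from the radv SDMA code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one linear copy, all sizes in texels. */
struct linear_copy_params {
   uint32_t bpe;      /* bytes per element */
   uint32_t pitch;    /* row pitch in texels */
   uint32_t offset_x; /* start offset in texels */
   uint32_t width;    /* extent in texels */
};

static bool
is_pow2(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Assumption: only 12-byte elements need scaling; every other format is
 * expected to already have a power-of-two bpe. */
static struct linear_copy_params
adjust_for_96bit(struct linear_copy_params p)
{
   uint32_t texel_scale = (p.bpe == 12) ? 3 : 1;

   p.bpe /= texel_scale;      /* 12 bytes -> 4 bytes */
   p.pitch *= texel_scale;    /* three times as many (smaller) texels */
   p.offset_x *= texel_scale;
   p.width *= texel_scale;

   assert(is_pow2(p.bpe));
   return p;
}
```

With this sketch, a 12-byte format with a pitch of 100 texels is copied
as a 4-byte format with a pitch of 300 texels.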

This fixes new VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34927>
(cherry picked from commit 4b73d7e817)
2025-05-14 19:04:30 +02:00
Marek Olšák
aaf531dcd0 nir: fix gathering color interp modes in nir_lower_color_inputs
Fixes: 709ebd82 ("amd: expose nir_io_mix_convergent_flat_with_interpolated")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12800

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34942>
(cherry picked from commit a1ee6d6730)
2025-05-14 19:04:30 +02:00
Mike Blumenkrantz
ee0984f840 zink: fix broken comparison for dummy pipe surface sizing
this should create a new surface if the existing one is too small,
not if it is too big
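
The intended comparison can be sketched as follows.
dummy_surface_needs_realloc() and its parameters are hypothetical and
only illustrate the direction of the check, not the actual zink code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: decide whether the cached dummy surface must be
 * recreated for a request of want_w x want_h. */
static bool
dummy_surface_needs_realloc(unsigned have_w, unsigned have_h,
                            unsigned want_w, unsigned want_h)
{
   assert(want_w > 0 && want_h > 0);

   /* Recreate only when the existing surface cannot cover the request.
    * A surface that is larger than needed can be reused as-is. */
   return have_w < want_w || have_h < want_h;
}
```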

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34933>
(cherry picked from commit ef63e3e4d2)
2025-05-14 19:04:30 +02:00
Matthieu Oechslin
4ad08d90e0 r600: Take dual source blending in account when creating target mask with RATs
This is properly checked when filling CB_... registers in
evergreen_emit_image_state(), but not when generating CB_TARGET_MASK.
It would lead to an invalid command stream if a fragment shader
uses SSBO/Image load/store alongside dual source blending.

Acked-by: Patrick Lerda <patrick9876@free.fr>
Fixes: a6b3792843
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/622
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34333>
(cherry picked from commit 4e68e422e0)
2025-05-14 19:04:30 +02:00
Rhys Perry
161e3d942d ac/llvm: correctly set alignment of vector global load/store
For coherent/volatile access, this would be too high for vector access.

Even when we didn't set the alignment, LLVM seemed to assume too high of
an alignment for 8/16-bit vector access.

Fixes generated_tests/cl/vload/vload-char-constant.cl

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>
(cherry picked from commit d0a09b6ff7)
2025-05-14 19:04:30 +02:00
Rhys Perry
3936208f3c ac/llvm: correctly split vector 8/16-bit stores
This assumes that the start of the load is 32-bit aligned.

For example, a vec3 16-bit store with align_offset=2 should split off the
first component, not the last.
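
A rough sketch of the splitting rule from the example above, assuming
align_offset is the byte offset of the store relative to a 4-byte
boundary and is a multiple of the component size. The struct and
function are hypothetical; the real ac/llvm logic may differ:

```c
#include <assert.h>

/* Number of components peeled off the front, kept as dword-aligned
 * body, and left over at the end. */
struct store_split {
   unsigned head;
   unsigned body;
   unsigned tail;
};

static struct store_split
split_small_vector_store(unsigned num_comps, unsigned comp_bytes,
                         unsigned align_offset)
{
   assert(comp_bytes == 1 || comp_bytes == 2);
   assert(align_offset % comp_bytes == 0);

   /* Bytes needed to reach the next dword boundary. */
   unsigned head_bytes = (4 - align_offset % 4) % 4;
   unsigned head = head_bytes / comp_bytes;
   if (head > num_comps)
      head = num_comps;

   /* Whatever does not fill a whole dword at the end is the tail. */
   unsigned rest = num_comps - head;
   unsigned tail = ((rest * comp_bytes) % 4) / comp_bytes;

   struct store_split s = { head, rest - tail, tail };
   return s;
}
```

For a vec3 16-bit store this gives head=1 with align_offset=2 and
tail=1 with align_offset=0, matching the case described above.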

This probably also fixed splitting with 8-bit stores.

Fixes arb_copy_buffer-overlap

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>
(cherry picked from commit c1ecad2b11)
2025-05-14 19:04:30 +02:00
Lars-Ivar Hesselberg Simonsen
7f17f1f03e pan/texture/v10+: Set width/height in the plane descs
We're currently not setting the v10+ width/height in the plane
descriptors. This change ensures we do.

Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34839>
(cherry picked from commit e2aa0b7566)
2025-05-14 19:04:30 +02:00
Lars-Ivar Hesselberg Simonsen
fa1f1d8308 pan/genxml/v10: Add minus1 mod for plane width/height
The width/height fields in the plane descriptors for v10 are missing
their minus(1) modifiers.

This change adds the missing modifiers, which implies also setting
default values to 1 due to how the Two-Plane YUV Overlay interacts with
the plane descriptors.

Fixes: 486c341769 ("panfrost: Add architecture description XML for v10")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34839>
(cherry picked from commit 2542857259)
2025-05-14 19:04:30 +02:00
Lars-Ivar Hesselberg Simonsen
706783e976 pan/texture: Set plane size to slice size
Rather than setting the plane size to the full allocation minus the
current offset, set it to the actual size of the plane.

Fixes: db20152c8a ("panfrost: Handle Valhall texturing")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34839>
(cherry picked from commit 6a9a4b3eef)
2025-05-14 19:04:30 +02:00
Lars-Ivar Hesselberg Simonsen
684d6f0d68 pan/texture: Correctly handle slice stride for MSAA
Currently, we will always be setting the slice stride in the plane
descriptor to the surface stride, as the check for multisampling is true
even for single sampled surfaces.

This change fixes this check.

Fixes: db20152c8a ("panfrost: Handle Valhall texturing")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34839>
(cherry picked from commit cc58e30847)
2025-05-14 19:04:30 +02:00
Marek Olšák
2da50834fe nir/opt_vectorize_io: fix a failure when vectorizing different bit sizes
Fixes: 2514999c9c - nir: add nir_opt_vectorize_io, vectorizing lowered IO
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13085

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34897>
(cherry picked from commit dbef8f1791)
2025-05-14 19:04:30 +02:00
David Rosca
ed0d06d796 frontends/vdpau: Fix creating surfaces with 422 chroma
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13103
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34831>
(cherry picked from commit f8042fa926)
2025-05-14 19:04:29 +02:00
Robert Mader
e4d9320d2d llvmpipe: Fix dmabuf import paths for DRM_FORMAT_YUYV variants
Right now llvmpipe only successfully supports single-plane formats,
limiting the number of supported YCbCr formats to a relatively small
number.

The implicit support for R8G8_R8B8 style subsampled RGB formats
causes the most common ones, YUYV and its variants, to chain up
to lp_build_fetch_subsampled_rgba_aos() when importing (u)dmabufs
with EXT_image_dma_buf_import.
This code path currently has at least the following issues:
1. It doesn't support the YVYU/PIPE_FORMAT_R8B8_R8G8_UNORM and
    VYUY/PIPE_FORMAT_B8R8_G8R8_UNORM, resulting in asserts/crashes.
2. The supported cases, YUYV and UYVY, end up with sub-optimal results
    as they always return BT.601/narrow results, ignoring
    EGL_YUV_COLOR_SPACE_HINT_EXT and EGL_SAMPLE_RANGE_HINT_EXT.

Stopping advertising support for those formats, as well as native support
for PIPE_FORMAT_YUYV and PIPE_FORMAT_UYVY, results in all four variants
taking fallback paths which happen to be much better supported.

An additional effect is that YUYV and UYVY are correctly advertised as
external only.

Cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34775>
(cherry picked from commit 4051d4ef59)
2025-05-14 19:04:29 +02:00
Gurchetan Singh
ecf46edd8a gfxstream: make sure by default descriptor is negative
Otherwise, another valid fd may be closed.
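
For illustration: 0 is a valid file descriptor (normally stdin), so a
zero-initialized descriptor field looks like an owned fd at cleanup
time. The sketch below uses hypothetical names (descriptor_info and its
helpers), not the gfxstream types:

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical holder for an optionally owned file descriptor. */
struct descriptor_info {
   int fd;
};

static void
descriptor_info_init(struct descriptor_info *info)
{
   assert(info);
   info->fd = -1; /* negative means "no descriptor owned" */
}

/* Returns 1 if a descriptor was closed, 0 if there was nothing to close. */
static int
descriptor_info_release(struct descriptor_info *info)
{
   assert(info);
   if (info->fd < 0)
      return 0;

   close(info->fd);
   info->fd = -1;
   return 1;
}
```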

Cc: mesa-stable
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34885>
(cherry picked from commit 03a35024a6)
2025-05-14 19:04:29 +02:00
Samuel Pitoiset
3295247e52 radv: ignore radv_disable_dcc_stores on GFX12
It's not necessary because DCC is completely transparent to the
userspace driver. Also it's causing issues with scanout.

This fixes rendering issues with scanout in Indiana Jones.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12924
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34859>
(cherry picked from commit b7d2cdd2b4)
2025-05-14 19:04:29 +02:00
Lionel Landwerlin
5e0a552fdc vulkan/runtime: fixup assert with link_geom_stages
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34874>
(cherry picked from commit 565ac1ee6a)
2025-05-14 19:04:29 +02:00
Timothy Arceri
fe40642c6b mesa: fix color material tracking
f6c8ca06 changed this code to only set color materials mask when
the VERT_BIT_COLOR0 bit is set instead of when color material
is enabled. But this meant we always skipped over the
STATE_CURRENT_ATTRIB values.

Fixes: f6c8ca06f6 ("mesa: fix material inputs in ffvertex_prog.c")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7122
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34833>
(cherry picked from commit 600892802d)
2025-05-14 19:04:29 +02:00
Sagar Ghuge
1a4f9e6c00 anv: Fix untyped data port cache pipe control dump output
Fixes: 845ab3d627
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34855>
(cherry picked from commit bb61a78911)
2025-05-14 19:04:29 +02:00
Konstantin Seurer
d30b008fb1 radv: Return VK_ERROR_INCOMPATIBLE_DRIVER for unsupported devices
VK_ERROR_INITIALIZATION_FAILED will fail physical device enumeration.
Returning VK_ERROR_INCOMPATIBLE_DRIVER means that the driver can still
be used on supported GPUs when multiple GPUs are installed.

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34783>
(cherry picked from commit 84b9c281fe)
2025-05-14 19:04:29 +02:00
Faith Ekstrand
7826d7c486 nak: Set lower_pack_64_4x16
Otherwise, these can cause infinite loops in optimization because there
aren't _split variants and the optimizer tries to combine and split
things infinitely.

Reviewed-by: Mel Henning <drawoc@darkrefraction.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34849>
(cherry picked from commit efd1cddbe9)
2025-05-14 19:04:29 +02:00
Mel Henning
12de3a247f nak: Check that swizzles are none
wherever we check that src_mod is none.

This commit simply does:
s/src_mod.is_none()/is_unmodified()/
across all of nak except the definition of is_unmodified() itself.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Fixes: bad23ddb48 ("nak: Add F16 and F16v2 sources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34794>
(cherry picked from commit 9d1c38ddf1)
2025-05-14 19:04:29 +02:00
Mel Henning
99db61950d nak: Add Src::is_unmodified() helper
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Fixes: bad23ddb48 ("nak: Add F16 and F16v2 sources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34794>
(cherry picked from commit 6e72f0f81b)
2025-05-14 18:03:34 +02:00
Karol Herbst
dae6142272 iris/xe: take the grids variable_shared_mem into account
This fixes OpenCL local memory kernel arguments.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34548>
(cherry picked from commit 7c78c76181)
2025-05-14 18:03:34 +02:00
Karol Herbst
bffcd04bc5 iris/xe: fix compute shader start address
It needs to apply the offset so it selects the correct SIMD shader.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34548>
(cherry picked from commit fee9230bb5)
2025-05-14 18:03:34 +02:00
Karol Herbst
9290ce4827 iris: parse global bindings for every gen
This fixes OpenCL support on gen 12.5+

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34548>
(cherry picked from commit 57ccfd0502)
2025-05-14 18:03:34 +02:00
Samuel Pitoiset
cedd447a92 radv: fix GPU hangs with image copies for ASTC/ETC2 formats on transfer queue
Emitting compute dispatches on SDMA just hangs. It might be needed
to switch to gang submit for these to work but fixing the GPU hang is
more important for now.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34805>
(cherry picked from commit 0684dc5fa8)
2025-05-14 18:03:34 +02:00
Samuel Pitoiset
231a808f8c radv: disable SINGLE clear codes to workaround a hw bug with DCC on GFX11
This fixes a very weird cache-related corruption with DCC on GFX11 due
to a hw bug according to PAL.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12932
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34790>
(cherry picked from commit 1356d20042)
2025-05-14 18:03:33 +02:00
Samuel Pitoiset
1e5005dfbe radv: do not clear unwritten color attachments with dual-source blending
This is incorrect because the color format at slot 0 needs to be
replicated to the slot 1. But with dual-source blending the colors
written mask is only 0xf and this was clearing the color format at
slot 1.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13082
Fixes: e1483d022b ("radv: clear unwritten color attachments for monolithic PS earlier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34773>
(cherry picked from commit 55ad0fd35c)
2025-05-14 18:03:33 +02:00
Timothy Arceri
0d0cdff6bf util/driconf: add force_gl_depth_component_type_int workaround
This allows us to force mesa to use GL_UNSIGNED_INT rather than
GL_UNSIGNED_SHORT when choosing the texture format for
GL_DEPTH_COMPONENT. The increased depth precision allows us to
match the Nvidia/AMD closed drivers' default behaviour.

Here we also enable the workaround for the remastered Tomb Raider
games.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13032
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34752>
(cherry picked from commit e0a111540f)
2025-05-14 18:03:33 +02:00
Rhys Perry
779d8cb803 aco: swap the correct v_mov_b32 if there are two of them
Previously, this function tried to swap the instruction which is not
v_mov_b32, so that it doesn't introduce any new OPY-only instructions. If
both were v_mov_b32, it swapped Y. Since this makes Y opy-only, this can't
be done if X is also opy-only.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 408fa33c09 ("aco/gfx12: don't use second VALU for VOPD's OPX if there is a WaR")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13101
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34841>
(cherry picked from commit 9ca71b52aa)
2025-05-14 18:03:33 +02:00
José Roberto de Souza
78d8ff8663 intel/tools: Fix batch buffer decoder
intel_decoder_init() initializes intel_batch_decode_ctx so that the
decode functions can be called later, but the context depends on data
stored in brw/elk_isa_info. That struct was allocated on the stack of
intel_decoder_init(), so by the time the decode functions ran they
were reading garbage from the brw/elk_isa_info memory.
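
One way to avoid this class of lifetime bug is to keep the ISA info in
storage that lives as long as the decode context. The sketch below uses
stand-in types and names; it is an assumption about the general shape
of such a fix, not the actual intel/tools change:

```c
#include <assert.h>

/* Stand-ins for brw/elk_isa_info and intel_batch_decode_ctx. */
struct isa_info {
   int ver;
};

struct decode_ctx {
   const struct isa_info *isa; /* borrowed pointer, must stay valid */
};

/* Hypothetical owner object: the ISA info lives next to the context
 * instead of in a local variable of the init function. */
struct decoder {
   struct isa_info isa;
   struct decode_ctx ctx;
};

static void
decoder_init(struct decoder *d, int ver)
{
   assert(d);
   d->isa.ver = ver;     /* storage lives as long as *d */
   d->ctx.isa = &d->isa; /* still valid after this function returns */
}

static int
decoder_isa_ver(const struct decoder *d)
{
   return d->ctx.isa->ver;
}
```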

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ec2d20a70d ("intel/tools: Add helpers for decoder_init/disasm")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34776>
(cherry picked from commit 3e5a735d01)
2025-05-14 18:03:33 +02:00
Lionel Landwerlin
160d1eb614 intel: fix null render target setup logic
Our current render target cache setting is to key on the binding table
index, meaning the HW associates a number in the range [0, 7] to a
RENDER_SURFACE_STATE description. If you want to change the render target
0 between 2 draw calls, you need to insert a PIPE_CONTROL in between
the 2 draw calls with pb-stall + rt-flush in order to flush any writes
to a previous RENDER_SURFACE_STATE that has now become disassociated
from the [0, 7] number.

This PIPE_CONTROL taking care of the flush is dealt with in
cmd_buffer_maybe_flush_rt_writes(). This function diffs the current
BTI setup for render targets (first 0 to 7 BTIs) with what the next
fragment shader wants.

The issue here is we might have a render pass with 0 color attachments
and yet in 98cdb9349a we added one pointing to the render target 0,
but in the emit_binding_table() when we finally program the BTI, we
check the render pass color count and program a null surface state
instead of an actual surface state. And this leads to hangs because
the render target cache will end up with inconsistent state data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 98cdb9349a ("anv: ensure null-rt bit in compiler isn't used when there is ds attachment")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12955
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34603>
(cherry picked from commit 63f633557f)
2025-05-14 18:03:33 +02:00
Lionel Landwerlin
b029a79b83 anv: force fragment shader execution when occlusion queries are active
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34732>
(cherry picked from commit f7bc22e0d7)
2025-05-14 18:03:33 +02:00
Paul Gofman
ac96c18f3a radv/amdgpu: Fix hash key in radv_amdgpu_winsys_destroy().
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34774>
(cherry picked from commit 96765935e8)
2025-05-14 18:03:33 +02:00
Karol Herbst
493e844793 r600: fix r600_buffer_from_user_memory for rusticl
Not entirely sure if it's actually required, but this makes it consistent
with r600_resource_create also calling r600_compute_global_buffer_create
for global memory buffers.

Cc: mesa-stable
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Patrick Lerda <patrick9876@free.fr>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34623>
(cherry picked from commit f6e3c967d9)
2025-05-14 18:03:33 +02:00
Mike Blumenkrantz
d67267695a egl: fix sw fallback rejection in non-sw EGL_PLATFORM=device
previously progress could still be made during sw fallback here,
which would lead to unpredictable results with driver loading e.g., crashing

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34609>
(cherry picked from commit 8a339cdebc)
2025-05-14 18:03:33 +02:00
Connor Abbott
69bf3fd90d ir3: Take LB restriction on constlen into account on a7xx
On a7xx, the max constlen for compute is increased to 512 vec4s or 8KB,
however the size of the LB was not increased beyond 40KB. A quick
calculation shows that 8KB of consts multiplied by 2 banks plus the
API maximum of 32KB shared memory would exceed 40KB. This means that
we can't always use a constlen of 512, and sometimes have to fall back
to 256 when a lot of shared memory is in use.
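
The quick calculation can be written out as a small sketch. The
function name is hypothetical, and whether the real limit check uses
"<=" or "<" is an assumption:

```c
#include <assert.h>

/* Pick the largest constlen (in vec4s) that fits in the LB together
 * with the shader's shared memory. Sizes are in bytes. */
static unsigned
max_compute_constlen(unsigned lb_size, unsigned shared_size)
{
   const unsigned vec4_bytes = 16; /* 4 x 32-bit */
   const unsigned banks = 2;       /* consts are counted twice */
   const unsigned full = 512;      /* 512 vec4s = 8KB */
   const unsigned fallback = 256;  /* 256 vec4s = 4KB */

   assert(shared_size <= lb_size);

   if (full * vec4_bytes * banks + shared_size <= lb_size)
      return full;
   return fallback;
}
```

With a 40KB LB and 32KB of shared memory, 8KB * 2 + 32KB = 48KB does
not fit, so the sketch falls back to 256 (4KB * 2 + 32KB = 40KB).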

In the future, we can use similar calculations to figure out how much
"extra" shared memory is available for the backend to spill to, but we
currently don't support spilling to shared memory.

Fixes: 5879eaac18 ("ir3: Increase compute const size on a7xx")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34746>
(cherry picked from commit ea9d694a7b)
2025-05-14 18:03:33 +02:00
Connor Abbott
6a2668c16d freedreno/a6xx, turnip: Set CONSTANTRAMMODE correctly
This should fix hangs when using more than 256 constants on a7xx.

Fixes: 5879eaac18 ("ir3: Increase compute const size on a7xx")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34746>
(cherry picked from commit 80bcbc0e92)
2025-05-14 18:03:32 +02:00
Connor Abbott
f6303aac40 freedreno/a6xx: Define CONSTANTRAMMODE
While we're here, give SP_CS_UNKNOWN_A9B1 a better name.

Fixes: 5879eaac18 ("ir3: Increase compute const size on a7xx")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34746>
(cherry picked from commit 57986ae5ec)
2025-05-14 18:03:32 +02:00
Connor Abbott
162f668ab1 freedreno: Add compute_lb_size device info
This is really a guess except for a6xx and later, however it shouldn't
change behavior from before.

Fixes: 5879eaac18 ("ir3: Increase compute const size on a7xx")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34746>
(cherry picked from commit 156ab5839d)
2025-05-14 18:03:32 +02:00
Eric Engestrom
cbc3b69d17 .pick_status.json: Mark eeffb4e674 as denominated 2025-05-14 18:03:32 +02:00
Karmjit Mahil
13d50cfb54 tu: Fix segfault in fail_submit KGSL path
Fixes: ec268fa5b6 ("tu/kgsl: Support u_trace and perfetto")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34749>
(cherry picked from commit 9dfd4a091c)
2025-05-14 18:03:32 +02:00
Mel Henning
9a7b130479 nak: Remove hfma2 src 1 modifiers
This fixes a compilation issue in Marvel Rivals where the legalization
logic and the encoding logic don't line up, which results in an
assertion failure on this instruction:

    r17 = hfma2 r17.xx -r18.xx 0x3c003c00

The fix here is a little overly restrictive because it turns out we
actually do have modifiers for all 3 sources. Those modifiers will
be added in later commits.

Fixes: 567cae69c3 ("nak: Add 16-bits float operations")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34750>
(cherry picked from commit 1ff7135691)
2025-05-14 18:03:32 +02:00
Sagar Ghuge
fb6363315e intel/compiler: Fix stackIDs on Xe2+
For Xe2+, from Bspec 64643, bit field "StackID": The maximum number of
StackIDs can be 2^12 - 1.

Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34709>
(cherry picked from commit 821c1bfa7e)
2025-05-14 18:03:32 +02:00
Eric Engestrom
73481cb360 .pick_status.json: Update to e7a7d9ea2e 2025-05-14 18:03:21 +02:00
Eric Engestrom
5db4facf10 docs: add sha sum for 25.0.5 2025-04-30 19:31:46 +02:00
Eric Engestrom
a89e404408 VERSION: bump for 25.0.5 2025-04-30 19:18:00 +02:00
Eric Engestrom
aee40e4271 docs: add release notes for 25.0.5 2025-04-30 19:18:00 +02:00
Samuel Pitoiset
ef9fd1c9be radv: set radv_disable_dcc=true for WWE 2k23
This game is no longer available on Steam, so it's more annoying to
reproduce the issue.

Let's disable DCC for that game to work around rendering issues which
are likely game bugs.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10850
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34607>
(cherry picked from commit 5841d44f91)
2025-04-30 14:19:01 +02:00
Samuel Pitoiset
1e401d3d25 radv: fix re-emitting VRS state when rendering begins
This state also depends on whether a VRS attachment is used.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11693
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34735>
(cherry picked from commit 1fccc09abe)
2025-04-29 16:38:19 +02:00
Samuel Pitoiset
5c21544dd8 radv: only enable DCC for invisible VRAM on GFX12
DCC should only be allowed on invisible VRAM, otherwise the CPU could
read the data and it will read garbage if it's compressed.

This also caused GPU hangs after suspend/resume probably because
some buffers were compressed when moved back from GTT to VRAM.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12962
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12922
Fixes: 9af11bf306 ("radv: add initial DCC support on GFX12")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34347>
(cherry picked from commit 410f7f9f6e)
2025-04-29 15:39:10 +02:00
David Rosca
c1ad3d28c0 radv: Use radv_format_to_pipe_format instead of vk_format_to_pipe_format
Fixes: 9af11bf306 ("radv: add initial DCC support on GFX12")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34274>
(cherry picked from commit 3ef0ee2241)
2025-04-29 15:39:06 +02:00
Eric Engestrom
1032290a4e .pick_status.json: Update to 5a55133ce7 2025-04-29 15:39:03 +02:00
Loïc Minier
32c9fc1f45 freedreno: check if GPU supported in fd_pipe_new2
fd_pipe_new2 can segfault when trying to set the is_64bit flag on new
pipes. This can happen when the current GPU is not listed in the
fd_dev_recs table because it's not supported by mesa, but is supported by
the kernel.

Add a helper function to test if the current GPU is in the supported table,
and use it in fd_pipe_new2.

Signed-off-by: Loïc Minier <loic.minier@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33830>
(cherry picked from commit c36cd32345)
2025-04-29 14:13:41 +02:00
Mary Guillemard
390e265643 panvk: Take resource index in valhall_lower_get_ssbo_size
Previously we were not extracting the resource index from the resource
handle.

This fixes failures with PanVK+ANGLE on "dEQP-GLES31.functional.ssbo.array_length.unsized_*".

Fixes: e4613f8b23 ("panvk: Lower get_ssbo_size() on Valhall")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34697>
(cherry picked from commit 845611bb43)
2025-04-27 19:55:46 +02:00
John Anthony
bca113e890 panvk: Enable VK_EXT_direct_mode_display
Panvk already enables VK_EXT_acquire_xlib_display, but not
VK_EXT_direct_mode_display which is a dependency. This causes a failure
in dEQP-VK.info.instance_extensions.

Fixes: 8c2bfa279d ("panvk: support x11 wsi")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34672>
(cherry picked from commit 8dd578e2a4)
2025-04-27 19:55:46 +02:00
Mary Guillemard
727777ad84 panvk: Take rasterization sample into account in indirect draw on v10+
This was an oversight when implementing indirect draw.

Fixes: 1f3b8bb918 ("panvk: Add support for Draw[Indexed]Indirect")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34674>
(cherry picked from commit c7f2bc6bed)
2025-04-27 19:55:46 +02:00
Olivia Lee
da841c11d4 panfrost: allow promoting sysval UBO to push constants
We already had a path for sysvals in panfrost_emit_const_buf, but it was
unused because we only allowed pushing the default UBO 0. Improves
glmark2 score on G610 from 3051 to 3071, but mostly we need it as a
prerequisite for dynamic blend constants.

Signed-off-by: Olivia Lee <benjamin.lee@collabora.com>
Fixes: 59a3e12039 ("panfrost: do not push "true" UBOs")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34664>
(cherry picked from commit e93261f579)
2025-04-27 19:55:46 +02:00
Rhys Perry
4cd09c4ebd aco/gfx11: create waitcnt for workgroup vmem barriers
It seems this is necessary on GFX11.

Similar to 576a2e798c

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 25.0
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34634>
(cherry picked from commit b03e071583)
2025-04-27 11:54:18 +02:00
Lionel Landwerlin
c4587d2c61 anv: use companion batch for operations with HIZ/STC_CCS destination
We're currently crashing in a couple of tests:
   dEQP-VK.pipeline.monolithic.depth.xfer_queue_layout.*

   deqp-vk: ../src/intel/blorp/blorp_blit.c:2935:
     blorp_copy: Assertion `blorp_copy_supports_blitter(batch->blorp, src_surf->surf, dst_surf->surf, src_surf->aux_usage, dst_surf->aux_usage)' failed.

Tested on:
  dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.*
  dEQP-VK.api.copy_and_blit.multiplanar_xfer.*
  dEQP-VK.pipeline.monolithic.depth.xfer_queue_layout.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 31eeb72e45 ("blorp: Add support for blorp_copy via XY_BLOCK_COPY_BLT")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34023>
(cherry picked from commit e60416b4e4)
2025-04-27 11:54:17 +02:00
Mel Henning
8f0c4ec91d wsi/headless: Override finish_create
Since headless overrides create_mem, it needs to override finish_create
too. Fixes a segfault in nvk that was caused by us mixing
wsi_create_null_image_mem with wsi_finish_create_blit_context, which
would then call CmdCopyImageToBuffer with image->blit.buffer == NULL.

Fixes a cts failure on nvk in:
dEQP-VK.image.swapchain_mutable.headless.2d.r8g8b8a8_unorm_b8g8r8a8_unorm_clear_copy_format_list
and several others

Fixes: 579578f10a ("vulkan/wsi/drm: Break create_prime_image in pieces")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34646>
(cherry picked from commit 60452e016e)
2025-04-27 11:54:14 +02:00
Karol Herbst
88d7ecb68b nir_lower_mem_access_bit_sizes: fix negative chunk offsets
With a 64 bit pointer model, instead of doing -1 the pass ended up doing
+4294967295. The reason here was some implicit integer conversion going
horribly wrong, so just do the offset math in 64 bit to get a nice result.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13023
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34669>
(cherry picked from commit 33965bb21b)
2025-04-27 11:54:13 +02:00
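The conversion problem described in 88d7ecb68b can be reproduced in isolation. The following is a minimal sketch, not the code from the pass: the function and parameter names are hypothetical, and it only assumes that the offset delta was computed in 32-bit unsigned arithmetic before being added to a 64-bit address.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy shape (hypothetical names): the delta is computed in 32-bit
 * unsigned arithmetic, so -1 wraps to 0xFFFFFFFF and is then
 * zero-extended when added to the 64-bit base. */
static uint64_t add_offset_32bit_math(uint64_t base, uint32_t chunk_start,
                                      uint32_t aligned_start)
{
   uint32_t delta = aligned_start - chunk_start; /* wraps if aligned_start < chunk_start */
   return base + delta;                          /* adds 4294967295 instead of -1 */
}

/* Fixed shape: do the subtraction in signed 64-bit arithmetic. */
static uint64_t add_offset_64bit_math(uint64_t base, uint32_t chunk_start,
                                      uint32_t aligned_start)
{
   int64_t delta = (int64_t)aligned_start - (int64_t)chunk_start;
   return base + (uint64_t)delta; /* modular arithmetic yields base - 1 */
}

/* Small self-check showing both behaviours for a delta of -1. */
static void offset_demo(void)
{
   assert(add_offset_32bit_math(1000, 4, 3) == UINT64_C(4294968295));
   assert(add_offset_64bit_math(1000, 4, 3) == 999);
}
```

Calling `offset_demo()` exercises both variants: the 32-bit version lands 4294967295 bytes past the base, the 64-bit version lands one byte before it.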
Yinjie Yao
d6399f0f0e frontends/va: Handle properly when decoding more slices than limit
For h264/h265/av1/vp9, warn when the application sends more slices
than the limit allows, and stop copying the remaining slices to avoid
unwanted behaviour.

Cc: mesa-stable
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34633>
(cherry picked from commit eecfb02463)
2025-04-27 11:54:08 +02:00
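The behaviour described in d6399f0f0e (warn once the limit is reached, then stop copying) might look roughly like the sketch below. Everything here is hypothetical: `TOY_MAX_SLICES`, `struct toy_picture` and `toy_copy_slices` are illustrative names, and the limit of 4 is not a real codec value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_MAX_SLICES 4 /* hypothetical limit, not a real codec value */

struct toy_picture {
   uint32_t slice_offsets[TOY_MAX_SLICES];
   uint32_t slice_count;
};

/* Copies as many slice offsets as still fit and returns how many were
 * copied. Extra slices trigger one warning and are dropped. */
static uint32_t toy_copy_slices(struct toy_picture *pic,
                                const uint32_t *offsets, uint32_t num)
{
   uint32_t copied = 0;

   assert(pic && (offsets || num == 0));

   for (uint32_t i = 0; i < num; i++) {
      if (pic->slice_count >= TOY_MAX_SLICES) {
         fprintf(stderr, "warning: more than %d slices, ignoring the rest\n",
                 TOY_MAX_SLICES);
         break;
      }
      pic->slice_offsets[pic->slice_count++] = offsets[i];
      copied++;
   }
   return copied;
}
```

Checking the count before every write keeps the array access in bounds no matter how many slices the application submits.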
Ella Stanforth
a46b01a8c9 v3d/compiler: Fixup output types for all 8 outputs
Cc: mesa-stable
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33942>
(cherry picked from commit 1ec0cdb733)
2025-04-27 11:54:04 +02:00
Dmitry Baryshkov
a2d71040f3 meson: disable SIMD blake optimisations on x32 host
On X.org startup libgallium crashes on x32 hosts inside
blake3_hash_many_sse41(), most likely because of the different pointer
size. Disable SIMD blake implementation if x32 is detected.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34453>
(cherry picked from commit b9c6afd3a7)
2025-04-27 11:48:42 +02:00
Eric Engestrom
40cd43d497 .pick_status.json: Update to 3493500abb 2025-04-27 11:48:41 +02:00
José Roberto de Souza
96a3c83d60 intel: Fix the MOCS values in XY_BLOCK_COPY_BLT for Xe2+
One more instruction where the MOCS value was split into two
registers.

Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34592>
(cherry picked from commit fcb6dfb29c)
2025-04-23 12:20:58 +02:00
José Roberto de Souza
f4dd46901a intel: Fix the MOCS values in XY_FAST_COLOR_BLT for Xe2+
Xe2 changed the MOCS field in a few instructions: they now have one
field for the MOCS index and another for the encryption enable bit, but
ISL returns the combination of both, aka MEMORY_OBJECT_CONTROL_STATE.

To minimize changes I have added 2 macros to extract the values
from the value returned by isl.

Of all the instructions changed, Mesa only makes use of two, so the
other instruction will be handled in the next patch.

Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34592>
(cherry picked from commit 161c412a82)
2025-04-23 12:20:58 +02:00
José Roberto de Souza
094e157daa intel: Program XY_FAST_COLOR_BLT::Destination Mocs for gfx12
The copy engine is not used on gfx12 platforms in ANV, but it can be
used in Iris.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34560>
(cherry picked from commit a96e280dfe)
2025-04-23 12:20:58 +02:00
Pierre-Eric Pelloux-Prayer
747a79b13f radeonsi: fix potential use after free in si_set_debug_callback
si_destroy_context needs to call context->set_debug_callback(...) to
prevent the debug logs from accessing the destroyed context.

Adding this change introduced a different problem: when an aux context
is destroyed from si_destroy_screen, parts of the screen have been
freed already: the shader_compiler_queue_*.

c467a87e06 ("radeonsi: Destroy queues before the aux contexts") moved
the util_queue_destroy calls above the context destruction, but with
the 59a3f38ff6 change, it's not needed anymore: si_destroy_context
will finish the screen shader queues before proceeding with releasing,
so use-after-free isn't possible.

Fixes: 59a3f38ff6 ("radeonsi: clear the debug callback on ctx destroy")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12035
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34574>
(cherry picked from commit 2a381bbc3c)
2025-04-23 12:20:58 +02:00
Karol Herbst
80ce5dcae8 rusticl/device: fix panic when disabling 3D image write support
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12985
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34649>
(cherry picked from commit 6f080ac532)
2025-04-23 11:26:50 +02:00
Danylo Piliaiev
42bd3b7907 tu,freedreno: Don't fallback to LINEAR with DRM_FORMAT_MOD_QCOM_COMPRESSED
DRM_FORMAT_MOD_QCOM_COMPRESSED forces the image to be UBWC regardless
of what's better for perf, we should respect that.

The regression is seen in GTK4 when it tries to create tiny swapchain
images.

Fixes: fc50fb35b0 ("tu,freedreno: Enable linear mipmap tail for UBWC images")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34628>
(cherry picked from commit 36f22cc951)
2025-04-23 11:26:44 +02:00
Connor Abbott
53480a2fa1 tu: Fix flushing when using a staging buffer for copies
When doing the flushing, I forgot that because the staging buffer can be
used with different formats with different cpp, we need to make sure
that CCU is properly flushed and invalidated between each copy to the
staging buffer to prevent stale cache entries from creeping in, as the
CCU seems to rely on the cpp staying the same, even on a7xx which
dropped some of the other restrictions like using the same RT
index/layer. For "normal" user-visible copies this is done via
transitioning from UNDEFINED.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34611>
(cherry picked from commit ee10938bee)
2025-04-23 11:26:42 +02:00
Mary Guillemard
bc2bf6c1a7 panvk: reset dyn_bufs map count to 0 in create_copy_table
We were forgetting to reset the map count to 0 in case of dyn_bufs in
create_copy_table.

This was causing invalid copy entries to be added to the table, causing
invalid copies in most situations with holes in the set definition while
still binding set 0, or at worst an assert to be triggered in
cmd_fill_dyn_bufs.

This fixes "dEQP-GLES3.functional.ubo.*" and
"dEQP-GLES31.functional.ubo.*" on PanVK+ANGLE.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: e350c334b6 ("panvk: Extend the descriptor lowering pass to support Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34652>
(cherry picked from commit 8d2e16cc11)
2025-04-23 11:26:41 +02:00
Georg Lehmann
b623c683fb aco: set opsel_hi to 1 for WMMA
This is ignored by the hardware but LLVM requires it to disassemble GFX12 WMMA.

Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>
(cherry picked from commit b0c8f31600)
2025-04-23 11:26:40 +02:00
Eric Engestrom
e4f1590662 pick-ui: add missing dependency
Somehow I forgot to commit this line 🤦

Fixes: c37a468a8a ("pick-ui: make `Backport-to: 25.0` backport to 25.0 *and more recent release branches*")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34657>
(cherry picked from commit 0a41200f82)
2025-04-23 11:25:29 +02:00
Eric Engestrom
638bf84131 .pick_status.json: Update to 091d52965f 2025-04-23 11:16:08 +02:00
Janne Grunau
b734cf734e venus: virtgpu: Require stable wire format
When VMMs do not support VIRTGPU_DRM_CAPSET_VENUS the capset data
remains zeroed. By requiring the stable wire_format_version 1 this can
be detected early without initialising the renderer.

Avoids triggering `assert(capset->supports_blob_id_0);` in debug builds
under such circumstances.

Cc: mesa-stable
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34613>
(cherry picked from commit 3d3ca9b65e)
2025-04-22 19:42:38 +02:00
Yiwei Zhang
a4ae9c2143 venus: fix missing renderer destructions
With failed compatibility check, the created renderer must be destroyed
within vn_instance_init_renderer.

Cc: mesa-stable
Fixes: 25b8f4f714 ("venus: handle device probing properly.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34613>
(cherry picked from commit 2a4675ee9f)
2025-04-22 19:42:37 +02:00
Janne Grunau
fbe61933a1 venus: Do not use instance pointer before NULL check
Fixes: a753f50668 ("venus: break up vn_device.c")
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34613>
(cherry picked from commit 39e4fd98ce)
2025-04-22 19:42:36 +02:00
Tapani Pälli
e84938a428 iris: make sure to not mix compressed vs non-compressed
This commit implements the following requirement:

   "Keep any UMD-recycling of compression-enabled/disabled
    memory separate."

As additional info, there are 2 related workarounds for the issue:

   Wa_14018443005
   Wa_18038669374

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34499>
(cherry picked from commit 6d70ec449f)
2025-04-22 19:40:49 +02:00
Tapani Pälli
940c2cbbb6 iris: force reallocate on eglCreateImage with GFX >= 20
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34499>
(cherry picked from commit c2a4657862)
2025-04-22 19:40:48 +02:00
Ian Romanick
8e3cae7c78 elk/algebraic: Don't optimize float SEL.CMOD to MOV
Floating point SEL.CMOD may flush denorms to zero. We don't have enough
information at this point in compilation to know whether or not it is
safe to remove that.

Integer SEL or SEL without a conditional modifier is just a fancy
MOV. Those are always safe to eliminate.

See also 3f782cdd25.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
(cherry picked from commit e783930b10)
2025-04-22 19:39:44 +02:00
Ian Romanick
9af068c5e0 elk/algebraic: Clear condition modifier on optimized SEL instruction
The condition modifier on SEL means something completely different than
it means on MOV.  On MOV it means to modify the flags based on the value
written to the destination. On SEL it means to compare the sources using
that mode and pick the result (i.e., as min() or max()) without
modifying the flags.

The resulting MOV should not have a condition modifier for the same
reason it (already) doesn't have a predicate. This bug was found by
inspection, so I added a unit test.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
(cherry picked from commit f4ede9c10a)
2025-04-22 19:39:22 +02:00
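Taken together, the two elk/algebraic changes above (8e3cae7c78 and 9af068c5e0) amount to a rule that can be sketched as follows. This is a simplified model and not the compiler's code: the `toy_*` types and fields are hypothetical stand-ins for the real instruction representation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the compiler's instruction fields. */
enum toy_opcode { TOY_SEL, TOY_MOV };
enum toy_cmod { TOY_CMOD_NONE, TOY_CMOD_GE, TOY_CMOD_L };

struct toy_inst {
   enum toy_opcode opcode;
   enum toy_cmod cmod;
   bool predicated;
   bool is_float;  /* execution type is floating point */
   int src0, src1; /* register numbers, enough to compare the sources */
};

/* Rewrites "SEL x, x" to "MOV x" when that is safe.
 * Returns true if the instruction was changed. */
static bool toy_opt_sel_same_sources(struct toy_inst *inst)
{
   assert(inst);

   if (inst->opcode != TOY_SEL || inst->src0 != inst->src1)
      return false;

   /* A float SEL.CMOD may flush denorms to zero, so it is not a plain copy. */
   if (inst->is_float && inst->cmod != TOY_CMOD_NONE)
      return false;

   inst->opcode = TOY_MOV;
   inst->predicated = false;
   /* On MOV a condition modifier would write the flags, so drop it. */
   inst->cmod = TOY_CMOD_NONE;
   return true;
}
```

An integer `SEL.GE` with identical sources becomes an unpredicated `MOV` with no condition modifier, while a float `SEL.GE` is left untouched.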
Ian Romanick
ce96dcf1a6 brw/algebraic: Don't optimize float SEL.CMOD to MOV
Floating point SEL.CMOD may flush denorms to zero. We don't have enough
information at this point in compilation to know whether or not it is
safe to remove that.

Integer SEL or SEL without a conditional modifier is just a fancy
MOV. Those are always safe to eliminate.

See also 3f782cdd25.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")

No shader-db changes on any Intel platform.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209903490 -> 209903492 (+0.00%)
Cycle count: 30546025224 -> 30546021980 (-0.00%); split: -0.00%, +0.00%
Max live registers: 65516231 -> 65516235 (+0.00%)

Totals from 2 (0.00% of 706657) affected shaders:
Instrs: 3197 -> 3199 (+0.06%)
Cycle count: 361650 -> 358406 (-0.90%); split: -10.05%, +9.15%
Max live registers: 300 -> 304 (+1.33%)

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
(cherry picked from commit 6a19d8915f)
2025-04-22 19:39:21 +02:00
Ian Romanick
055cbf9836 brw/algebraic: Clear condition modifier on optimized SEL instruction
The condition modifier on SEL means something completely different than
it means on MOV.  On MOV it means to modify the flags based on the value
written to the destination. On SEL it means to compare the sources using
that mode and pick the result (i.e., as min() or max()) without
modifying the flags.

The resulting MOV should not have a condition modifier for the same
reason it (already) doesn't have a predicate. This bug was found by
inspection, so I added a unit test.

No shader-db or fossil-db changes on any Intel platform.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
(cherry picked from commit 07dc1d4043)
2025-04-22 19:38:28 +02:00
Mel Henning
006af589ee nvk: Override render enable for blits and resolves
Fixes cts tests:

dEQP-VK.conditional_rendering.conditional_ignore.blit_image
dEQP-VK.conditional_rendering.conditional_ignore.blit_image_inverted
dEQP-VK.conditional_rendering.conditional_ignore.resolve_image
dEQP-VK.conditional_rendering.conditional_ignore.resolve_image_inverted

which were introduced in vk-gl-cts commit 4aa277c300

Fixes: 32f2317223 ("nvk: Use meta for doing blits with the 3D hardware")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34644>
(cherry picked from commit 2fc4c98aaf)
2025-04-22 19:37:33 +02:00
Mel Henning
af61891fed nvk: SET_STATISTICS_COUNTER at start of meta_begin
Ideally, begin/end should be roughly symmetric - the initialization
order should be the reverse of the teardown order.

Fixes: 6f85e6b06b ("nvk: Disable statistics around meta ops")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34644>
(cherry picked from commit 52085f2a0e)
2025-04-22 19:37:32 +02:00
Faith Ekstrand
5f36e5961e nak/sm70: Fix the bit74_75_ar_mod assert
It's used for src2, not src0.

Fixes: 40422927dc ("nak: Pass has_mod to all form of src2 requiring it")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33107>
(cherry picked from commit 47fc468944)
2025-04-22 19:37:11 +02:00
Faith Ekstrand
61b44913f5 nak/legalize: Take a RegFile in copy_alu_src_and_lower_fmod
Otherwise, we'll screw up uniform GPRs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33107>
(cherry picked from commit 22a30bfa4f)
2025-04-22 19:36:01 +02:00
Tomeu Vizoso
70ad887eda etnaviv: Release screen->dummy_desc_reloc.bo
We are currently trying to release the same dummy BO twice, while
leaking the other one.

Fixes: bca5ef70a4 ("etnaviv: split dummy RT backing store from reloc")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34627>
(cherry picked from commit 63251d43ae)
2025-04-22 18:47:28 +02:00
Georg Lehmann
e6134c388d nir/opt_algebraic: disable fsat(a + 1.0) opt if a can be NaN
Foz-DB Navi21:
Totals from 9 (0.01% of 79789) affected shaders:
Instrs: 6782 -> 6796 (+0.21%); split: -0.03%, +0.24%
CodeSize: 40020 -> 40108 (+0.22%); split: -0.04%, +0.26%
Latency: 23764 -> 23758 (-0.03%)
InvThroughput: 6424 -> 6431 (+0.11%); split: -0.08%, +0.19%
SClause: 273 -> 275 (+0.73%)
Copies: 338 -> 339 (+0.30%)
VALU: 5138 -> 5147 (+0.18%); split: -0.06%, +0.23%
SALU: 349 -> 350 (+0.29%)
SMEM: 498 -> 500 (+0.40%)

Fixes: a4a3487aae ("nir/opt_algebraic: optimize patterns from Skia")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>
(cherry picked from commit 3e26fc4498)
2025-04-22 18:47:27 +02:00
Yinjie Yao
c72a9e2795 gallium/pipe: Increase hevc max slice to 600
According to the spec, increase max supported slices of hevc to 600.

Cc: mesa-stable
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34632>
(cherry picked from commit 2b5ca87927)
2025-04-22 18:47:26 +02:00
Eric Engestrom
cdd4f62e89 aco: help clang 20 do some additions and subtractions
clang 20 complains:

    ../src/amd/compiler/aco_assembler.cpp:837:28: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
      837 |       vaddr[num_vaddr + i] = reg(ctx, instr->operands.back(), 8) + i + 1;
          |       ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ../src/amd/compiler/aco_assembler.cpp:832:12: note: at offset 5 into destination object ‘vaddr’ of size 5
      832 |    uint8_t vaddr[5] = {0, 0, 0, 0, 0};
          |            ^~~~~
    ../src/amd/compiler/aco_assembler.cpp:837:28: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
      837 |       vaddr[num_vaddr + i] = reg(ctx, instr->operands.back(), 8) + i + 1;
          |       ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ../src/amd/compiler/aco_assembler.cpp:832:12: note: at offset 6 into destination object ‘vaddr’ of size 5
      832 |    uint8_t vaddr[5] = {0, 0, 0, 0, 0};
          |            ^~~~~
    ../src/amd/compiler/aco_assembler.cpp:837:28: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
      837 |       vaddr[num_vaddr + i] = reg(ctx, instr->operands.back(), 8) + i + 1;
          |       ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ../src/amd/compiler/aco_assembler.cpp:832:12: note: at offset 7 into destination object ‘vaddr’ of size 5
      832 |    uint8_t vaddr[5] = {0, 0, 0, 0, 0};
          |            ^~~~~

But `i < MIN2(instr->operands.back().size() - 1, 5 - num_vaddr)` means `i` is
at most `5 - num_vaddr - 1`, which means `vaddr[num_vaddr + i]` =>
`vaddr[num_vaddr + 5 - num_vaddr - 1]` => `vaddr[5 - 1]` => `vaddr[4]` which
is within the valid indices.

For some reason, using signed `int` instead allows clang to figure this
out, so let's do that since we don't need the extra range.

While at it, use ARRAY_SIZE(vaddr) instead of hard-coding the same `5`
in several places.

Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34625>
(cherry picked from commit 2bcb55f3f6)
2025-04-22 18:47:18 +02:00
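The bounds argument in cdd4f62e89 can be checked with a small stand-alone model of the loop. This is a sketch with hypothetical names (`fill_vaddr_tail` and its parameters); `ARRAY_SIZE` and `MIN2` are defined locally here, with `ARRAY_SIZE` cast to `int` so that all of the index math stays signed.

```c
#include <assert.h>

/* Local definitions for the sketch; the cast keeps the arithmetic signed. */
#define ARRAY_SIZE(a) ((int)(sizeof(a) / sizeof((a)[0])))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Fills the vaddr slots that follow the first num_vaddr entries.
 * Returns the highest index written, or -1 if nothing was written. */
static int fill_vaddr_tail(int num_vaddr, int last_operand_size, int first_reg)
{
   unsigned char vaddr[5] = {0, 0, 0, 0, 0};
   int highest = -1;

   assert(num_vaddr >= 0 && num_vaddr <= ARRAY_SIZE(vaddr));

   /* i < ARRAY_SIZE(vaddr) - num_vaddr implies
    * num_vaddr + i <= ARRAY_SIZE(vaddr) - 1, so every write is in bounds. */
   for (int i = 0;
        i < MIN2(last_operand_size - 1, ARRAY_SIZE(vaddr) - num_vaddr); i++) {
      vaddr[num_vaddr + i] = (unsigned char)(first_reg + i + 1);
      highest = num_vaddr + i;
   }
   return highest;
}
```

With `num_vaddr = 2` and a large last operand, the loop runs three times and stops at index 4, the last valid slot.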
Marek Olšák
fec9695e67 radv: fix incorrect patch_outputs_read for TCS with dynamic state
Fixes: 8c2f9f0665 - radv: switch to the new TCS LDS/offchip size computation

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>
(cherry picked from commit 4a51089f30)
2025-04-22 18:47:17 +02:00
Rhys Perry
65a50ce376 aco: combine VALU lanemask hazard into VALUMaskWriteHazard
This is now basically the same as the original VALUMaskWriteHazard, except
it now considers both VALU and SALU writes.

Now that it's a part of VALUMaskWriteHazard, differences from the original
VALU lanemask workaround are:
- it includes SALU reads after the write
- it includes VALU writes and SALU/VALU reads after the write which are
  not lanemasks
- it combines s_waitcnt_depctr instructions when it's a read after both a
  SALU write and a VALU write
- non-exec VALU SGPR reads reset the SGPRs read by VALU as a lanemask
- exec SGPRs are ignored

resolve_all_gfx11() is also finished.

fossil-db (navi31):
Totals from 21538 (27.13% of 79377) affected shaders:
Instrs: 27628855 -> 27552972 (-0.27%); split: -0.30%, +0.03%
CodeSize: 145968448 -> 145667616 (-0.21%); split: -0.23%, +0.02%
Latency: 209537805 -> 209509519 (-0.01%); split: -0.02%, +0.00%
InvThroughput: 36304270 -> 36301624 (-0.01%); split: -0.01%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12623
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11480
Backport-to: 25.0
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34529>
(cherry picked from commit ce2be5ab8e)
2025-04-22 18:47:10 +02:00
Rhys Perry
2ff09ffbda aco/gfx12: don't use second VALU for VOPD's OPX if there is a WaR
fossil-db (gfx1201):
Totals from 38908 (49.02% of 79377) affected shaders:
Instrs: 30268107 -> 30268131 (+0.00%); split: -0.00%, +0.00%
CodeSize: 180843648 -> 180843640 (-0.00%); split: -0.00%, +0.00%
Latency: 224905962 -> 224906072 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 44322988 -> 44323004 (+0.00%)
VALU: 15124145 -> 15124167 (+0.00%)
VOPD: 4018504 -> 4018482 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34246>
(cherry picked from commit 408fa33c09)
2025-04-22 18:47:09 +02:00
Patrick Lerda
a153a481cc mesa_interface: fix legacy dri2 compatibility
These values are shared with xcb/dri2.h, and can't be changed
without breaking the legacy dri2 compatibility. This change
partially reverts the update made by 3b603d1646.

For instance this issue is triggered on dri2 i915 with
"piglit/bin/glx-copy-sub-buffer -auto" or
"piglit/bin/hiz-depth-read-window-stencil0 -auto".

Fixes: 3b603d1646 ("mesa_interface: remove unused stuff")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34561>
(cherry picked from commit 60a31156b0)
2025-04-22 18:47:02 +02:00
Mike Blumenkrantz
8a4f7476d7 zink: verify that surface exists when adding implicit feedback loop
this can be null if multiple contexts are in use

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34557>
(cherry picked from commit de6efc01c1)
2025-04-22 18:47:00 +02:00
Eric Engestrom
45aa964eb8 pick-ui: make Backport-to: 25.0 backport to 25.0 *and more recent release branches*
It is what developers expect, so make the code match it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34580>
(cherry picked from commit c37a468a8a)
2025-04-22 18:46:38 +02:00
Eric Engestrom
35d5005925 .pick_status.json: Update to 5f3a3740dc 2025-04-22 18:46:36 +02:00
Eric Engestrom
310da5f30b docs: add sha sum for 25.0.4 2025-04-17 02:22:01 +02:00
Eric Engestrom
d0f8720019 VERSION: bump for 25.0.4 2025-04-17 02:04:03 +02:00
Eric Engestrom
bd6a277901 docs: add release notes for 25.0.4 2025-04-17 02:04:03 +02:00
Pierre-Eric Pelloux-Prayer
4437cdabf0 winsys/amdgpu: disable VM_ALWAYS_VALID
The referenced commit has been identified as the root cause of
graphic artifacts / hangs on some APUs.

For now disable AMDGPU_GEM_CREATE_VM_ALWAYS_VALID on all chips
except when user queues are used.

See https://gitlab.freedesktop.org/mesa/mesa/-/issues/12809.

Fixes: 8c91624614 ("winsys/amdgpu: use VM_ALWAYS_VALID for all VRAM and GTT allocations")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34547>
(cherry picked from commit 555821ff93)
2025-04-17 01:24:17 +02:00
David Rosca
0e9f94576f radeonsi/vpe: Use float division to get scaling ratio
Fixes: e85a6b6a63 ("radeonsi/vpe: check reduction ratio")
Reviewed-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34519>
(cherry picked from commit bd6f9e8aee)
2025-04-17 01:24:17 +02:00
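Commit 0e9f94576f has no body, so the following is only a guess at the kind of problem the title points to: a ratio computed with integer division is truncated before it is compared against a limit. The names and the limit of 4.0 are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit for the sketch; not taken from the driver. */
#define TOY_MAX_REDUCTION 4.0f

/* Integer division truncates, so 4.79x is treated as 4x and accepted. */
static bool toy_ratio_ok_int(unsigned src, unsigned dst)
{
   assert(dst != 0);
   return (float)(src / dst) <= TOY_MAX_REDUCTION;
}

/* Float division keeps the fractional part of the ratio. */
static bool toy_ratio_ok_float(unsigned src, unsigned dst)
{
   assert(dst != 0);
   return (float)src / (float)dst <= TOY_MAX_REDUCTION;
}
```

For a 1919 to 400 reduction, the integer variant accepts the ratio while the float variant rejects it.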
Marek Olšák
ba2a1ba2e5 ac/surface: select 3D tile mode without overallocating too much for gfx6-8
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
(cherry picked from commit 78cacfd9ce)
2025-04-17 01:24:17 +02:00
Marek Olšák
48bfe6dbfd ac/surface: make gfx12_estimate_size reusable by gfx6
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
(cherry picked from commit 195e7b4f75)
2025-04-17 01:24:16 +02:00
Ryan Mckeever
651c53fc1f pan/format: Update format flags to follow HW spec
Fixes: 861e7dca ("panfrost: Switch formats to table")

Signed-off-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33787>
(cherry picked from commit b9a9798c46)
2025-04-16 15:52:03 +02:00
Eric Engestrom
9cbca28609 .pick_status.json: Update to 555821ff93 2025-04-16 15:50:33 +02:00
Kenneth Graunke
bb83fd7ac0 brw: Don't assert about MAX_VGRF_SIZE in brw_opt_split_virtual_grfs()
This allows us to create temporary VGRFs that are larger than
MAX_VGRF_SIZE(devinfo), which will be split eventually.  They may not
be split on the initial pass, because we may need LOAD_PAYLOAD lowering,
copy propagation, and so on to occur first.  So we allow registers to
exceed that size initially.

The "Register allocation relies on split_virtual_grfs()" assertion in
brw_reg_allocate.cpp still asserts that all VGRFs which reach the
register allocator have been properly split.

One case where this is useful is for vectorizing convergent block loads.
We create temporaries to splat the SIMD1 values out to SIMD(N), which
can lead to some very large temporaries.  However, copy propagation and
so on ultimately eliminate these and they'll get split down to proper
sizes or elided entirely in the end.

(Note: both this and the prior commits from this merge request are
 needed to close the linked issue.)

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12324
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit eb1ec9cf8e)
2025-04-16 15:37:06 +02:00
Kenneth Graunke
7a588a5a8e brw: Use live->max_vgrf_size in pre-RA scheduling
Post-RA scheduling doesn't use liveness analysis, so we continue using
MAX_VGRF_SIZE(devinfo).  But for pre-RA scheduling, we now use
live->max_vgrf_size.

This helps get us to a place where we can emit arbitrarily large VGRFs
early on in compilation, but which will be split and cleaned up prior to
register allocation.  It may also allocate smaller arrays in practice
since MAX_VGRF_SIZE(devinfo) assumes the worst case scenario for things
we actually could need to allocate.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit a45583f078)
2025-04-16 15:37:06 +02:00
Kenneth Graunke
0d1e83ca6a brw: Use live->max_vgrf_size in register coalescing
We already require liveness, so just use the actual maximum size we saw
instead of a hardcoded pessimal size.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit 4b27b5895c)
2025-04-16 15:37:05 +02:00
Kenneth Graunke
c906f565b6 brw: Track the largest VGRF size in liveness analysis
We're already looking at this data to calculate the per-component
vars_from_vgrf[] and vgrf_from_vars[] mappings, so just record the
largest VGRF size while we're here.  This will allow passes to size
arrays based on the actual size needed, rather than hardcoding some
fixed size.  In many cases, MAX_VGRF_SIZE(devinfo) is larger than
necessary, because e.g. vec5 sparse sampling results aren't used.
Not hardcoding this means we can also temporarily handle very large
VGRFs which we know will be split eventually, without having to
increase the maximum which is ultimately used for RA classes.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit ea468412f6)
2025-04-16 15:37:05 +02:00
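The idea in c906f565b6, recording the largest VGRF size during a walk that already visits every VGRF, reduces to a running maximum. A minimal sketch, assuming a hypothetical `alloc_sizes` array holding one size per VGRF; the real liveness pass does considerably more in the same loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input: alloc_sizes[i] is the size of VGRF i in registers. */
static unsigned toy_max_vgrf_size(const unsigned *alloc_sizes, size_t num_vgrfs)
{
   unsigned max_size = 0;

   assert(alloc_sizes || num_vgrfs == 0);

   for (size_t i = 0; i < num_vgrfs; i++) {
      /* Every VGRF is visited anyway to build the per-component
       * mappings, so tracking the maximum costs almost nothing. */
      if (alloc_sizes[i] > max_size)
         max_size = alloc_sizes[i];
   }
   return max_size;
}
```

Later passes can then size their scratch arrays from the returned value instead of a fixed worst-case constant.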
Erik Faye-Lund
6c6c6873c4 panvk: claim official conformance on v10
It's official, PanVK is Vulkan 1.1 conformant on v10. Let's make this
clear.

Backport-to: 25.0
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34500>
(cherry picked from commit 65b7d2e865)
2025-04-16 15:37:05 +02:00
Erik Faye-Lund
238399e93a panvk: set shared_addr_format
We need to set this, otherwise we end up failing tests.

Fixes: 4e111c259c ("panvk: Lower shared memory")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34514>
(cherry picked from commit e77a815299)
2025-04-16 15:37:05 +02:00
Marek Olšák
1fe9f5d3ac radeonsi: add ACO-specific main shader parts
We can't have merged shaders where the first part is compiled using ACO
and the second part is compiled using LLVM.

Add ACO-specific main shader parts to fix that.

This happens when ACO is enabled for gfx12 streamout where GS can be paired
with a previous shader compiled by LLVM.

Fixes: 8ba718fb7d - radeonsi/gfx12: use ACO for streamout because it's faster

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491>
(cherry picked from commit 7f7d6deb18)
2025-04-16 15:37:05 +02:00
Marek Olšák
15ea052c20 radeonsi: make si_shader_selector::main_shader_part_* an iterable union
for the next commit

Fixes: 8ba718fb7d - radeonsi/gfx12: use ACO for streamout because it's faster

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491>
(cherry picked from commit 4865ac57cc)
2025-04-16 15:37:05 +02:00
Jose Maria Casanova Crespo
9babb23138 v3dv: avoid TFU reading unmapped pages beyond the end of the buffers
The TFU unit does a readahead of 64 bytes. This causes invalid-read
MMU errors that can be observed in the nightly full Vulkan runs on
Broadcom devices.

04:13:59.969: [   85.623205] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x4869000, pte invalid
04:14:05.408: [   91.019321] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x5209000, pte invalid
04:14:05.413: [   91.031662] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x7521000, pte invalid

Although the log reports the TLB, the real culprit is the TFU. A fix
to the kernel was submitted to correct the AXI ID on V3D 4.2 and 7.1.

So over-allocating 64 bytes in v3dv_AllocateMemory is the simplest
way to make these MMU errors disappear.

Running ./deqp-vk for an hour, we can see that ~40% of allocations
would need an extra page (4096 bytes) to accommodate this 64-byte
padding.
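
The arithmetic behind the extra page can be sketched as follows. This is
only an illustration using the 64-byte readahead and 4096-byte page size
quoted above; the helper names are hypothetical and are not the v3dv
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TFU_READAHEAD_BYTES 64u   /* readahead size quoted in the message */
#define PAGE_SIZE_BYTES     4096u /* page size quoted in the message */

/* Round size up to a multiple of the page size. */
uint64_t
align_to_page(uint64_t size)
{
   return (size + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES * PAGE_SIZE_BYTES;
}

/* Size backed by pages once the readahead padding has been added. */
uint64_t
padded_alloc_size(uint64_t requested)
{
   return align_to_page(requested + TFU_READAHEAD_BYTES);
}

/* True when the padding pushes the allocation onto one more page. */
bool
padding_needs_extra_page(uint64_t requested)
{
   return padded_alloc_size(requested) > align_to_page(requested);
}
```

Only requests whose size ends within the last 64 bytes of a page (or
exactly on a page boundary) pay for an additional page.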

Fixes: ca330f7f04 ("v3dv: implement VK_EXT_memory_budget")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34475>
(cherry picked from commit 0bcb82048c)
2025-04-16 15:37:04 +02:00
Mike Blumenkrantz
31e9893f64 zink: stop setting ArrayStride on image arrays
this is illegal

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651>
(cherry picked from commit b4e3535650)
2025-04-16 15:37:04 +02:00
Mike Blumenkrantz
0f3b6ba7ad zink: don't set shared block stride without KHR_workgroup_memory_explicit_layout
this is illegal

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651>
(cherry picked from commit 1c0de360bc)
2025-04-16 15:37:04 +02:00
Eric R. Smith
5a685929d3 panfrost: fix transaction elimination crc valid calculation
The setting of the clean_pixel_write_enable flag in pan_prepare_rt
was not consistent with the crc valid calculations in pan_emit_fbd.
This caused the crc_valid flag to not be accurate, causing transaction
elimination to fail.

Fixes: eac8f1d460 ("Revert "panfrost: Disable CRC by default"")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34408>
(cherry picked from commit 69a6db4b2b)
2025-04-16 15:37:04 +02:00
Erik Faye-Lund
27342a5532 nir/lower_tex: use texture_mask instead of shifting on use
In commit 292ac71a4a ("nir/lower_tex: handle deref casts"), we avoided
using texture_index when a texture instruction contained a variable
deref. There's no good reason why this should be done to some of the
lowering, but not all.

So let's fix up code-paths that were added after this change to do the
same.

The first two patches here crossed paths with the commit that introduced
texture_mask, so it's not strange that the change was missed. The last
one seems to have just copied what was done around it, propagating the
issue.

Fixes: 880b00dc59 ("nir/lower_tex: Add support for lowering YUYV formats")
Fixes: 1358d93650 ("nir/lower_tex: Add support for lowering Y41x formats")
Fixes: 65d6f5aed2 ("nir: add options to lower y_vu, yv_yu, yx_xvxu and xy_vxux")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>
(cherry picked from commit 41b136f674)
2025-04-16 15:37:04 +02:00
Faith Ekstrand
5d6c82000c nil: Multiply by array_stride_B instead of adding
Fixes: 5577128c83 ("nil: Rewrite the TIC code in Rust")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34495>
(cherry picked from commit fadac25b0c)
2025-04-16 15:37:04 +02:00
Faith Ekstrand
ea963009f0 nvk/nvkmd: Check the correct flag for the Kepler GART workaround
Fixes: 1db57bb414 ("nvk/nvkmd: Rework memory placement flags")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34495>
(cherry picked from commit 5c81b3546f)
2025-04-16 15:37:04 +02:00
Caio Oliveira
aedb7eb700 nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset
Otherwise this would require combining two values to produce a single
(new bit-size) channel, which vectorize_stores() doesn't handle.  The pass
can still keep trying smaller bit-sizes.
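
A minimal sketch of the alignment test, assuming high_offset is the byte
offset at which the second access starts relative to the first. The
function names are hypothetical and the real pass applies further
constraints that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if a channel of new_bit_size bits never straddles the
 * boundary between the two accesses being combined.
 */
bool
bit_size_aligned_with_high_offset(unsigned new_bit_size,
                                  unsigned high_offset_bytes)
{
   assert(new_bit_size >= 8 && new_bit_size % 8 == 0);
   return (high_offset_bytes * 8u) % new_bit_size == 0;
}

/* Try the widest candidate first and fall back to smaller bit sizes. */
unsigned
pick_bit_size(unsigned high_offset_bytes)
{
   static const unsigned candidates[] = { 64, 32, 16, 8 };

   for (unsigned i = 0; i < 4; i++) {
      if (bit_size_aligned_with_high_offset(candidates[i], high_offset_bytes))
         return candidates[i];
   }
   return 0; /* not reached: 8-bit channels align with any byte offset */
}
```

With a second access starting 2 bytes in, 64-bit and 32-bit channels
would each need bits from both values, so the sketch settles on 16 bits.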

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12946
Fixes: ce9205c03b ("nir: add a load/store vectorization pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34414>
(cherry picked from commit 2ed79f80ba)
2025-04-16 15:37:03 +02:00
David Rosca
8ffedebf1c radv/video: Fix encode session info for VCN3+
Last dword should be 0.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
(cherry picked from commit 7249d9548e)
2025-04-16 15:37:03 +02:00
David Rosca
15b2a440da radv/video: Fix msg header total size
It also needs to include the codec msg size.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
(cherry picked from commit 34031531fc)
2025-04-16 15:37:03 +02:00
Erik Faye-Lund
b839ea42bf panfrost: fixup typo in 16x sample-pattern
This is an n-queen pattern, where no two values should be on the same
row or column. But this element and the second-to-last one have the same
y component, and neither has the negative one.

Let's fix this up by setting the first value to the negative value. This
matches the D3D 16x sample pattern.
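
The property described above (no two samples sharing a row or a column)
is easy to check mechanically. The following is a small sketch with a
hypothetical `sample_pos` type; it does not use the actual panfrost
sample-position tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sample position on an integer grid. */
struct sample_pos { int x, y; };

/* True if no two samples share an x (column) or a y (row) coordinate. */
bool
is_n_queen_pattern(const struct sample_pos *pos, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      for (size_t j = i + 1; j < count; j++) {
         if (pos[i].x == pos[j].x || pos[i].y == pos[j].y)
            return false;
      }
   }
   return true;
}
```

A duplicated y component, as in the typo being fixed, makes this check
fail.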

Fixes: a61fb62966 ("panfrost: Upload sample positions on device init")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
(cherry picked from commit b4ebffa1aa)
2025-04-16 15:37:03 +02:00
Lionel Landwerlin
f018626745 brw: fix Wa_22013689345 emission
Two problems:
  - not detecting null destination correctly
  - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering
    already happened

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>
(cherry picked from commit 06ad9a25e5)
2025-04-16 15:37:03 +02:00
Lars-Ivar Hesselberg Simonsen
60a2b66f63 vk/sync: Fix execution only barriers
With vkCmdPipelineBarrier, it's possible to specify a barrier with
pipeline stages but without any memory barriers. These might not be
practical, but are legal Vulkan code.

Barriers like this are currently ignored in Mesa, as we only convert
barriers that come with memory barriers into vkCmdPipelineBarrier2.

This commit adds handling of execution only barriers by converting them
into a memory barrier without access masks.
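
The conversion can be sketched roughly as below. The struct and function
are simplified, hypothetical stand-ins rather than the Vulkan types or
the Mesa runtime code; `out` is assumed to have room for at least
max(count, 1) entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for a synchronization2-style barrier. */
struct mem_barrier2 {
   uint64_t src_stage_mask;
   uint64_t dst_stage_mask;
   uint64_t src_access_mask;
   uint64_t dst_access_mask;
};

/*
 * Translate a legacy barrier call. access_pairs[i] holds the source and
 * destination access masks of the i-th legacy memory barrier. When the
 * caller passed none, one barrier carrying only the stage masks is
 * emitted so the execution dependency is not lost. Returns the number
 * of entries written to out[].
 */
unsigned
convert_legacy_barrier(uint64_t src_stages, uint64_t dst_stages,
                       const uint64_t (*access_pairs)[2], unsigned count,
                       struct mem_barrier2 *out)
{
   assert(out != NULL);

   if (count == 0) {
      out[0] = (struct mem_barrier2) {
         .src_stage_mask = src_stages,
         .dst_stage_mask = dst_stages,
         /* access masks stay zero: execution dependency only */
      };
      return 1;
   }

   for (unsigned i = 0; i < count; i++) {
      out[i] = (struct mem_barrier2) {
         .src_stage_mask = src_stages,
         .dst_stage_mask = dst_stages,
         .src_access_mask = access_pairs[i][0],
         .dst_access_mask = access_pairs[i][1],
      };
   }
   return count;
}
```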

Fixes: 97f0a4494b ("vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34187>
(cherry picked from commit 20c0d169e4)
2025-04-16 15:37:03 +02:00
Tapani Pälli
f3db21ec11 mesa: various fixes for ClearTexImage/ClearTexSubImage
Fixes some upcoming CTS tests for texture clears.

* some drivers will attempt to issue clears with zero range
  and hit asserts/crashes (spec clarification for negative
  values)

* fix error thrown with negative values to match spec

* fix cases for clearing generic compressed formats

* fix negative case of using color format while having
  depth/stencil internalformat and vice versa

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>
(cherry picked from commit 30d78dc942)
2025-04-16 15:37:02 +02:00
Tapani Pälli
0824f95f92 mesa: clamp texbuf query size to MAX_TEXTURE_BUFFER_SIZE
Fixes upcoming CTS test checking for clamping.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>
(cherry picked from commit 3bc016bb6c)
2025-04-16 15:37:02 +02:00
Lionel Landwerlin
499324de9b anv: fix self dependency computation
Some upcoming changes in the runtime will make it impossible to rely
on the pipeline or runtime information to know whether a fragment
shader has input attachments.

Instead we gather that information at compile time and store it in our
shader bind_map.

At runtime we check whether the fragment shader has input attachments
and whether those map to the runtime depth/stencil input attachments
to set the 3DSTATE_PS_EXTRA::PixelShaderKillsPixel.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5a7 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
(cherry picked from commit e321c438dc)
2025-04-16 15:37:02 +02:00
Boris Brezillon
fc46313072 vk/pass: Add input attachment location info
For drivers using the render pass emulation provided by the
runtime, it's important to express the mapping between
depth/stencil/color attachments and input attachments using
VkRenderingInputAttachmentIndexInfoKHR, otherwise those drivers
have to special-case emulated render passes in their
CmdBeginRendering() implementation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
(cherry picked from commit be2532fc00)
2025-04-16 15:37:02 +02:00
Boris Brezillon
b9d5a60d10 vulkan/state: Fix input attachment map state initialization/copy
vk_dynamic_graphics_state_copy() is not copying the input attachment
map, and color_attachment_count is not initialized in
vk_dynamic_graphics_state_init_ial().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
(cherry picked from commit 38e546c202)
2025-04-16 15:37:02 +02:00
Alyssa Rosenzweig
c4ef3cb651 panfrost: do not push "true" UBOs
Panfrost supports pushing uniforms to hardware uniform registers (RMU/FAU for
Midgard/Bifrost respectively). Since OpenGL uniforms are lowered to UBO #0, it
does this with a pass that pushes UBOs. That's good!

The pass also pushes 'true' OpenGL UBOs, since they look the same in the backend
at this point. This is where the trouble comes in:

- True UBOs are allocated in GPU BOs, not CPU allocated buffers. That means it's
  write-combine memory, which we cannot read from efficiently (at least
  depending on coherency details that were never plumbed through panfrost.ko and
  unlikely to be replumbed now that panthor is the new hot stuff). So, pushing
  true UBOs reduces GPU overhead at the cost of tremendous CPU overhead. This is
  dubious... When I benchmarked this on MT8192 in early 2023, this pushing
  improved FPS in SuperTuxKart but hurt FPS in Dolphin.

- True UBOs can be written on the GPU. In OpenGL, we have batch tracking
  infrastructure to sort this mess out in theory. What this means is that
  pushing UBOs requires us to flush writers AND STALL at draw-time. If this is
  ever hit, our performance is utterly trashed. But it gets worse.

- True UBOs can be written in the same batch that reads them. For example, we
  could bind a buffer as a transform feedback buffer, do a draw with XFB, then
  rebind as a UBO and do a draw reading. This is where we collapse -- our logic
  will flush the writer, which is the same batch we were in the middle of
  enqueueing a draw to. When we try to push words, we'll crash with theatrics.
  This could be solved by smartening the batch tracking logic but it's not
  trivial by any means.

So, pushing true UBOs on the CPU is broken and can hurt performance. Stop doing
it!

Long term, the solution will be to push on the GPU instead. This avoids all of
these issues. This can be done with a compute kernel or with CSF instructions.
The Vulkan driver will likely have to do this for performance, since pushing
UBOs from the CPU is utterly broken in Vulkan for the above reasons.

I have a branch somewhere doing this on v9 but I'm doing this on NIR time to
unblock a core change that was crashing piglit due to this pile of unsoundness.
Let's fix the correctness issues first, then someone can look at recovering
performance later when we're not blocking unrelated work.

Fixes corruption in Piglit test
gles-3.0-transform-feedback-uniform-buffer-object, which writes a UBO with
transform feedback. (I suspect the test still doesn't pass for the same reason
it's broken on other tilers. But that's a better place to be than oodles of
memory corruption.)

According to CI, fixes spec@arb_uniform_buffer_object@rendering{-dsa}-offset.

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit 59a3e12039)
2025-04-15 23:54:48 +02:00
Caterina Shablia
e98a912791 panfrost: update nr_uniform_buffers before dispatching XFB
Currently nr_uniform_buffers will be whatever the previous draw set
for its vertex shader, which is not what the XFB shader usually
expects.

Fixes: c246af0d ("panfrost: Only upload UBOs when needed")

Cc: mesa-stable

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit 2c75b6bb01)
2025-04-15 23:54:47 +02:00
Caterina Shablia
aed66adbd2 panfrost: don't overwrite push uniforms and sysvals UBO with user's UBO
ss->info.ubo_mask includes the push+sysval UBO so if there's a user
UBO bound at the same index as the push+sysval UBO, without this
change we end up writing a descriptor for the user UBO at that index.
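
One way to picture the fix is to mask out the reserved slot before
walking the UBO mask. This is a sketch under that assumption only; the
function name and the `reserved_index` parameter are hypothetical and do
not reflect the actual panfrost code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Collect the UBO slots that should receive a user descriptor. The slot
 * reserved for push uniforms and sysvals is masked out first so a user
 * UBO bound at the same index cannot replace it. Returns the slot count.
 */
unsigned
collect_user_ubo_slots(uint32_t ubo_mask, unsigned reserved_index,
                       unsigned slots[32])
{
   assert(reserved_index < 32);

   uint32_t user_mask = ubo_mask & ~(UINT32_C(1) << reserved_index);
   unsigned count = 0;

   for (unsigned i = 0; i < 32; i++) {
      if (user_mask & (UINT32_C(1) << i))
         slots[count++] = i;
   }
   return count;
}
```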

Fixes: 3b3cd59f ("panfrost: Launch transform feedback shaders")

Cc: mesa-stable

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit 6948ab727f)
2025-04-15 23:54:46 +02:00
Alyssa Rosenzweig
5ad25a98ef panfrost: invert and rename no_ubo_to_push flag
only the GL driver actually wants this, neither panvk nor internal shaders do.

Cc'd as a prereq to the next patch

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit f179f6952f)
2025-04-15 23:54:45 +02:00
Eric Engestrom
4fde719367 .pick_status.json: Update to 58321cf2e5 2025-04-15 23:49:17 +02:00
Samuel Pitoiset
3c6e241f0d radv: apply the workaround for buggy HiZ/HiS on GFX12 for DGC
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit d2da54e6f3)
2025-04-15 17:24:55 +02:00
Samuel Pitoiset
3c932e7824 radv: add a workaround for buggy HiZ/HiS on GFX12
HiZ/HiS is buggy and can cause random GPU hangs when stencil is enabled.
There are basically two alternatives, but RADV follows RadeonSI and emits
a dummy RELEASE_MEM packet after every draw, which should work around the
issue and maintain performance.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12944
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit 6388db03c8)
2025-04-15 17:24:05 +02:00
Samuel Pitoiset
5449bd2eb7 radv: determine if HiZ/HiS is enabled earlier on GFX12
To lower CPU overhead of the hardware workaround.

Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit 11b6d2ba60)
2025-04-10 18:06:29 +02:00
Patrick Lerda
9315eb140f i915: fix draw_create_fragment_shader() related memory leak
For instance, this issue is triggered with "piglit/bin/fcc-blit-between-clears -auto -fbo":
Direct leak of 16400 byte(s) in 5 object(s) allocated from:
    #0 0xb720689a in __interceptor_calloc (/usr/lib/libasan.so.6+0xb289a)
    #1 0xaf10f896 in draw_create_fragment_shader ../src/gallium/auxiliary/draw/draw_fs.c:47
    #2 0xaef64619 in i915_create_fs_state ../src/gallium/drivers/i915/i915_state.c:550
    #3 0xae16a955 in ureg_create_shader ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2194
    #4 0xae17f45f in ureg_create_shader_with_so_and_destroy ../src/gallium/auxiliary/tgsi/tgsi_ureg.h:150
    #5 0xae17f45f in ureg_create_shader_and_destroy ../src/gallium/auxiliary/tgsi/tgsi_ureg.h:159
    #6 0xae17f45f in util_make_fs_blit_zs ../src/gallium/auxiliary/util/u_simple_shaders.c:365
    #7 0xaf13300e in blitter_get_fs_texfetch_depth ../src/gallium/auxiliary/util/u_blitter.c:1157
    #8 0xaf13300e in util_blitter_cache_all_shaders ../src/gallium/auxiliary/util/u_blitter.c:1322
    #9 0xaef6b738 in i915_create_context ../src/gallium/drivers/i915/i915_context.c:233
    #10 0xacb33c49 in st_api_create_context ../src/mesa/state_tracker/st_manager.c:986
    #11 0xac845740 in dri_create_context ../src/gallium/frontends/dri/dri_context.c:178
    #12 0xac854d97 in driCreateContextAttribs ../src/gallium/frontends/dri/dri_util.c:631
    #13 0xb6ce79a3 in dri2_create_context_attribs ../src/glx/dri2_glx.c:240
    #14 0xb6c9606f in dri_common_create_context ../src/glx/dri_common.c:665
    #15 0xb6ca4f00 in CreateContext ../src/glx/glxcmds.c:322
    #16 0xb6ca5c0b in glXCreateNewContext ../src/glx/glxcmds.c:1449

Fixes: 1a69b50b3b ("i915g: Fix point sprites.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
(cherry picked from commit f0cfc1bbdc)
2025-04-10 17:12:25 +02:00
Patrick Lerda
737b18393b i915: fix nir_to_tgsi() related memory leak
For instance, this issue is triggered with "piglit/bin/glx-multithread-texture -auto -fbo":
Direct leak of 256 byte(s) in 1 object(s) allocated from:
    #0 0xb71eda62 in __interceptor_realloc (/usr/lib/libasan.so.6+0xb2a62)
    #1 0xadd5a32f in tokens_expand ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:239
    #2 0xadd5a32f in get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:262
    #3 0xadd62519 in copy_instructions ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2079
    #4 0xadd62519 in ureg_finalize ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2129
    #5 0xadd64bde in ureg_get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2206
    #6 0xade377d0 in nir_to_tgsi_options ../src/gallium/auxiliary/nir/nir_to_tgsi.c:4043
    #7 0xade3da63 in nir_to_tgsi ../src/gallium/auxiliary/nir/nir_to_tgsi.c:3831
    #8 0xaeb606c9 in i915_create_vs_state ../src/gallium/drivers/i915/i915_state.c:662
    #9 0xac781a2c in st_create_common_variant ../src/mesa/state_tracker/st_program.c:720
    #10 0xac78e8a4 in st_get_common_variant ../src/mesa/state_tracker/st_program.c:773
    #11 0xac78fc10 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1259
    #12 0xac78fc10 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1345
    #13 0xac790b1a in st_program_string_notify ../src/mesa/state_tracker/st_program.c:1378
    #14 0xace457a9 in _mesa_get_fixed_func_vertex_program ../src/mesa/main/ffvertex_prog.c:1397
    #15 0xac5ef8db in update_program ../src/mesa/main/state.c:281
    #16 0xac5f0ece in _mesa_update_state_locked ../src/mesa/main/state.c:560
    #17 0xac5f1653 in _mesa_update_state ../src/mesa/main/state.c:593
    #18 0xacdf9fe2 in _mesa_DrawArrays ../src/mesa/main/draw.c:1403

Fixes: 487a493325 ("i915g: Add support for per-vertex point size.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
(cherry picked from commit 5af5f508b1)
2025-04-10 17:12:25 +02:00
Patrick Lerda
35ad8014cf i915: fix slab_create() related memory leaks
For instance, this issue is triggered with "piglit/bin/fcc-blit-between-clears -auto -fbo":
Direct leak of 836 byte(s) in 1 object(s) allocated from:
    #0 0xb71eb6f2 in malloc (/usr/lib/libasan.so.6+0xb26f2)
    #1 0xaefadc78 in slab_add_new_page ../src/util/slab.c:179
    #2 0xaefadc78 in slab_alloc ../src/util/slab.c:221
    #3 0xaef7d461 in i915_texture_transfer_map ../src/gallium/drivers/i915/i915_resource_texture.c:789
    #4 0xac9e931e in pipe_texture_map ../src/gallium/auxiliary/util/u_inlines.h:555
    #5 0xac9e931e in _mesa_map_renderbuffer ../src/mesa/main/renderbuffer.c:494
    #6 0xad49c5e4 in readpixels_memcpy ../src/mesa/main/readpix.c:260
    #7 0xad49c5e4 in _mesa_readpixels ../src/mesa/main/readpix.c:898
    #8 0xad5d8cfe in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:568
    #9 0xad4a0caf in read_pixels ../src/mesa/main/readpix.c:1199
    #10 0xad4a0caf in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1216
    #11 0xad4a155b in _mesa_ReadPixels ../src/mesa/main/readpix.c:1231

or "piglit/bin/fcc-read-to-pbo-after-clear -auto":
Direct leak of 772 byte(s) in 1 object(s) allocated from:
    #0 0xb726b6f2 in malloc (/usr/lib/libasan.so.6+0xb26f2)
    #1 0xaf0adc88 in slab_add_new_page ../src/util/slab.c:179
    #2 0xaf0adc88 in slab_alloc ../src/util/slab.c:221
    #3 0xaf07aad7 in i915_buffer_transfer_map ../src/gallium/drivers/i915/i915_resource_buffer.c:75
    #4 0xad10de74 in pipe_buffer_map_range ../src/gallium/auxiliary/util/u_inlines.h:398
    #5 0xad10de74 in _mesa_bufferobj_map_range ../src/mesa/main/bufferobj.c:499
    #6 0xad5677ce in _mesa_map_pbo_dest ../src/mesa/main/pbo.c:308
    #7 0xad59be3b in _mesa_readpixels ../src/mesa/main/readpix.c:894
    #8 0xad6d8cfe in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:568
    #9 0xad5a0caf in read_pixels ../src/mesa/main/readpix.c:1199
    #10 0xad5a0caf in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1216
    #11 0xad5a155b in _mesa_ReadPixels ../src/mesa/main/readpix.c:1231

Fixes: e7a73b75a0 ("gallium: switch drivers to the slab allocator in src/util")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
(cherry picked from commit 92802ea90a)
2025-04-10 17:12:25 +02:00
Ian Romanick
3e789ce50d brw/nir: Use offset() for all uses of offs in emit_pixel_interpolater_alu_at_offset
This is necessary to appropriately uniformize the first component
access of a convergent vector. Without this, this is produced:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

This is the correct code:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

Without 38b58e286f, the code generated was more incorrect, but happened
to work for this test case:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+0.4<0>:F, 0.5f

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 38b58e286f ("brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset")
Closes: #12969
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34427>
(cherry picked from commit cb69d019cf)
2025-04-10 17:12:25 +02:00
Patrick Lerda
885b1cfd36 i915: fix i915_set_vertex_buffers() related refcnt imbalance and remove redundancies
Indeed, this resource was assigned twice and was not properly freed.

For instance, this issue is triggered with:
"piglit/bin/glsl-fs-pointcoord -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.

Fixes: 0278d1fa32 ("gallium: add unbind_num_trailing_slots to set_vertex_buffers")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27572>
(cherry picked from commit 22c399320b)
2025-04-10 17:12:25 +02:00
Faith Ekstrand
217ed7f108 nak: Allow predicates in nir_intrinsic_as_uniform
As of 76e542e92a ("nak: Add nak_nir_mark_lcssa_invariants"), we can
now get predicates as inputs to as_uniform.  We can't assume the result
will always be a UGPR.

Fixes: 76e542e92a ("nak: Add nak_nir_mark_lcssa_invariants")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12970
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34434>
(cherry picked from commit 4d1399629b)
2025-04-10 17:12:25 +02:00
Natalie Vock
1fdd97ea51 aco: Make private_segment_buffer/scratch_offset per-resume
We need different Temps for each resume shader, because registers aren't
preserved across resume boundaries.

This was likely fine in practice because arg registers are the same for
each shader, but resulted in invalid IR and asserts.

Fixes crashes in Indiana Jones RT with assertions enabled on GFX8.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34114>
(cherry picked from commit 3d8db3cbbb)
2025-04-10 17:12:25 +02:00
Lionel Landwerlin
2e6281ea34 brw: fix shuffle with scalar/uniform index
The fixes commit isn't actually the source of the bug but likely the
biggest enabler because it creates scalar values that more easily end
up in the shuffle operations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1b24612c57 ("brw/nir: Treat load_*_uniform_block_intel as convergent")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12927
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12688
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12570
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12905
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12734
Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34393>
(cherry picked from commit 19e4dda9a2)
2025-04-10 17:12:25 +02:00
Alyssa Rosenzweig
1830d233a7 nir/lower_blend: disable logic ops for unsupported formats
Fixes new Vulkan CTS cases on Honeykrisp (and probably panvk and whatever)

dEQP-VK.pipeline.shader_object_unlinked_binary.logic_op_na_formats.*

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>
(cherry picked from commit c23201ad8a)
2025-04-10 17:12:25 +02:00
Alyssa Rosenzweig
41b93a8d0d nir/lower_blend: refactor logicop variables
This pulls the logicop_func variable out of the options struct, so we can
modify it in a central place in the next commit. It also pulls the format
variable out of the options struct: we end up duplicating
options->format[rt] a zillion times, and passing in both an options struct
and a logicop func override is confusing. This makes everything neater and
self-contained for the next commit.

no functional change.

Cc'd to make the next commit cherrypickable.

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>
(cherry picked from commit 54ccc8ed0b)
2025-04-10 17:12:25 +02:00
Felix DeGrood
0a9cb4e833 vk/overlay-layer: fix regression in non-control pathway
Fixes a regression introduced by the prior commit, which fixed the
control pathway for starting overlay-layer but broke the non-control
pathway. Now both pathways should work.

Fixes: 06423b1792 ("defer log creation to swapchain creation")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12884
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34413>
(cherry picked from commit b895c0ec05)
2025-04-10 17:12:25 +02:00
Rob Clark
939859348d tu/vdrm: Fix userspace fence cmds
Somehow the update of the fence value to write was dropped, so the
cmdstream that wrote the fence value would simply write zero over and
over again.

Fixes: 84d6eedd5e ("tu: Refactor the submit path")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
(cherry picked from commit 081869e591)
2025-04-10 17:12:25 +02:00
Patrick Lerda
f7f729287a r600: fix points clipping
This is the backport of eca57f85ee ("radeonsi: fix
gl_ClipDistance and gl_ClipVertex for points").

This change was tested on rv770, palm, barts and cayman. It
fixes 450 khr-gl tests and 64 khr-gles tests on evergreen
and cayman gpus. Here is the list:
spec/glsl-1.20/execution/clipping/vs-clip-vertex-primitives: fail pass
spec/glsl-1.30/execution/clipping/vs-clip-distance-primitives: fail pass
spec/glsl-1.50/execution/compatibility/clipping/gs-clip-vertex-primitives-points: fail pass
khr-gl(3[0-3]|4[0-5])/clip_distance/functional: fail pass
khr-gl(33|4[0-5])/cull_distance/functional_test_item_[0-8]_primitive_mode_points_max_culldist_[0-7]: fail pass
khr-gles3/clip_distance/functional: fail pass
khr-gles3/cull_distance/functional_test_item_[0-8]_primitive_mode_points_max_culldist_[0-7]: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34403>
(cherry picked from commit 58ddf6aaf0)
2025-04-10 17:12:25 +02:00
Patrick Lerda
76c1b49b18 r600: fix pa_su_vtx_cntl rounding mode
This is the backport of 9c49550163. This rounding functionality
is available on all the gpus of the r600 family.

This change was tested on rv770, palm and cayman. This change fixes
at least the "turn-on-off" tests on all these gpus and it does not
add any regression. Here are the tests fixed on palm:
spec/ext_framebuffer_multisample/interpolation 6 centroid-edges: fail pass
spec/ext_framebuffer_multisample/interpolation 8 centroid-edges: fail pass
spec/ext_framebuffer_multisample/turn-on-off 2: fail pass
spec/ext_framebuffer_multisample/turn-on-off 4: fail pass
spec/ext_framebuffer_multisample/turn-on-off 6: fail pass
spec/ext_framebuffer_multisample/turn-on-off 8: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34403>
(cherry picked from commit 8fc01db1ac)
2025-04-10 17:12:25 +02:00
Patrick Lerda
17a744e8e1 r600: fallback to util_blitter_draw_rectangle when required
This is the backport of dc293ffe50 ("radeonsi:
fallback to util_blitter_draw_rectangle").

This change was tested on rv770, palm and cayman. Here is
the test fixed:
spec/ext_framebuffer_blit/fbo-blit-check-limits: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34403>
(cherry picked from commit 4d17f8d10a)
2025-04-10 17:12:25 +02:00
Lars-Ivar Hesselberg Simonsen
7e787e683a panvk: Add barrier for interleaved ZS copy cmds
When executing CopyBufferToImage or CopyImage with multiple regions of
both depth and stencil aspects targeting an interleaved depth stencil
image, we must split the regions into one copy-command for each aspect
and add a barrier between them to avoid a write-after-write race.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 5067921349 ("panvk: Switch to vk_meta")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34384>
(cherry picked from commit 37595775a0)
2025-04-10 17:12:25 +02:00
Marek Olšák
4135fd731b radeonsi: work around a primitive restart bug on gfx10-10.3
Using the GE instead of the VGT register has no effect because it's
the same value. SQ_NON_EVENT is the fix.

Discovered by Samuel Pitoiset.

Cc: mesa-stable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
(cherry picked from commit a82705911e)
2025-04-10 17:12:25 +02:00
Mike Blumenkrantz
510e0cf34f tu: check for valid descriptor set when binding descriptors
these pointers can be null, and they are checked as null in
pipeline layout creation, but here if the pointer is null it will crash

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34412>
(cherry picked from commit b14c8128bf)
2025-04-10 17:12:24 +02:00
Ian Romanick
4480876188 brw/algebraic: Optimize derivative of convergent value
This is mostly defensive. If a convergent value ever ended up as a
source of a DDX or DDY, the eu_emit code will ignore the stride. This
will result in bad code being generated.

No shader-db or fossil-db changes on any Intel platform.

v2: DDX and DDY will always be float, but brw_imm_for_type only works
with integer types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Ken
Fixes: d5d7ae22ae ("brw/nir: Fix up handling of sources that might be convergent vectors")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
(cherry picked from commit dee49f4206)
2025-04-10 17:12:24 +02:00
Ian Romanick
58f3ddadf1 brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset
The source of nir_intrinsic_load_barycentric_at_offset is a vector, so
-1 should be passed to get_nir_src. This is also done for texture
sampling intrinsics.

I skimmed the other users of get_nir_src, and I believe they are
correct. This one was just missed as LNL support landed and many, many
rebases of the original MR occurred.

v2: Fix another get_nir_src call. Suggested by Lionel.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: d5d7ae22ae ("brw/nir: Fix up handling of sources that might be convergent vectors")
Closes: #12464
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
(cherry picked from commit 38b58e286f)
2025-04-10 17:12:24 +02:00
Eric R. Smith
a29fa8d084 panfrost,lima: use index size in panfrost minmax_cache
Bifrost keeps a cache of information about buffers being
used as indices. Unfortunately, it was not keeping information
about the size of the indices (probably because this rarely
changes). If a program deliberately re-interprets the indices
as a different type (e.g. UNSIGNED_INT instead of UNSIGNED_SHORT)
then we will use incorrect values from the cache. This actually
showed up in a test program we were running.

Fix by saving the index size in the cache key.
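
As a rough illustration of the fix described above, the sketch below keys a min/max index cache on the index size as well as on the index range. The names (`minmax_cache`, `minmax_cache_get`, `minmax_cache_add`) and the cache capacity are hypothetical and are not taken from the panfrost sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MINMAX_CACHE_SIZE 8 /* hypothetical capacity */

struct minmax_entry {
   uint32_t start, count; /* range of indices covered */
   uint8_t index_size;    /* bytes per index, part of the key */
   uint32_t min, max;     /* cached result for that key */
};

struct minmax_cache {
   struct minmax_entry entries[MINMAX_CACHE_SIZE];
   unsigned used; /* number of valid entries */
   unsigned next; /* slot to overwrite next (round robin) */
};

/* Look up a cached min/max. The index size must match too, so a buffer
 * re-read with a different index type never hits a stale entry. */
static bool
minmax_cache_get(const struct minmax_cache *cache, uint32_t start,
                 uint32_t count, uint8_t index_size,
                 uint32_t *min, uint32_t *max)
{
   for (unsigned i = 0; i < cache->used; i++) {
      const struct minmax_entry *e = &cache->entries[i];
      if (e->start == start && e->count == count &&
          e->index_size == index_size) {
         *min = e->min;
         *max = e->max;
         return true;
      }
   }
   return false;
}

/* Store a result, overwriting the oldest slot once the cache is full. */
static void
minmax_cache_add(struct minmax_cache *cache, uint32_t start, uint32_t count,
                 uint8_t index_size, uint32_t min, uint32_t max)
{
   assert(index_size == 1 || index_size == 2 || index_size == 4);
   struct minmax_entry *e = &cache->entries[cache->next];
   e->start = start;
   e->count = count;
   e->index_size = index_size;
   e->min = min;
   e->max = max;
   cache->next = (cache->next + 1) % MINMAX_CACHE_SIZE;
   if (cache->used < MINMAX_CACHE_SIZE)
      cache->used++;
}
```

With this key, a lookup for the same range with 4-byte indices misses even if a result for 2-byte indices is cached, which is the behaviour the commit message asks for.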

Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34011>
(cherry picked from commit 739da17f6e)
2025-04-10 17:12:24 +02:00
Patrick Lerda
ced585e3f4 r600: fix textures with swizzles limited to zero and one
This issue seems to be specific to textureGather() which could
fail when processing some surfaces. These surfaces are configured
with non-standard one and zero swizzles. The gpu doesn't support
this very specific setup with all the possible hardware formats.
This change selects a compatible configuration when this is
possible.

This change was tested on palm, barts and cayman. This change
fixes the 216 remaining arb_texture_gather tests:
spec/arb_texture_gather/texturegather/.*-zero-.*: fail pass
spec/arb_texture_gather/texturegather/.*-one-.*: fail pass
spec/arb_texture_gather/texturegatheroffset/.*-zero-.*: fail pass
spec/arb_texture_gather/texturegatheroffset/.*-one-.*: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34293>
(cherry picked from commit f0c0997277)
2025-04-10 17:12:24 +02:00
Patrick Lerda
c207f96862 r600: move stores to the end of shader when required
This change is inspired by 1e0e521a7d ("broadcom/compiler:
move stores to the end of shader") and makes the khr cull_distance
tests which were broken after dae57e184a functional again.

Fixes: dae57e184a ("glsl,st/mesa: always lower IO for GLSL, unlower IO for drivers")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34154>
(cherry picked from commit 4c2b2c82b0)
2025-04-10 17:12:24 +02:00
Juan A. Suarez Romero
50594727bd v3dv: don't check if DRM device is master
This was added to ensure we can get its resources, but they can also
be obtained from a non-master device.

Fixes: 2af12c5b36 ("v3dv: Check multiple DRM primary nodes before picking the display fd")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12641
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34366>
(cherry picked from commit 8742927d8f)
2025-04-10 17:12:24 +02:00
Georg Lehmann
4bb8d70fd6 spirv: fix cooperative matrix by value function params
The vtn_ssa_value for a cmat is not backed by a nir_def, but by a nir_variable, so
can't be used directly when calling a function.  In most cases the cmat is used by
reference so code will take the value of deref for it (which is a `nir_def`).

When passing a cooperative matrix to a function by value, let the caller pass the deref
value, and the callee copy to a new local variable from that deref.

Fixes: b98f87612b ("spirv: Implement SPV_KHR_cooperative_matrix")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34364>
(cherry picked from commit 0cad7b0968)
2025-04-10 17:12:24 +02:00
Timothy Arceri
6ae1a65ec5 glsl: fix regression in ubo cloning
Fixes KHR-GL46.layout_binding.block_layout_binding_block_VertexShader
with radeonsi.

Fixes: 2b2132d2ac ("nir: fix uniform cloning helper")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34337>
(cherry picked from commit d8782db3a4)
2025-04-10 17:12:24 +02:00
Benjamin Lee
e8ccf9bd1f panfrost/pps: fix omitting several counters
The cid loop in the previous implementation stopped at n_counters for a
given category, even though cid is a global id that does not start
counting from zero at the beginning of each category. As a result, we
missed most of the counters outside of the first category.
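
The loop problem described above can be sketched as follows. This is a hypothetical reconstruction, not the actual pps driver code: `struct category` and both functions are invented for illustration, and each function only counts how many counters its loop would visit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a counter category. */
struct category {
   unsigned n_counters;
};

/* Buggy pattern: cid is a global id, yet each category's loop stops at
 * that category's own n_counters as if cid restarted from zero. */
static unsigned
visited_counters_buggy(const struct category *cats, size_t n_cats)
{
   unsigned visited = 0;
   unsigned cid = 0;
   for (size_t i = 0; i < n_cats; i++) {
      for (; cid < cats[i].n_counters; cid++)
         visited++;
   }
   return visited;
}

/* Fixed pattern: the end of each category is relative to the global id
 * at which the category starts. */
static unsigned
visited_counters_fixed(const struct category *cats, size_t n_cats)
{
   unsigned visited = 0;
   unsigned cid = 0;
   for (size_t i = 0; i < n_cats; i++) {
      unsigned end = cid + cats[i].n_counters;
      assert(end >= cid); /* no overflow of the global id */
      for (; cid < end; cid++)
         visited++;
   }
   return visited;
}
```

For categories of 4, 3 and 2 counters, the buggy loop visits only the 4 counters of the first category, while the fixed loop visits all 9.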

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 513d1baaea ("pps: Panfrost pps driver")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34202>
(cherry picked from commit 3b66e4a438)
2025-04-10 17:12:24 +02:00
Eric Engestrom
91759a943f .pick_status.json: Update to 2f00daf67a 2025-04-10 17:12:24 +02:00
Eric Engestrom
8614baa5f8 ci: rename ci-tron priority tag to avoid conflict with the generic fdo runners
Otherwise, ci-tron runners with that tag could pick up jobs meant for the fdo
runners, as happened here:
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/73883719

The inverse (fdo runners picking up a job meant for a ci-tron runner) is not
possible though, as ci-tron jobs always include a `farm:$RUNNER_FARM_LOCATION`
tag, so the problem only exists in the other direction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34358>
(cherry picked from commit 6331441e24)
2025-04-10 17:12:24 +02:00
Benjamin Otte
50e0a3933a lavapipe: Don't advertise support for multiplane drm formats
Fixes: bd4f69a0fe
Signed-off-by: Benjamin Otte <otte@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34190>
(cherry picked from commit 0941af995a)
2025-04-10 17:12:24 +02:00
Benjamin Lee
350839a5f1 panvk/csf: fix uninitialized read in utrace_clone_init_builder
Previous code assumed that the caller of utrace_clone_init_builder would
fill some parameters of the builder config, but we were not. Instead,
initialize these from the csif props the same as all the other builder
instances.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 3096cf2a5d ("panvk/csf: flush and process trace events for all cmdbufs")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34270>
(cherry picked from commit e183650aa4)
2025-04-10 17:12:24 +02:00
Ian Romanick
d2e0c22518 brw/algebraic: Constant folding for BROADCAST and SHUFFLE
This prevents assertion failures in brw_eu_emit in a later commit in
this MR. Even though they have not been previously observed, these
assertion failures could happen even without that commit.

No shader-db or fossil-db changes on any Intel platform.

Fixes: 04e1783278 ("brw: Call brw_fs_opt_algebraic less often")

v2: Add SHUFFLE. Suggested by Ken. Fixed indentation.

v3: Update BROADCAST exec_size after rebasing on "brw/build: Use SIMD8
temporaries in emit_uniformize".

v4: Explain why munging the exec_size is correct.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
(cherry picked from commit 8b2be206f3)
2025-04-10 17:12:24 +02:00
Mike Blumenkrantz
c3952af96d gallium/util: check nr_samples in pipe_surface_equal()
this is otherwise broken

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34367>
(cherry picked from commit 12b57b34f8)
2025-04-10 17:12:24 +02:00
Timur Kristóf
a64edc0e3b radv: Call nir_opt_undef too after nir_opt_varyings.
Shaders may have undefined output stores after nir_opt_varyings.
These must be optimized out, otherwise they hit an assertion.

Fixes: 17f6ab28cc
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
(cherry picked from commit ce2138d73a)
2025-04-10 17:12:24 +02:00
Timur Kristóf
6ebea60d73 radv: Use buffers_written mask when gathering XFB info.
We need to enable these buffers regardless of whether or not the
shader actually writes any outputs to them, otherwise we break
XFB queries.

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
(cherry picked from commit 15d0804670)
2025-04-10 17:12:24 +02:00
Timur Kristóf
085ae2607f nir/opt_varyings: Fix assertion when deduplicating TCS outputs.
When deduplicating TCS outputs, we may find outputs that aren't
loaded by the shader itself. This previously hit a bad assertion.

Fixes: c66967b5cb
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12410
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
(cherry picked from commit 96d11d0f56)
2025-04-10 17:12:24 +02:00
Timur Kristóf
370789bcfd nir/xfb: Preserve some xfb information when gathering from intrinsics.
We need to remember which streamout buffers and streams were enabled,
even if the shader doesn't actually write any outputs to them,
because the API requires that we count vertices created by this shader
towards queries against those streams.

That information can be gathered by nir_gather_xfb_info_with_varyings
from the original NIR I/O variables that we get from the frontend,
but it isn't included in any intrinsics so would be otherwise lost here.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>
(cherry picked from commit a29b5857f7)
2025-04-10 17:12:23 +02:00
Jan Alexander Steffens (heftig)
a4c805f0f9 gfxstream: Use proper log format for 32-bit Vulkan
On i686, where VK_USE_64_BIT_PTR_DEFINES is unset and Vulkan handles are
represented as 64-bit integers instead, the code used the wrong format
specifier, causing a build error.
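
A minimal sketch of the portable way to print such a handle, assuming it is held in a `uint64_t`. The helper name `format_handle` and the message text are hypothetical; the actual gfxstream logging call is not shown in this log.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit handle with PRIx64, which expands to the right length
 * modifier on both 32-bit and 64-bit targets. A specifier such as %p or
 * %lx would not match uint64_t on i686. */
static int
format_handle(char *buf, size_t size, uint64_t handle)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "fence 0x%" PRIx64, handle);
}
```

Casting the handle to `uint64_t` and using `PRIx64` (or `PRIu64`) keeps the format string identical across architectures.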

Fixes: 7fb31361f4 ("Handle external fences in vkGetFenceStatus()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34124>
(cherry picked from commit 1deb0536a1)
2025-04-10 17:12:23 +02:00
Georg Lehmann
88ea564ece spirv: clamp/sign-extend non 32bit ldexp exponents
GLSL.std.450 allows any integer size here.
OpenCL only allows i32.
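
A sketch of what the title's "clamp/sign-extend" amounts to, assuming the consumer of the exponent expects a 32-bit signed integer: narrower exponents are sign-extended, wider ones are clamped. The function names are hypothetical, and this is plain C rather than the NIR builder code the commit actually changes.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow exponents: converting a signed 16-bit value to int32_t
 * preserves the value, i.e. it sign-extends. */
static int32_t
exponent_from_i16(int16_t e)
{
   return (int32_t)e;
}

/* Wide exponents: clamp into the int32_t range. This loses nothing in
 * practice, because ldexp already overflows or underflows for exponents
 * far smaller in magnitude than INT32_MAX. */
static int32_t
exponent_from_i64(int64_t e)
{
   if (e > INT32_MAX)
      e = INT32_MAX;
   if (e < INT32_MIN)
      e = INT32_MIN;
   assert(e >= INT32_MIN && e <= INT32_MAX);
   return (int32_t)e;
}
```

Simply truncating a 64-bit exponent instead of clamping could turn a very large positive exponent into a small or negative one, which is why clamping is the safer choice for the wide case.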

Cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34071>
(cherry picked from commit c21a53440f)
2025-04-10 17:12:23 +02:00
Job Noorman
1d1fe5cca3 ir3/ra: assign interval offsets to new defs after shared RA
Shared RA might insert new defs to be handled by regular RA (e.g.,
shared spills). However, their interval offsets were not initialized
which caused their intervals to sometimes be mistakenly matched with
those containing offset 0. Fix this by calling index_merge_sets after
shared RA and modifying that function to only index new defs in that
case.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33319>
(cherry picked from commit a0db2f9737)
2025-04-10 17:12:23 +02:00
Samuel Pitoiset
ef2a5bee7b radv: fix ignoring conditional rendering with vkCmdResolveImage()
This command isn't supposed to be affected by conditional rendering.

This fixes new VKCTS coverage
dEQP-VK.conditional_rendering.conditional_ignore.resolve_image*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34338>
(cherry picked from commit 4d1d6d4147)
2025-04-10 17:12:23 +02:00
Sviatoslav Peleshko
aefa768a8e vulkan/wsi/headless: Remove unnecessary wsi_configure_image()
wsi_configure_image() with the same info is already called by
configure_image() in wsi_swapchain_init(), so this second call is
unnecessary. Furthermore, calling it a second time leaked the
queue family indices array.

Fixes: d4a2c0fc ("vulkan/wsi: add a headless swapchain implementation/option")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12811
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34194>
(cherry picked from commit 64980c4f05)
2025-04-10 17:12:23 +02:00
David Rosca
511a894fd3 radeonsi/vcn: Disable AV1 unidir compound with rate control
It causes significant bitrate overshoot currently.

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34237>
(cherry picked from commit a5edb9faac)
2025-04-10 17:12:23 +02:00
Connor Abbott
773a873c5f tu: Fix layer_count with dynamic rendering + multiview
With "classic" renderpasses, the VkFramebuffer's layerCount must be 1 if
multiview is enabled. We accidentally rely on this to not disable GMEM
for multiview, and possibly for other things too. Apparently the dynamic
rendering equivalent, VkRenderingInfo::layerCount, can be anything when
multiview is enabled, and some CTS tests set it to the number of views.
Sanitize it when constructing the internal framebuffer for dynamic
rendering.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34080>
(cherry picked from commit 15660caa90)
2025-04-10 17:12:23 +02:00
Gurchetan Singh
0a63b3db7a gfxstream: follow the semantics desired by distro VK loader
- vkCreateInstance should return VK_SUCCESS absent a few specific
  conditions
- just don't add any physical devices later

Cc: mesa-stable

Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34090>
(cherry picked from commit 8f003dc2e9)
2025-04-10 17:12:23 +02:00
Gurchetan Singh
f4efaff857 gfxstream: refactor device initialization
Don't add unnecessary logspam if virtgpu isn't present.

Cc: mesa-stable
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34090>
(cherry picked from commit ef84cd928e)
2025-04-10 17:12:23 +02:00
Gurchetan Singh
5ebbbcc878 gfxstream: check device exists before using it
Segfaults in the error case otherwise.

Cc: mesa-stable
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34090>
(cherry picked from commit 5503d97bf6)
2025-04-10 17:12:23 +02:00
Aaron Ruby
34a1404c2d gfxstream: Add common interfaces in the VirtGpuDevice to query DrmInfo
and PciBusInfo

- Advertise the availability of these extensions, fully implemented as
guestOnly features

Reviewed-By: Gurchetan Singh <gurchetansingh@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33363>
(cherry picked from commit 2553d60d47)
2025-04-10 17:12:23 +02:00
Aaron Ruby
cff323108a gfxstream: Make the virtgpu device discovery for LinuxVirtGpu more robust
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33363>
(cherry picked from commit 5d2c0cc526)
2025-04-10 17:12:23 +02:00
Tapani Pälli
d2f3b9e0c7 compiler/glsl: check that bias is not used outside fragment stage
This fixes some upcoming CTS tests that attempt bias usage when
it is not valid per spec.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34285>
(cherry picked from commit b93ea155f9)
2025-04-10 17:12:23 +02:00
Eric Engestrom
8202aa085c .pick_status.json: Update to 7c5389695b 2025-04-10 17:12:23 +02:00
Eric Engestrom
2bde9b1ef7 [25.0 only] update more ci expectations
These changes happened with no mesa code change, only infrastructure
changes, which is really weird, but to be able to move on, let's simply
document the "new normal".

(Was missed in 69d6923cdb)
2025-04-10 17:12:23 +02:00
Eric Engestrom
ff386eba1d docs: add sha sum for 25.0.3 2025-04-02 18:52:26 +02:00
Eric Engestrom
c3afa2a74f VERSION: bump for 25.0.3 2025-04-02 18:35:11 +02:00
Eric Engestrom
8eab11c9ad docs: add release notes for 25.0.3 2025-04-02 18:35:11 +02:00
David Rosca
b592736211 radv: Add radv_format_description to remap 10/12bit formats to 16bit
Remapping was missing for format description which made these formats
effectively unsupported as zero format features were reported.

Fixes: 0098f8ef35 ("radv: Remap 10 and 12 bit formats to 16 bit formats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34274>
(cherry picked from commit 597f13b244)
2025-04-02 14:27:04 +02:00
Samuel Pitoiset
93a4a2ec1b Revert "radeonsi/gfx11: program SAMPLE_MASK_TRACKER_WATERMARK optimally for APUs"
This reverts commit 6ce3a95852.

This likely also causes random GPU hangs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34306>
(cherry picked from commit 5784a36fd1)
2025-04-02 14:24:58 +02:00
Samuel Pitoiset
ccc86bd62e Revert "radv: program SAMPLE_MASK_TRACKER_WATERMARK optimally for GFX11 APUs"
This reverts commit 96e9c3fe77.

This actually causes random GPU hangs like on Phoenix.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12461
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12426
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12692
Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34306>
(cherry picked from commit 64e6e043b3)
2025-04-02 14:24:08 +02:00
Ian Romanick
27ecb47a5a brw/nir: Lower fsign again after last call to brw_nir_optimize
No shader-db or fossil-db changes on any Intel platform.

Fixes: 13332c23 ("intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup")
Closes: #12888
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34251>
(cherry picked from commit e210b79ce3)
2025-04-02 14:22:09 +02:00
Samuel Pitoiset
abb47924db ac/surface: fix selecting preferred alignments for HiZ/HiS on GFX12
VK_MESA_image_alignment_control is used by vkd3d-proton to set
optimal alignments for images. Though, the preferred alignment was
only applied to the surface (or the stencil aspect) but not to the HiZ
surface due to the NULL check.

This caused rendering issues because swizzle modes didn't match.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12831
Fixes: 079f55d405 ("radv: advertise VK_MESA_image_alignment_control on GFX12")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34322>
(cherry picked from commit fac44c0ca0)
2025-04-02 14:18:29 +02:00
Dave Airlie
f94216bb8c nak: add reads after setting writes
Otherwise we schedule this sort of thing wrong,
 r0    = iadd3 r0 c[0x0][0x0] rZ
 r0    = shf.l.w.i32 r0 rZ 0x2
 r0 p0 = iadd3 r0 c[0x1][0x0] rZ

since raw latencies are more important than waw, but we compute a waw
dependency for the first two instructions instead of a raw one, which
would be correct.

Fixes: 2d4e445099 ("nak/calc_instr_deps: Rewrite calc_delays() again")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33573>
(cherry picked from commit 7a55a9afcc)
2025-04-02 14:18:26 +02:00
Eric Engestrom
d020c25bd4 .pick_status.json: Update to 0d2ebca39f 2025-04-02 13:56:30 +02:00
Erik Faye-Lund
2344060c22 mesa/main: fix regression in extension-checking
This condition accidentally got inverted when cleaning up code, whoops.

Fixes: 3251f321b8 ("mesa: some cleanups for texparam extension checks")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34248>
(cherry picked from commit eb82d65a20)
2025-04-02 11:04:13 +02:00
Erik Faye-Lund
e4f5908a57 panvk: check for texture-compression support
We currently just assume that textureCompressionETC2 and
textureCompressionASTC_LDR are always supported. And while that's true
for all the G52s, G610s and G310s we've seen out in the wild, it's not
guaranteed to be true. An SoC vendor might disable support for one of
these formats.

So let's check properly, just for good measure.

Fixes: d970fe2e9d ("panfrost: Add a Vulkan driver for Midgard/Bifrost GPUs")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34206>
(cherry picked from commit e4786cf971)
2025-04-02 11:04:13 +02:00
Taras Pisetskyi
d34e17a9ce anv,driconf: Add sampler coordinate precision workaround for EVE Online
Signed-off-by: Taras Pisetskyi <taras.pisetskyi@globallogic.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12920

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34316>
(cherry picked from commit 04962975fd)
2025-04-02 11:04:13 +02:00
Erik Faye-Lund
29cf4608db panfrost: avoid accidental aliasing
We already have a variable called "alignment" here, and aliasing it
breaks things. Whoops, let's rename the variable to page_size to
avoid this.

Fixes: 22985caf3f ("panfrost: sanity-check alignment")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34156>
(cherry picked from commit 1471279203)
2025-04-02 11:04:13 +02:00
Robert Mader
82304cad79 gallivm: Re-add check for passmgr before disposing it
It can be NULL, but on LLVM >= 15 lp_passmgr_dispose() is
a no-op.

Fixes: 47cd0eee26 (gallivm: create a pass manager wrapper.)

Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34312>
(cherry picked from commit c0ec35bb42)
2025-04-02 11:04:13 +02:00
Rebecca Mckeever
42b6bd48b7 panvk: Remove lower_tg4_broadcom_swizzle from panvk_preprocess_nir()
We are already applying the .bagr swizzle in bifrost_preprocess_nir(), so
remove lower_tg4_broadcom_swizzle from nir_lower_tex_options in
panvk_preprocess_nir to avoid applying the swizzle twice.

Fixes: 4050697a8f ("panvk: So more nir_lower_tex before descriptor lowering")

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34033>
(cherry picked from commit f450807b68)
2025-04-02 11:04:13 +02:00
Jordan Justen
98530340ca intel/dev: Add BMG 0xe211 PCI ID
Backport-to: 25.0
Ref: bspec 68090
Ref: https://patchwork.freedesktop.org/series/146769/
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34249>
(cherry picked from commit d3ec467031)
2025-04-02 11:04:13 +02:00
Dave Airlie
cd8715440a gallivm: check for avx512vbmi and tell LLVM the correct answer.
Some CPUs support other AVX-512 extensions but not VBMI, and Mesa
crashes on those with illegal instructions.

This was reported to Red Hat support.

Cc: mesa-stable
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34282>
(cherry picked from commit 5d6d167a7c)
2025-04-02 11:04:13 +02:00
Pierre-Eric Pelloux-Prayer
c8f9b803fa radeonsi: use composed swizzle in cdna_emu_make_image_descriptor
Otherwise the state swizzle is ignored.

Fixes: 139bc6b813 ("radeonsi: use common build buffer descriptor helpers")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34241>
(cherry picked from commit 7e2c3be454)
2025-04-02 11:04:13 +02:00
Pierre-Eric Pelloux-Prayer
ae02c9a2d5 ac/nir: fix nir_metadata value of ac_nir_lower_image_opcodes
This pass can insert new blocks so 'nir_metadata_control_flow' is not
preserved.

Fixes: eaf98b1422 ("ac/nir: implement image opcode emulation for CDNA, enable it in radeonsi")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34241>
(cherry picked from commit 785df1b980)
2025-04-02 11:04:13 +02:00
Samuel Pitoiset
ffa6fd4bee radv: do not trigger FCE or FMASK decompress on compute queue
A pipeline barrier which contains an image layout transition like
COLOR_ATTACHMENT_OPTIMAL -> TRANSFER_DST_OPTIMAL on compute queue
would just hang. Such a barrier is useless in practice but it's legal.

Prevent GPU hangs by skipping FCE or FMASK_DECOMPRESS when it's not
on the graphics queue.

Fixes dEQP-VK.synchronization2.layout_transition.compute_transition*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
(cherry picked from commit 086f529bbe)
2025-04-02 11:04:13 +02:00
Trigger Huang
1c85e781ce radeonsi: Fix perfcounter start event in si_pc_emit_start
The original typo caused performance counters to send STOP events
instead of START, leading to incorrect profiling data.

Fixes: 1a1138817c ("radeonsi: add a new PM4 helper radeon_event_write")

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34236>
(cherry picked from commit f03b385d4b)
2025-04-02 11:04:13 +02:00
Faith Ekstrand
7d3212729b nvk: Disable 32k images on Pascal A
While we're here, add a comment about why we have this restriction in
the first place since NVK and the proprietary driver are different here.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34281>
(cherry picked from commit 59b01dc764)
2025-04-02 11:04:13 +02:00
Faith Ekstrand
9dd82a2e74 nvk: Use max_image_dimension for maxFramebufferWidth/Height
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34281>
(cherry picked from commit 65d06d91ca)
2025-04-02 11:04:13 +02:00
Faith Ekstrand
79f960cb1d vulkan/wsi: Signal buffer memory object when blitting
When we're using the PRIME path and using vkCmdCopyImageToBuffer to copy
to a linear image, the buffer memory is what's shared with the window
system.  For legacy drivers that depend on memory signaling via
wsi_memory_signal_submit_info, we need to tell the driver to signal the
buffer memory, not the image memory or else the window system may wait
on a driver-internal buffer and not wait for the copy to complete.

Cc: mesa-stable
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34218>
(cherry picked from commit cf23ffcbae)
2025-04-02 11:04:13 +02:00
Natalie Vock
9fb56e2780 vulkan/bvh: Move first PLOC task_count fetch inside PHASE
Otherwise, the memory fetch is not protected by the global sync and
memory barriers and there is a chance to read a stale (or just wrong)
task count.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34178>
(cherry picked from commit 8b0271050a)
2025-04-02 11:04:13 +02:00
Natalie Vock
0c3b74d562 radv/rt: Flush CP writes from the common BVH framework with INV_L2 on GFX12
a1b05991 ("radv/rt: Flush L2 after writing internal node offset on GFX12")
did this for radv-internal CP writes - we also need to do this for PLOC
sync data initialization which is done in the common framework.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34178>
(cherry picked from commit c1e1d86bd1)
2025-04-02 11:04:13 +02:00
David Rosca
e790a1caa0 frontends/va: Don't ignore rotation and mirror for conversions to RGB
Cc: mesa-stable
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34140>
(cherry picked from commit 51292976fe)
2025-04-02 11:04:12 +02:00
David Rosca
b6f9ccf0e8 gallium/vl: Fix mirror with rotation for compute shaders
The mirror needs to be reversed because the rotation is applied
before the mirroring.

VAAPI docs:
  Mirroring of an image can be performed either along the
  horizontal or vertical axis. It is assumed that the rotation
  operation is always performed before the mirroring operation.
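
Why the order matters can be seen on a tiny grid. The following is a minimal Python sketch, not code from gallium/vl or VAAPI; the helper names (rotate_90_cw, mirror_horizontal, mirror_vertical) are made up for illustration.

```python
def rotate_90_cw(img):
    """Rotate a row-major 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def mirror_horizontal(img):
    """Flip every row left-to-right."""
    return [row[::-1] for row in img]

def mirror_vertical(img):
    """Flip the rows top-to-bottom."""
    return img[::-1]

img = [[1, 2],
       [3, 4]]

# VAAPI order: rotation first, mirroring second.
rotate_then_mirror = mirror_horizontal(rotate_90_cw(img))
# The opposite order gives a different picture.
mirror_then_rotate = rotate_90_cw(mirror_horizontal(img))

print(rotate_then_mirror)
print(mirror_then_rotate)
```

For a 90 degree rotation in this sketch, mirroring horizontally before the rotation gives the same result as mirroring vertically after it, which is one way to picture why a mirror direction has to be swapped when the operations are applied in the other order.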

Cc: mesa-stable
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34140>
(cherry picked from commit 962c33cbca)
2025-04-02 11:04:12 +02:00
David Rosca
d56be9adbd gallium/vl: Fix rotation with scaling for compute shaders
Cc: mesa-stable
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34140>
(cherry picked from commit c8a2f0b248)
2025-04-02 11:04:12 +02:00
Robert Mader
09da8e124f llvmpipe: Free dummy_dmabuf on shutdown
In order to stop ASAN from complaining.

Fixes: d21aa86b54 ("llvmpipe: Implement EGL_ANDROID_native_fence_sync")
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34258>
(cherry picked from commit 2034c901cc)
2025-04-02 11:04:12 +02:00
David Rosca
e09a2e808f radeonsi/vce: Support old VCE firmware
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12866
Fixes: 104f9c6654 ("radeonsi/vce: Remove support for FW 50 and older")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34152>
(cherry picked from commit a2b4617c00)
2025-04-02 11:04:12 +02:00
Connor Abbott
e9b6cf708b tu: Fix reported FDM fragment size with multiview
We were never setting has_multiview. It's not actually necessary anyway,
since we can just do the optimization we were trying to do whenever
num_views is 1 instead.

This doesn't affect the actual fragment size, which was already correct,
only gl_FragSizeEXT.

Fixes: 6f2be52487 ("tu, ir3: Handle FDM shader builtins")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33991>
(cherry picked from commit 8864ee7b0f)
2025-04-02 11:04:12 +02:00
Connor Abbott
0df2cf3ae4 tu: Fix size of frag_size_ir3 and frag_offset_ir3 driver params
They are an array, so we have to reserve extra space for extra views.
This bug was being masked by the bug fixed in the next commit.

Fixes: 76e417ca59 ("turnip,ir3/a750: Implement consts loading via preamble")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33991>
(cherry picked from commit 122f2c422a)
2025-04-02 11:04:12 +02:00
Connor Abbott
40babe1efb tu: Fix GMEM offset for multisample layered separate stencil
Fixes a bug uncovered by CTS when enabling GMEM with layered rendering.

Fixes: def56b531c ("tu: Support GMEM with layered rendering and multiview")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34082>
(cherry picked from commit 6cadc1baea)
2025-04-02 11:04:12 +02:00
Faith Ekstrand
f2fa2ea466 nvk: Use the right sample mask for 8x/4pass on Maxwell A
Fixes: 48898c47bf ("nvk: Rework setup of sample masks")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34127>
(cherry picked from commit cbf87e82e8)
2025-04-02 11:04:12 +02:00
Faith Ekstrand
3e475be117 nouveau/mme/fermi: Don't allow STATE and EMIT on the same op
Fixes: 162269f049 ("nouveau/mme: Add Fermi builder")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34127>
(cherry picked from commit 3354c24169)
2025-04-02 11:04:12 +02:00
Faith Ekstrand
c0de23e92e nvk: Fix a Volta check
Fixes: e162c2e78e ("nvk: Use VM_BIND for contiguous heaps instead of copying")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34127>
(cherry picked from commit 79294fb95a)
2025-04-02 11:04:12 +02:00
Faith Ekstrand
c84a792c96 nvk: Free owned_gart_mem correctly
Fixes: fbe171638e ("nvk: add gart forced cmd pool side buffer.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34127>
(cherry picked from commit 90b2137ac5)
2025-04-02 11:04:12 +02:00
Robert Mader
558a7d92d5 llvmpipe: Take offset into account when importing dmabufs
Which is necessary for many common YCbCr formats.
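
As a rough illustration (a Python sketch under assumed names, not the llvmpipe import path), a multi-planar buffer can keep every plane in one mapping, so each plane has to be read starting at its own offset. The 4x2 layout below is hypothetical.

```python
def plane_bytes(mapping, offset, stride, rows):
    """Return the bytes of one plane, starting `offset` bytes into the mapping."""
    return bytes(mapping[offset:offset + stride * rows])

# Hypothetical 4x2 NV12-style buffer: 8 luma bytes, then 4 interleaved chroma bytes.
buf = bytes(range(12))

luma = plane_bytes(buf, offset=0, stride=4, rows=2)
chroma = plane_bytes(buf, offset=8, stride=4, rows=1)

# Ignoring the offset would read luma data where chroma was expected.
chroma_without_offset = plane_bytes(buf, offset=0, stride=4, rows=1)
```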

Fixes: d74ea2c117 (llvmpipe: Implement dmabuf handling)
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34240>
(cherry picked from commit 05e7ac6551)
2025-04-02 11:04:12 +02:00
Faith Ekstrand
4aad059de8 nak: Fix a SM check for OpPCnt
This doesn't really fix anything as we don't have any nir_loops on
Volta+ but the code was wrong so we should fix it.

Fixes: 9bbc692064 ("nak/nir: Rework CRS handling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34201>
(cherry picked from commit af9d65e8b8)
2025-04-02 11:04:12 +02:00
Faith Ekstrand
829c728e89 nak: Always copy sources when handling vec/pack/mov ops
It's possible that the source is uniform but the destination is not.  In
this case, we need to insert a copy or else we might accidentally
propagate a uniform into some place we don't expect it.

This fixes a bunch of fp64 KHR-Single-GL46.subgroups.arithmetic.* tests.

Fixes: d09d3f5246 ("nak/from_nir: Emit uniform instructions when !divergent")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34201>
(cherry picked from commit 1d1d79bbaa)
2025-04-02 11:04:12 +02:00
Faith Ekstrand
1fa4455b6e nak: Insert the annotation in the right spot in assign_regs
Fixes: efc4ac0d27 ("nak/sm50: sprinkle OpAnnotate in optimization passes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34201>
(cherry picked from commit 98677294b9)
2025-04-02 11:04:12 +02:00
irql-notlessorequal
ed75778536 hasvk: Fix non-functioning version override.
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27717 accidentally removed the instance check for the drirc option "hasvk_report_vk_1_3_version", rendering it useless.

Re-add the check and expose Vulkan 1.3 if the user asks.

Fixes: 2d575034f2 ("hasvk: switch to use runtime physical device properties infrastructure")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34232>
(cherry picked from commit c0c562cf6e)
2025-04-02 11:04:12 +02:00
Lionel Landwerlin
ba58320a6a anv: limit implicit write with drirc
9f32e1a489 meant to amend 1e80a426c2.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9f32e1a489 ("anv/drirc: Add option to control implicit sync on external BOs")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12629
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33587>
(cherry picked from commit a88c9ea192)
2025-04-02 11:04:12 +02:00
Lionel Landwerlin
5ecc1fb189 brw: always write the VUE header
In 35df3925ca ("brw: ensure VUE header writes in HS/DS/GS stages") I
misread the PRMs and thought that the VF would initialize the header.

What actually happens is that the VF does not write valid values in
there and the PRMs explicitly say that the VS shader should overwrite
whatever is in there.

We could avoid writing the header in some cases when no HW is going to
read back the header. For example with rendering disabled through
3DSTATE_STREAMOUT::RenderingDisable. But those cases are dynamic and
the compiler is not able to tell. So just always write the header.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 35df3925ca ("brw: ensure VUE header writes in HS/DS/GS stages")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12880
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34211>
(cherry picked from commit 4db4bd1d04)
2025-04-02 11:04:11 +02:00
Paulo Zanoni
9acee7d46b drirc/anv: DiggingGame.exe needs force_vk_vendor=-1
Otherwise, it fails with a message:

  "Assertion failed: IsValidIndex(Index)
   [File:D:\\build\\++UE5\\Sync\\Engine\\Source\\Runtime\\Core\\Public\\Containers\\UnrealString.h]
   [Line: 218] \nString index out of bounds: Index 0 from a string with
   a length of 0"

Thanks to the ProtonDB community for having figured this out and
documented it for us.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12695
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34103>
(cherry picked from commit e72ad49622)
2025-04-02 11:04:11 +02:00
Samuel Pitoiset
88c7326a61 radv/meta: fix color<->depth/stencil image copies
The color format needs to be compatible with depth or stencil. Also
the depth/stencil format was incorrect when it's the source.

Fixes dEQP-VK.api.ds_color_copy.*
and VKD3D_TEST_FILTER=test_copy_texture.

Fixes: d4ff011b12 ("radv: advertise VK_KHR_maintenance8")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34142>
(cherry picked from commit 2c3b9312cc)
2025-04-02 11:04:11 +02:00
Hyunjun Ko
167bcee3b7 vulkan/video: Do byte-alignment when building a h264 slice header
Fixes: ff8de6190 ("vulkan/video: adds a bitstream writer of h264 slice header")
Closes: mesa/mesa#12835

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34094>
(cherry picked from commit c22a635938)
2025-04-02 11:04:11 +02:00
Samuel Pitoiset
6ad9455f43 radv: fix compressed depth/stencil copies on transfer queue
HTILE is always pipe aligned.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
(cherry picked from commit 114fbdc534)
2025-04-02 11:04:11 +02:00
Samuel Pitoiset
7f32247d95 radv: fix bpe for the stencil aspect of depth/stencil copies on transfer queue
Using the bpe of depth+stencil when copying the stencil aspect only
doesn't work.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
(cherry picked from commit 7b15e85b95)
2025-04-02 11:04:11 +02:00
Rhys Perry
b3991dd8fc aco/ra: fix free register counting when moving variables
info.bounds might be smaller than the bounds available for the moved
variables.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 626aa7b648 ("aco: workaround GFX9 hardware bug for D16 image instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34158>
(cherry picked from commit 80fef30531)
2025-04-02 11:04:11 +02:00
Lionel Landwerlin
fad1d950f9 anv: disable replication when we don't have both VS/FS stages
Enabling this with shaders compiled separately through pipeline
libraries fails because we currently only enable it for VS and the
associated FS stage ends up with a non compatible VUE map.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34173>
(cherry picked from commit 25a695552a)
2025-04-02 11:04:11 +02:00
Lionel Landwerlin
c89250cd9f anv: fix end of pipe timestamp query writes
The code currently tries to use PIPE_CONTROL on blitter/video engines.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12833
Acked-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34095>
(cherry picked from commit 6b6a4cb1e2)
2025-04-02 11:04:11 +02:00
Samuel Pitoiset
7e55a04643 radv: fix creating pipeline binary from the traversal shader
rt_stage_info is NULL.

Fixes: 8802612458 ("radv: advertise VK_KHR_pipeline_binary")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34141>
(cherry picked from commit 29b3d9f0f4)
2025-04-02 11:04:11 +02:00
Job Noorman
cabc8c606f ir3/legalize: take wrmask into account for delay updates
When updating delays, we'd update all dst regs based on reg_elems.
However, when wrmask has gaps, this would update delays for regs that
aren't actually written. Fix this by skipping regs for which the
corresponding wrmask bit is zero.

Note that this wasn't just a performance issue but could result in
illegal code because the delay is reset to zero for tex/sfu
instructions. For example, the following (post-legalization) code was
observed in the wild:

(rpt1)add.f r1.w, (r)r2.w, (r)c3.z
sam.base0 (f32)(w)r2.x, r3.y, s#0, t#1
rcp r2.x, r2.x

Here, the add would result in a required delay for r2.x which would then
be cleared by the sam (even though it doesn't write to it), resulting in
insufficient delay before the rcp.
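
The fix can be pictured with a small Python sketch. This is not the ir3 legalize code: the flat register numbering (4 components per register), the helper names and the placeholder delay value are all assumptions made for illustration.

```python
COMPONENTS = "xyzw"

def reg_index(name):
    """Map a name like 'r2.x' to a flat index, assuming 4 components per register."""
    reg, comp = name[1:].split(".")
    return int(reg) * 4 + COMPONENTS.index(comp)

def update_delays(delays, dst, reg_elems, wrmask, new_delay):
    """Set the delay only for components whose wrmask bit is set."""
    base = reg_index(dst)
    for i in range(reg_elems):
        if not wrmask & (1 << i):
            continue  # component not written: keep its pending delay
        delays[base + i] = new_delay

# The add leaves a pending delay on r2.x (3 is an arbitrary placeholder).
delays = {reg_index("r2.x"): 3}

# sam with (w): dst starts at r2.x, spans 4 elements, only bit 3 (w) is set,
# and the delay for written components is reset to zero.
update_delays(delays, "r2.x", reg_elems=4, wrmask=0b1000, new_delay=0)
```

With the mask check in place, the pending delay on r2.x survives the sam, so the later rcp still sees it.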

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 61b2bd861f ("ir3: Rewrite nop insertion")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34107>
(cherry picked from commit 84dbd34332)
2025-04-02 11:04:11 +02:00
Timothy Arceri
de5de2ec60 nir: fix uniform cloning helper
glsl allows for ubos to have the same name but different bindings.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Fixes: b47b8d16d9 ("nir: expose reusable linking helpers for cloning uniform loads")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12852
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34138>
(cherry picked from commit 2b2132d2ac)
2025-04-02 11:04:11 +02:00
Timothy Arceri
20e09c3081 mesa: fix potential race condition with Programs
The call looks up a Program and creates it if it doesn't
already exist. However we weren't locking the hash between looking
up the name and adding it to the hash so it could be possible
another thread also generated the same name.
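
The pattern behind this fix, holding one lock across both the lookup and the insert, can be sketched in Python as follows. NameTable and lookup_or_create are hypothetical names, not Mesa's hash table API.

```python
import threading

class NameTable:
    """Toy name-to-object table with an atomic lookup-or-create."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = {}

    def lookup_or_create(self, name, factory):
        # The lock covers the lookup AND the insert, so two threads
        # can never both decide that the name is missing.
        with self._lock:
            obj = self._objects.get(name)
            if obj is None:
                obj = factory(name)
                self._objects[name] = obj
            return obj
```

If the lock were dropped between the lookup and the insert, two threads could both miss the name and both create an object for it.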

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: 842c91300f ("mesa: enable GL name reuse by default for all drivers except virgl")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit 786b8b2d34)
2025-04-02 11:04:11 +02:00
Timothy Arceri
04df661b13 mesa: fix potential race condition with ATIShaders
The call looks up an ATIShader and creates it if it doesn't
already exist. However we weren't locking the hash between looking
up the name and adding it to the hash so it could be possible
another thread also generated the same name.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: 842c91300f ("mesa: enable GL name reuse by default for all drivers except virgl")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit 4c1e4d7b49)
2025-04-02 11:04:11 +02:00
Timothy Arceri
f150c50170 mesa: fix potential race condition with RenderBuffers
The calls look up a renderbuffer and create it if it doesn't
already exist. However they weren't locking the hash between looking
up the name and adding it to the hash so it could be possible
another thread also generated the same name.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: 842c91300f ("mesa: enable GL name reuse by default for all drivers except virgl")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit 0e61d31e9d)
2025-04-02 11:04:11 +02:00
Timothy Arceri
835a355369 mesa: fix potential race conditions with FrameBuffers
The calls look up a framebuffer and create it if it doesn't
already exist. However they weren't locking the hash between looking
up the name and adding it to the hash so it could be possible
another thread also generated the same name.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: 842c91300f ("mesa: enable GL name reuse by default for all drivers except virgl")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit c4ee84f3b6)
2025-04-02 11:04:11 +02:00
Timothy Arceri
4a158b971c mesa: fix reuse of deleted sampler object
Deleting a sampler object will only cause it to be unbound from the
current context. To avoid reusing something that is still bound in
another context we need to check the DeletePending flag first.
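
A minimal Python sketch of the idea, assuming the check is made when an object is looked up by name. GLObject and lookup_or_create are hypothetical and do not mirror Mesa's actual structures.

```python
class GLObject:
    """Toy stand-in for a named GL object."""

    def __init__(self, name, delete_pending=False):
        self.name = name
        self.delete_pending = delete_pending

def lookup_or_create(table, name):
    """Return the live object for `name`, never one that is pending deletion."""
    obj = table.get(name)
    if obj is None or obj.delete_pending:
        # The old object may still be bound in another context,
        # so hand out a fresh one instead of reviving it.
        obj = GLObject(name)
        table[name] = obj
    return obj
```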

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: 842c9130 ("mesa: enable GL name reuse by default for all drivers except virgl")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit 9bb696588d)
2025-04-02 11:04:11 +02:00
Timothy Arceri
d44e9736d1 mesa: fix potential race condition with TexObjects
The calls look up a texture object and create it if it doesn't
already exist. However they weren't locking the hash between looking
up the name and adding it to the hash so it could be possible
another thread also generated the same name.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: 842c9130 ("mesa: enable GL name reuse by default for all drivers except virgl")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit 95e87f6a6a)
2025-04-02 11:04:11 +02:00
Timothy Arceri
0c5a31f597 mesa: fix reuse of deleted texture object
Deleting a texture object will only cause it to be unbound from the
current context. To avoid reusing something that is still bound in
another context we need to check the DeletePending flag first.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: 842c91300f ("mesa: enable GL name reuse by default for all drivers except virgl")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12710
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12722
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12830
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit 9b85142e40)
2025-04-02 11:04:10 +02:00
Timothy Arceri
ce22e438e6 mesa: fix reuse of deleted buffer object
Deleting a buffer object will only cause it to be unbound from the
current context. To avoid reusing something that is still bound in
another context we need to check the DeletePending flag first.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12810
Fixes: 842c91300f ("mesa: enable GL name reuse by default for all drivers except virgl")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34091>
(cherry picked from commit 0f0834275d)
2025-04-02 11:04:10 +02:00
Caio Oliveira
756b10a89a brw: Fix decoding of 3-src destination stride in EU validation
Fixes: f1036da345 ("intel/brw: Add vstride/width/hstride to brw_hw_decoded_inst")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>
(cherry picked from commit 676b874ca9)
2025-04-02 11:04:10 +02:00
Yiwei Zhang
66b5ae35ec panvk: fix memory requirement query for aliased disjoint image
The spec allows creating an aliased disjoint image for a specific plane of
a multi-planar image, and the format can be R8. When querying the memory
requirements of such an image, VkImagePlaneMemoryRequirementsInfo is not
required to be chained even though the image has the disjoint bit.

This change looks for aspect info in the plane memory info only when it
is chained. The implementation can be passive here because the spec VUs
give sufficient guarantees about validity. See the VUs below for
details:
- VUID-VkImageMemoryRequirementsInfo2-image-01589
- VUID-VkImageMemoryRequirementsInfo2-image-01590
- VUID-VkImageMemoryRequirementsInfo2-image-02279
- VUID-VkImageMemoryRequirementsInfo2-image-02280
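
The "only when chained" logic can be sketched as a walk over a pNext-style chain. This is a Python illustration, not panvk code: the Struct class, the string type tags and the default aspect are assumptions.

```python
class Struct:
    """Toy stand-in for a Vulkan structure with sType and pNext."""

    def __init__(self, s_type, p_next=None, plane_aspect=None):
        self.s_type = s_type
        self.p_next = p_next
        self.plane_aspect = plane_aspect

def find_in_chain(head, s_type):
    """Return the first chained struct with the given type, or None."""
    node = head
    while node is not None:
        if node.s_type == s_type:
            return node
        node = node.p_next
    return None

def select_aspect(info, default_aspect="COLOR"):
    """Use the plane aspect only if the plane info struct is actually chained."""
    plane = find_in_chain(info.p_next, "IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO")
    return plane.plane_aspect if plane is not None else default_aspect
```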

Meanwhile, the existing disjoint check for size info is kept as is for
the special handling of VK_FORMAT_D32_SFLOAT_S8_UINT.

Test: dEQP-VK.ycbcr.plane_view.memory_alias.* pass with venus-on-panvk

Fixes: 412c286331 ("panvk: Enable multiplane images and image views")
Reviewed-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34134>
(cherry picked from commit 5dcb9f918d)
2025-04-02 11:04:10 +02:00
Yiwei Zhang
0a6dcd915e panvk/csf: rework cache flush reduction
Per Vulkan spec 7.9. Host Write Ordering Guarantees, queue submission
commands automatically perform a domain operation from host to device
for all writes performed before the command executes. That is to say,
host updates to the mappings can occur after the end of the command
recording and must be flushed implicitly at submission boundary.

Before this change, necessary cache flushes could be missed once the
app starts reusing pre-recorded command buffers. e.g. a simple buffer
copy cmd while the app only updates the source buffer mapping in
different submissions. This change backs out most of the current
version of cache flush reduction while still assigning LATEST_FLUSH_ID
to at least the final batch itself. This aligns with panfrost_batch
submit behavior on the gallium side.

Test: dEQP-VK.synchronization*.timeline_semaphore.* pass w/o flakiness
      via venus-on-panvk

Fixes: 28e4d22497 ("panvk/csf: Pass a non-zero flush-id to benefit from cache flush reduction")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34093>
(cherry picked from commit 98a5acf352)
2025-04-02 11:04:10 +02:00
Yiwei Zhang
c432dfe79c venus: fix maint4 multi-planar memory requirements
Fixes: ce1bbd241e ("venus: extend image cache to vkGetDeviceImageMemoryRequirements")
Acked-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34135>
(cherry picked from commit adcb967c5c)
2025-04-02 11:04:10 +02:00
Yiwei Zhang
54d829491d venus: fix ahb usage caching
Test: dEQP-VK.api.external.memory.android_hardware_buffer.*

Fixes: fde5cebec5 ("venus: fix image format cache miss with AHB usage query")
Acked-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34135>
(cherry picked from commit ea6dc035d8)
2025-04-02 11:04:10 +02:00
Yiwei Zhang
f8f43523e0 venus: fix unexpected ring alive status expire upon owner thread switch
If the last owner thread has just unset the alive status and released
the watchdog, the new owner thread could acquire it and abort
unexpectedly if the ownership transfer occurs right before the next
owner's warn order. So we must set the watchdog alive for the new owner so
that it can properly check the ring alive status in the next warn order.

Cc: mesa-stable
Acked-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34135>
(cherry picked from commit 8b2703fe08)
2025-04-02 11:04:10 +02:00
Yiwei Zhang
59c3485022 docs: demote VK_KHR_shader_relaxed_extended_instruction
It's not part of core 1.3.

Fixes: 8b272c8d8c ("docs: update feature matrix for VK_KHR_shader_relaxed_extended_instruction")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34104>
(cherry picked from commit d2a7c1c452)
2025-04-02 11:04:10 +02:00
Eric R. Smith
2684f37146 panfrost: consider xfb shader when calculating thread local storage size
Register spilling can cause us to require thread local storage (tls).
However, we were not adjusting the tls stack size space to account for
the tls needed for the extra xfb shader when transform feedback is
needed. We noticed this when testing register allocation in the
OpenGL CTS (for testing we had forced spilling where none happened
before).
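
A rough Python sketch of the sizing rule, assuming the stack must be large enough for the largest per-shader requirement. required_tls_size and its parameters are hypothetical names, not panfrost functions.

```python
def required_tls_size(shader_tls_sizes, xfb_tls_size=None):
    """Largest TLS requirement across the bound shaders and the xfb variant."""
    sizes = list(shader_tls_sizes)
    if xfb_tls_size is not None:
        # The extra transform feedback shader may spill too.
        sizes.append(xfb_tls_size)
    return max(sizes, default=0)
```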

Cc: mesa-stable
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33935>
(cherry picked from commit 2ee3bef252)
2025-04-02 11:04:10 +02:00
Tomeu Vizoso
c0bc957c5d kopper: Explicitly choose zink
If we pass zink=false to pipe_loader_drm_probe_fd, it could happen that
a Gallium driver that had been already discarded because of not
supporting the graphics CAP will be chosen.

To avoid that, explicitly ask pipe_loader_drm_probe_fd to choose the
zink Gallium driver.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30096>
(cherry picked from commit 854bc2ee05)
2025-04-02 11:04:10 +02:00
Lucas Stach
b075d80fce kmsro: look for graphics capable screen as renderonly device
Exposing a rendernode from a supported driver is not a sufficient
matching criterion to qualify as the render part of a renderonly
device, as the rendernode might only expose compute or 2D accel
capabilities.

Look for a screen that actually supports gallium graphics operations
to qualify as a renderonly screen.

v2 (Tomeu): Have pipe-loader return a list of FDs for kmsro to choose
            based on capabilities.
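
The selection step can be pictured as a filter over candidate devices. This is a Python sketch with made-up data, not the pipe-loader or kmsro API; the dictionary keys are assumptions.

```python
def pick_render_device(candidates):
    """Return the first candidate that supports graphics, or None."""
    for dev in candidates:
        if dev["graphics"]:
            return dev
    return None

# Hypothetical candidates: a compute-only device comes first in the list.
candidates = [
    {"name": "npu", "graphics": False},
    {"name": "gpu", "graphics": True},
]
chosen = pick_render_device(candidates)
```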

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30096>
(cherry picked from commit 7e76c67632)
2025-04-02 11:04:10 +02:00
Tomeu Vizoso
e300382920 egl/surfaceless: Only choose drivers that expose the graphics capability
This prevents applications from trying to render to devices that have no
3D hardware (eg. NPUs).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30096>
(cherry picked from commit cfad6fb037)
2025-04-02 11:04:10 +02:00
Jordan Justen
5db78fe09e intel/dev: Add BMG PCI IDs (0xe210, 0xe215, 0xe216)
Backport-to: 24.3
Backport-to: 25.0
Ref: https://patchwork.freedesktop.org/patch/msgid/20250128162015.3288675-1-shekhar.chauhan@intel.com
Ref: bspec 68090
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33335>
(cherry picked from commit 0e648a238e)
2025-04-02 11:04:10 +02:00
Eric Engestrom
b9e649b31b .pick_status.json: Update to b60d816d6e 2025-04-02 11:04:10 +02:00
Daniel Schürmann
e159e0000c aco: don't assume that demote doesn't cause an empty exec mask
Totals from 188 (0.24% of 79377) affected shaders: (Navi31)
Instrs: 209239 -> 209473 (+0.11%); split: -0.01%, +0.12%
CodeSize: 1101124 -> 1101744 (+0.06%); split: -0.02%, +0.07%
Latency: 1672182 -> 1672748 (+0.03%); split: -0.11%, +0.14%
InvThroughput: 237276 -> 237546 (+0.11%); split: -0.00%, +0.12%
SClause: 5694 -> 5690 (-0.07%); split: -0.28%, +0.21%
Copies: 21685 -> 21682 (-0.01%); split: -0.12%, +0.10%
Branches: 5740 -> 5863 (+2.14%)
PreSGPRs: 7004 -> 7034 (+0.43%)
VALU: 123595 -> 123641 (+0.04%); split: -0.00%, +0.04%
SALU: 28418 -> 28411 (-0.02%); split: -0.09%, +0.06%
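
The net percentages in these totals follow from the before and after counts. A small Python helper (percent_change is an illustrative name) shows the arithmetic for a few of the rows above.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

rows = {
    "Instrs": (209239, 209473),
    "Branches": (5740, 5863),
    "PreSGPRs": (7004, 7034),
}

for stat, (before, after) in rows.items():
    print(f"{stat}: {before} -> {after} ({percent_change(before, after):+.2f}%)")
```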

Fixes: f35e229fae ('aco: skip code if exec is empty')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33619>
(cherry picked from commit 69dcd5be3a)
2025-04-02 11:04:10 +02:00
Eric Engestrom
69d6923cdb [25.0 only] update ci expectations
These changes happened with no mesa code change, only infrastructure
changes, which is really weird, but to be able to move on, let's simply
document the "new normal".
2025-04-02 11:04:10 +02:00
Daniel Stone
0c6e647769 ci: Re-enable trace jobs with updated Piglit
mesa/piglit!996 fixed up Piglit to allow us to do trace downloads again,
so we can now bring these jobs back. The fdno trace jobs hosted at
Google are still disabled whilst we try to fix their nginx.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34245>
(cherry picked from commit f6f085f50a)
2025-03-29 20:45:02 +01:00
Eric Engestrom
caf97cb688 .pick_status.json: Update to e3433489f8 2025-03-29 20:45:02 +01:00
Eric Engestrom
a423142482 pick-ui: fix parsing of multiple backport-to: lines
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34117>
(cherry picked from commit e7b2eda39d)
2025-03-29 20:45:02 +01:00
Eric Engestrom
6b31b441f9 ci: run shader-db & zink-lvp on kvm runners
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34120>
(cherry picked from commit 6cd7b65ac0)
2025-03-29 20:45:02 +01:00
Valentine Burley
58e28017cf ci: Add missing kvm runner tags
A recent change now requires the kvm runner tag to be explicitly listed
for jobs that need to run on runners with KVM capability.
This ensures the jobs are scheduled on compatible runners.

Cc: mesa-stable

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34120>
(cherry picked from commit a36379d973)
2025-03-29 20:45:02 +01:00
Eric Engestrom
666e00cfb8 ci: replace broken s3cp command with a simple curl call
The current `s3cp` implementation does not work anymore after the
migration, and instead of fixing it and propagating the fix down to us,
it's simpler to directly use `curl`.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34120>
(cherry picked from commit 7178425ccf)
2025-03-29 20:45:02 +01:00
Eric Engestrom
2ed27e069e ci: always abort if the curl download fails
Reported-by: @Valentine
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34120>
(cherry picked from commit d425847793)
2025-03-29 20:45:02 +01:00
Eric Engestrom
bcaae89905 ci/piglit: drop usage of s3cp for a simple download
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34120>
(cherry picked from commit 213550d2e0)
2025-03-29 20:45:01 +01:00
Eric Engestrom
43d5f3ca29 .pick_status.json: Update to 85983e060c 2025-03-29 20:45:01 +01:00
Eric Engestrom
a801a4aab6 docs: add sha sum for 25.0.2 2025-03-20 15:00:31 +01:00
Eric Engestrom
06631a8876 VERSION: bump for 25.0.2 2025-03-20 14:32:27 +01:00
Eric Engestrom
0c1ea399a2 docs: add release notes for 25.0.2 2025-03-20 14:32:27 +01:00
Aaron Ruby
63ec8e94fc gfxstream: Downgrade log severity when enabling params in LinuxVirtGpu
Reviewed-By: Gurchetan Singh <gurchetansingh@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33944>
(cherry picked from commit 9447de5dc4)
2025-03-20 14:26:05 +01:00
Bas Nieuwenhuizen
9d85e7eda9 radv: Move support check out of winsys.
To get the right error code. Mostly shouldn't be winsys dependent
anyway, outside of the idea that if we explicitly emulate a device
we should just assume the user knows what they're doing.

Fixes: c942d957b0 ("radv: fail to initialize when the AMD GPU generation is unsupported")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12792
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33964>
(cherry picked from commit 61feea6954)
2025-03-15 09:49:05 +01:00
Ganesh Belgur Ramachandra
63e1e2c926 amd: use 128B compression for scanout images when drm.minor <63
Fixes: 8328e575 ("ac/surface/gfx12: enable DCC 256B compressed blocks and reorder modifiers")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33702>
(cherry picked from commit ba80a11b69)
2025-03-15 09:49:05 +01:00
Mike Blumenkrantz
43cab94575 zink: fix refcounting of zink_surface objects
this was previously a no-op because the pointers were identical,
leading to an extra unref in check_framebuffer_surface_mutable()

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34077>
(cherry picked from commit f5c66e2d4a)
2025-03-15 09:49:05 +01:00
Karol Herbst
54ee9cb342 nir/serialize: fix decoding of is_return and is_uniform
Fixes: 3321a56d1d ("nir: Serialize all parameter attributes")
Fixes: 26cbb6b933 ("nir: Add parameter divergence info")

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34052>
(cherry picked from commit 3a9954c117)
2025-03-15 09:49:05 +01:00
Georg Lehmann
d00144c8f0 aco/ra: disallow vcc definitions for pseudo scalar trans instrs
Foz-DB GFX1201:
Totals from 30 (0.04% of 79600) affected shaders:
Instrs: 58843 -> 58820 (-0.04%); split: -0.10%, +0.06%
CodeSize: 302228 -> 301944 (-0.09%); split: -0.13%, +0.04%
Latency: 204566 -> 204432 (-0.07%); split: -0.09%, +0.02%
InvThroughput: 136918 -> 136919 (+0.00%); split: -0.00%, +0.00%
SClause: 1241 -> 1249 (+0.64%); split: -0.56%, +1.21%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34006>
(cherry picked from commit d1dca26941)
2025-03-15 09:49:05 +01:00
Samuel Pitoiset
0c0f4de8ad radv: emit a dummy PS state for noop FS on GFX12
It seems the hardware requires a dummy PS state with a noop FS,
otherwise it might just hang. This used to work just fine on older
gens.

Note that RadeonSI refuses to draw if VS or PS is missing and AMDVLK
seems to also always emit this state. So, this might be a bug that AMD
didn't encounter at all.

This fixes a GPU hang during loading with Ghostwire: Tokyo.

Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34070>
(cherry picked from commit 1e4cfd9dfa)
2025-03-15 09:49:05 +01:00
Lucas Stach
e576d84f03 etnaviv: fix ETNA_MESA_DEBUG=no_early_z
This feature bit has inverted polarity from most other feature bits:
if the bit is present the driver should not use early Z. So the bit
must be set when the debug option to disable early Z is enabled.
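
The inverted polarity described above can be sketched as follows. This is a
minimal illustration, not the etnaviv code: FEATURE_NO_EARLY_Z,
DEBUG_NO_EARLY_Z and both helpers are hypothetical names chosen for the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit with inverted polarity: when SET, early Z must
 * not be used. */
#define FEATURE_NO_EARLY_Z (1u << 0)
/* Hypothetical debug flag standing in for ETNA_MESA_DEBUG=no_early_z. */
#define DEBUG_NO_EARLY_Z   (1u << 0)

/* Disabling early Z through the debug option means SETTING the feature bit,
 * not clearing it as one would for a normal "has feature X" bit. */
static uint32_t
apply_debug_overrides(uint32_t features, uint32_t debug)
{
   if (debug & DEBUG_NO_EARLY_Z)
      features |= FEATURE_NO_EARLY_Z;
   return features;
}

static bool
can_use_early_z(uint32_t features)
{
   return !(features & FEATURE_NO_EARLY_Z);
}
```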

Fixes: d600b45ccc ("etnaviv: Switch to etna_core APIs")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34064>
(cherry picked from commit 4608eef0a0)
2025-03-15 09:49:04 +01:00
Patrick Lerda
4d4568cb88 r600: update the software fp64 support
This change began by fixing an old regression related to the dceil
functionality. This issue affected palm. Now, this change adjusts
the software fp64 support to make it fully operational.

This change was tested on palm and barts. This change fixes 561
"piglit run all" tests. The khr_gl tests are fixed as well (243 tests).
Here is a summary:
spec/arb_gpu_shader_fp64/execution/built-in-functions/*
spec/arb_gpu_shader_fp64/execution/fs-isnan-dvec: fail pass
spec/arb_gpu_shader_fp64/execution/gs-isnan-dvec: fail pass
spec/arb_gpu_shader_fp64/execution/vs-isnan-dvec: fail pass
spec/glsl-4.00/execution/built-in-functions/*
spec/glsl-4.10/execution/conversion/*
khr-gl4[3-5]/compute_shader/fp64-case1: fail pass
khr-gl4[0-5]/gpu_shader_fp64/builtin/*

Fixes: aed6a39c10 ("glsl: Retire dround lowering.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33708>
(cherry picked from commit 186fb5e73a)
2025-03-15 09:49:04 +01:00
Lionel Landwerlin
56b954a37a brw: ensure VUE header writes in HS/DS/GS stages
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12820
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34041>
(cherry picked from commit 35df3925ca)
2025-03-15 09:49:04 +01:00
Lionel Landwerlin
267502f9f3 brw: fix spilling for Xe2+
The problem occurs with a series of instructions that build the subgroup
invocation value:

mov(8)          g23<1>UW        0x76543210V
add(8)          g23.8<1>UW      g23<8,8,1>UW    0x0008UW
add(16)         g23.16<1>UW     g23<16,16,1>UW  0x0010UW

Our register spilling code operates on physical registers (64B on
Xe2+) and using the brw_inst::is_partial_write() helper only considers
32B registers. So the spiller doesn't see that the add(16) instruction
is doing a partial write and ends up discarding the previous value.
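
To illustrate why the register size matters, here is a simplified stand-in
for a partial-write check. It is not the actual brw helper (which considers
more than sizes); writes_partial_reg() and its parameters are hypothetical.
It assumes the ".16" subregister of the add(16) destination is counted in UW
elements, i.e. a byte offset of 32.

```c
#include <stdbool.h>

/* A write is treated as partial when it does not start on a register
 * boundary or does not cover a whole number of registers. */
static bool
writes_partial_reg(unsigned exec_size, unsigned type_bytes,
                   unsigned dst_byte_offset, unsigned reg_bytes)
{
   unsigned bytes_written = exec_size * type_bytes;

   return (dst_byte_offset % reg_bytes) != 0 ||
          (bytes_written % reg_bytes) != 0;
}
```

With these assumptions, add(16) writes 16 * 2 = 32 bytes at byte offset 32:
a check that assumes 32B registers sees a full write, while a check using
the 64B physical register size sees a partial one.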

You can reproduce the issue by running a test like:

INTEL_DEBUG=spill_fs ./deqp-vk -n dEQP-VK.compute.pipeline.cooperative_matrix.khr_a.subgroupscope.constant.uint8_uint8.buffer.rowmajor.linear

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: aa494cbacf ("brw: align spilling offsets to physical register sizes")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33642>
(cherry picked from commit c60180ba63)
2025-03-15 09:49:04 +01:00
Matt Turner
c3f4bb2a7d glsl: Add missing break
Reported by clang's `-Wimplicit-fallthrough`.

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>
(cherry picked from commit 8d6deb4073)
2025-03-15 09:49:04 +01:00
Seán de Búrca
9bcdf5b859 rusticl/mem: don't create svm_pointers slice from null raw pointer
std::slice::from_raw_parts requires that the slice pointer be non-null,
even when the slice contains zero elements. Failing this invariant is
undefined behavior.

v2: reordered commits to allow cherry-picking bugfixes

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33989>
(cherry picked from commit 5e365f1674)
2025-03-15 09:49:04 +01:00
Lucas Stach
3d600b2c0e etnaviv: rs: fix slow/fast clear transitions
When a slow/fast/slow clear sequence is executed on a surface, the second
slow clear will not regenerate the clear command if the clear value of the
fast clear is the same as the one used for the second slow clear, as the
current stored surface clear value is the same as the new clear value.
The command generated on the first slow clear however may have used a
different clear value, which is now submitted unchanged to the hardware on
the second slow clear.

Fix this by only generating the clear command if there is no valid one
already. If we already have a valid clear command simply update the fill
value in that command with the new clear value. This has some marginal
overhead, but has been chosen over the alternative of adding more state by
remembering the last slow clear value.
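
The approach can be sketched like this. The structures and function names
are hypothetical and far simpler than the driver's; the sketch only shows
the "generate once, then patch the fill value" idea.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the cached clear command and the surface. */
struct clear_cmd {
   bool valid;
   uint32_t fill_value;
};

struct surface {
   uint32_t clear_value;
   struct clear_cmd cmd;
};

/* A fast clear only records the new clear value; the cached slow-clear
 * command is left untouched. */
static void
fast_clear(struct surface *s, uint32_t value)
{
   s->clear_value = value;
}

/* A slow clear generates the command only if none exists yet, and always
 * refreshes the fill value so a stale one is never submitted. */
static void
slow_clear(struct surface *s, uint32_t value)
{
   if (!s->cmd.valid)
      s->cmd.valid = true; /* stands in for generating the command */

   s->cmd.fill_value = value;
   s->clear_value = value;
}
```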

Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34029>
(cherry picked from commit fb0f9e6352)
2025-03-15 09:49:04 +01:00
Patrick Lerda
249613cd92 r600: fix cayman main non-deterministic behavior problem
Cayman has a non-deterministic behavior issue which is
visible with the test below (arb_shader_image_size).
The tests fail randomly at the "fragment" test category.
Anyway, if the "compute" category is removed, the same
tests are working flawlessly.

The "compute" part of the driver was interfering with the
graphic pipeline. The culprit is the packet PKT3_DEALLOC_STATE
which puts the gpu in an incorrect state to perform some
graphic operations.

This change fixes this problem by issuing a PKT3_CLEAR_STATE
packet just after the PKT3_SURFACE_SYNC packet. As explained
by d51dbe048a PKT3_DEALLOC_STATE is mandatory on cayman to
avoid a gpu hang at the PKT3_SURFACE_SYNC stage.

This correction makes tests like
"spec@glsl-4.30@execution@built-in-functions@cs-.*" to pass
in an utterly deterministic way without random failures.
This change removes around 500 random failures for a
"piglit run all".

For instance, this issue is triggered on cayman with
"piglit/bin/arb_shader_image_size-builtin -auto -fbo".

Fixes: d51dbe048a ("r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33973>
(cherry picked from commit 085cfc98cc)
2025-03-15 09:49:04 +01:00
David Rosca
94a92219d7 gallium/vl: Return YUV plane order for single plane formats
The order only matters for multi plane formats, but we still need to
return valid value for single plane formats.

Fixes crash reported here: https://github.com/mpv-player/mpv/issues/15992

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33938>
(cherry picked from commit 6f35d3768d)
2025-03-15 09:49:04 +01:00
Samuel Pitoiset
ca58bc9d8f aco: do not apply OMOD/CLAMP for pseudo scalar trans instrs
This optimization seems broken because eg. v_s_log_f32 uses SGPRs
for both the source and destination but applying OMOD seems to require
VGPRs.

This fixes a GPU hang when launching Enshrouded on GFX1201.

No fossils db changes on GFX1201.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34027>
(cherry picked from commit f46830912e)
2025-03-15 09:49:04 +01:00
Eric Engestrom
46d1ff0765 meson: announce that clover is deprecated (slated for removal)
See https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19385;
the timeline is not 100% decided yet, but let's warn users already.

Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34021>
(cherry picked from commit a0b457aca6)
2025-03-15 09:49:04 +01:00
Faith Ekstrand
83a18330f3 egl/kopper: Update the EGLSurface size after kopperSwapBuffers()
Otherwise, the size of the EGLSurface and the drawable may get out of
sync if kopper needs to re-create the swapchain at a different size.
This can cause problems with things like eglSetDamageRegionKHR() where
the core EGL code clamps them to the size in the EGLSurface.

With Wayland, it's up to the client to choose a size and resize by
creating a new EGLSurface with a different size.  Only on X11 can we
get a resize side-band like this.

Normally, without kopper, this goes the other direction where the X11
EGL code will detect a surface size change in dri2_x11_query_surface()
and it invalidates the drawable if they've changed, forcing
re-allocation.  Kopper, however, works more like the DRI2 path where we
just get handed buffers at some size decided by X11 and have to deal
with them.  In the DRI2 path, the size is unconditionally updated by
dri2_x11_get_buffers().  This is roughly equivalent, updating the size
right after every call to kopperSwapBuffers().

Fixes: 8ade5588e3 ("zink: add kopper api")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12797
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34015>
(cherry picked from commit ad90dbabe4)
2025-03-15 09:49:04 +01:00
Faith Ekstrand
18fc1a4aff egl/x11: Re-order an if statement
Switch on kopper first so it's easier to do other, common things on the
kopper path.

Fixes: 8ade5588e3 ("zink: add kopper api")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34015>
(cherry picked from commit dc8714c568)
2025-03-15 09:49:04 +01:00
Dave Airlie
9fb0403e27 radv/video: don't try and send events on UVD devices.
This should fix some hangs on polaris when decode is forced on.

Fixes: 95a980b61f ("radv/video: add event support for VCN4")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34013>
(cherry picked from commit 2e3b23539e)
2025-03-15 09:49:04 +01:00
John Anthony
8620d4a494 panvk: Avoid division by zero for vkCmdCopyQueryPoolResults
Stride can be zero if there are fewer than two queries to copy.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 7755c41b3e ("panvk/csf: Rework the occlusion query logic to avoid draw flushes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34020>
(cherry picked from commit 8a47ae456c)
2025-03-15 09:49:04 +01:00
Lionel Landwerlin
2d96b368cd anv: fix non page aligned descriptor bindings on <Gfx12.0
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ab7641b8dc ("anv: implement descriptor buffer binding")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33911>
(cherry picked from commit de2a65ade6)
2025-03-15 09:49:04 +01:00
Georg Lehmann
0be9c89310 aco/gfx11.5: remove vinterp ddx/ddy path
While the idea to take advantage of the higher throughput wasn't bad,
the hardware wasn't designed with this in mind and doesn't behave as expected
with constant sources.

Fixes: bee487df48 ("aco/gfx11.5+: use vinterp for fddx/fddy")
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33969>
(cherry picked from commit 3b5e537b09)
2025-03-15 09:49:03 +01:00
Samuel Pitoiset
22ba337921 radv: update conformance version
A lot of people (including me) misinterpreted the conformanceVersion
field for so long. The Vulkan spec wasn't very clear either but it's
going to be clarified soon.

VkConformanceVersion is actually unrelated to the official CTS
conformance process in Khronos. It just reports the latest CTS version
that the driver can pass, not more.

For GFX8+, RADV should be passing CTS 1.4.0.0 on all GPUs because we
validated this CTS version recently for Vulkan 1.4.

For GFX6-7, which only supports Vulkan 1.3, RADV should also be
passing CTS 1.4.0.0, because newer versions of the CTS can be used
to validate a driver against an older version of the spec, so
it's perfectly fine to report a higher CTS version than the Vulkan version.

Newer CTS versions likely can't pass 100% due to a DGC bug that I still
need to fix.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12799
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34018>
(cherry picked from commit e519e0b9e6)
2025-03-15 09:49:03 +01:00
Samuel Pitoiset
e5f0fd5626 radv/amdgpu: fix device deduplication
To correctly deduplicate devices inside the winsys, it should use the
fd or amdgpu_device_handle. Using the allocated ac_drm_device as key
is obviously broken.

Not deduplicating devices breaks memory budget and a bunch of games
were broken.
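
A minimal sketch of deduplication keyed by a stable identifier follows. The
table, struct winsys and winsys_get() are hypothetical and only illustrate
the keying problem: a wrapper object allocated on every open has a new
address each time, so using it as the key never matches an existing entry.

```c
#include <stddef.h>

#define MAX_DEVICES 8

/* Hypothetical per-device winsys entry, keyed by the device fd. */
struct winsys {
   int fd;
   unsigned refcount;
};

static struct winsys table[MAX_DEVICES];
static size_t table_count;

/* Return the existing entry for this fd, or create one. Returns NULL when
 * the fixed-size table is full. */
static struct winsys *
winsys_get(int fd)
{
   for (size_t i = 0; i < table_count; i++) {
      if (table[i].fd == fd) {
         table[i].refcount++;
         return &table[i];
      }
   }

   if (table_count == MAX_DEVICES)
      return NULL;

   struct winsys *ws = &table[table_count++];
   ws->fd = fd;
   ws->refcount = 1;
   return ws;
}
```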

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12686
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12775
Fixes: a565f2994f ("amd: move all uses of libdrm_amdgpu to ac_linux_drm")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34005>
(cherry picked from commit c627097841)
2025-03-15 09:49:03 +01:00
Sviatoslav Peleshko
22991d17a3 drirc: Apply assume_full_subgroups_with_shared_memory to Resident Evil 2
The game uses a compute shader for occlusion culling. This shader lacks
proper groupshared memory sync, and needs 32-wide subgroup to work
correctly.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7595
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23408>
(cherry picked from commit bd45b738b7)
2025-03-15 09:49:03 +01:00
Sviatoslav Peleshko
090dbbc995 anv: Add full subgroups workaround for the shaders that use shared memory
This workaround is similar to anv_assume_full_subgroups, but it applies
to the shaders that use shared memory. If they rely on the implicit
synchronization, and we choose a smaller group size than the
(broken) shader expects, it will produce incorrect results.

Cc: mesa-stable
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23408>
(cherry picked from commit 369aec5704)
2025-03-15 09:49:03 +01:00
Faith Ekstrand
3be28b42e2 vtn: Support cooperative matrices in OpConstantNull
Cooperative matrix initializers are a single scalar value that gets
broadcasted to the entire matrix.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12679
Fixes: b98f87612b ("spirv: Implement SPV_KHR_cooperative_matrix")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33747>
(cherry picked from commit 7c47a3d0f7)
2025-03-15 09:49:03 +01:00
Maíra Canal
c420a3495b v3dv: don't overwrite the primary fd if it's already set
If a valid primary file descriptor is already set (e.g. from vc4),
don't overwrite it with -1.

This prevents losing a valid primary fd and resolves issues arising
when vc4 is the first node returned by `drmGetDevices2()` and v3d is
the second.
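
The ordering problem can be sketched as below. struct drm_node and
select_primary_fd() are hypothetical; the sketch assumes each enumerated
node yields either a valid primary fd or -1.

```c
#include <stddef.h>

/* Hypothetical per-node result: a valid primary fd, or -1 when absent. */
struct drm_node {
   int primary_fd;
};

static int
select_primary_fd(const struct drm_node *nodes, size_t count)
{
   int primary_fd = -1;

   for (size_t i = 0; i < count; i++) {
      /* An unconditional assignment here would let a later node that
       * reports -1 clobber a valid fd found on an earlier node. */
      if (primary_fd < 0)
         primary_fd = nodes[i].primary_fd;
   }

   return primary_fd;
}
```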

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12777
Fixes: 188f1c6cbe ("v3dv: rewrite device identification")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33958>
(cherry picked from commit 7775c79035)
2025-03-15 09:49:03 +01:00
Samuel Pitoiset
2d9d444aa7 radv: fix a GPU hang with inherited rendering and HiZ/HiS on GFX1201
With secondary command buffers, inherited rendering can be used but
it's basically impossible to know if the depth/stencil attachment
enabled HiZ/HiS. But it's required to disable WALK_ALIGN8 to avoid
GPU hangs.

This assumes that HiZ/HiS is enabled for inherited rendering as long
as a depth/stencil attachment is used. It's not the most optimal
approach but it's not supposed to hurt either.

This fixes a GPU hang with
dEQP-VK.dynamic_rendering.primary_cmd_buff.basic.contents_secondary_cmdbuffers
and friends.

GFX1200 isn't affected because it doesn't support HiZ/HiS.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33986>
(cherry picked from commit d1a2ba57f9)
2025-03-15 09:49:03 +01:00
Erik Faye-Lund
7627192919 panvk: correct VkPhysicalDeviceProperties::deviceName
We currently report a deviceName as e.g. "Mali-G610 (Panfrost)", but
panfrost has nothing to do with the physical device, and the suffix
doesn't belong there at all.

So let's remove that suffix from PanVK. This results in output like this
from vulkaninfo:

---8<---
VkPhysicalDeviceProperties:
---------------------------
        apiVersion        = 1.1.305 (4198705)
        driverVersion     = 25.0.99 (104857699)
        vendorID          = 0x13b5
        deviceID          = 0xa8670000
        deviceType        = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
        deviceName        = Mali-G610
        pipelineCacheUUID = <snip>
---8<---

We already sort of namedrop Panfrost in the driver properties:

---8<---
VkPhysicalDeviceDriverPropertiesKHR:
------------------------------------
        driverID        = DRIVER_ID_MESA_PANVK
        driverName      = panvk
        driverInfo      = Mesa 25.1.0-devel (git-136dd9f985)
        conformanceVersion:
                major    = 1
                minor    = 4
                subminor = 1
                patch    = 2
---8<---

While this might technically speaking be a regression, PanVK has been
marked as experimental until Mesa 25.0. But to reduce the risk of people
starting to depend on this behavior, let's also backport this change to
the 25.0 release.

The patch looks a bit funny, because we add the " (Panfrost)"-suffix in
common code, and this moves it to the Gallium driver. But effectively,
this means PanVK is the only driver that sees a change of behavior.

Backport-to: 25.0
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33972>
(cherry picked from commit c34c7b1f3b)
2025-03-15 09:49:03 +01:00
Pierre-Eric Pelloux-Prayer
6f0eb911f7 st/mesa: fix nir_load_per_vertex_input parameter
num_components should be 1 as we're loading an offset value.

Fixes: ec68f0492b ("st/mesa: switch GL_SELECT shader to IO intrinsics")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12774
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33982>
(cherry picked from commit 770b5bc757)
2025-03-15 09:49:03 +01:00
Faith Ekstrand
9557d9b93b nil: Relax alignment requirements for linear images
Compositors sometimes try to import BOs with lower alignments than 128B.
This seems particularly common in the case of cursor images but it can
also happen on other BOs allocated by the old nouveau GL driver.  As
long as we avoid rendering to them (which NVK will do), the
texture/image hardware is fine as long as they're at least 32B-aligned.
Panicking in this case isn't very nice to compositors.
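
As a rough sketch of the relaxed rule (NIL itself is written in Rust; this C
helper is hypothetical and assumes the row stride is the quantity being
checked), the 128B requirement only applies when the image will be rendered
to, and 32B is enough otherwise:

```c
#include <stdbool.h>
#include <stdint.h>

#define RENDER_ALIGN_B  128u /* alignment needed for rendering */
#define TEXTURE_ALIGN_B 32u  /* minimum the texture/image hardware accepts */

/* Hypothetical check: accept a linear image if its stride meets the
 * alignment required for the way it will be used. */
static bool
linear_align_ok(uint32_t row_stride_B, bool will_render)
{
   uint32_t align_B = will_render ? RENDER_ALIGN_B : TEXTURE_ALIGN_B;

   return (row_stride_B % align_B) == 0;
}
```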

Backport-to: 25.0
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33990>
(cherry picked from commit 3c11da8aea)
2025-03-15 09:49:03 +01:00
Faith Ekstrand
fb1d8599b4 nvk: Allow rendering to linear images with unaligned strides
We can do this by just enabling the fall-back path whenever we detect
something that's not nicely aligned.

Backport-to: 25.0
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33990>
(cherry picked from commit e36f9d6909)
2025-03-15 09:49:03 +01:00
Ivan A. Melnikov
1784e9d142 gallium/radeon: Make sure radeonsi PCI IDs are also included
When importing libdrm_radeon code [1][2] it was somehow missed
that what libdrm has in one r600_pci_ids.h, Mesa has split
into r600_pci_ids.h and radeonsi_pci_ids.h. So, devices
with ids from radeonsi_pci_ids.h were not considered valid for
radeon_surface_manager_new.

This commit changes that, thus fixing radeonsi for these
devices.

[1] commit 1299f5c50a
[2] commit 3aa7497cc0

Fixes: 1299f5c50a
Signed-off-by: Ivan A. Melnikov <iv@altlinux.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33940>
(cherry picked from commit 4ad5b8f5bb)
2025-03-15 09:49:03 +01:00
Job Noorman
e4674b3d33 ir3: keep inputs at start block when creating empty preamble
It is expected that inputs and prefetches are always in the first block.
However, ir3_create_empty_preamble would create blocks before the first
one, leaving inputs after the preamble. This causes issues with
(probably among others) spilling/RA where precolored inputs could
illegally reuse the spill base register.

Fixes RA validation failures on a7xx for
dEQP-VK.ray_query.multiple_ray_queries.vertex_shader

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: f3026b3d3e ("ir3: add some preamble helpers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33977>
(cherry picked from commit c58ba21ba8)
2025-03-15 09:49:03 +01:00
Natalie Vock
9bcbdbfcf2 radv/rt: Flush L2 after writing internal node offset on GFX12
Otherwise the encoder can read a stale value and make internal nodes
point into leaf space (if 0 is read).

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33985>
(cherry picked from commit a1b0599105)
2025-03-15 09:49:03 +01:00
Natalie Vock
5603cefd94 radv/rt: Guard leaf encoding by leaf node count
For empty BVHs we shouldn't emit any leaf nodes, but there is one
invocation to encode the root node. Guard leaf node encoding so that
invocation doesn't try writing any leaves.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33985>
(cherry picked from commit cdadda2d51)
2025-03-15 09:49:03 +01:00
Ashley Smith
00f882c07a panfrost: Reset syncobj after use to avoid kernel warnings
We get a kernel message "You are adding an unorder point to timeline!"
on many CTS runs. This stems from us SIGNALing the queue syncobj then
WAITing but not resetting it. It is assumed by the time we get to
panvk_queue_submit_init_signals() that the value is 0, however it is 1
due to the previous calls.

Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 5544d39f ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33943>
(cherry picked from commit 14101ff948)
2025-03-15 09:49:02 +01:00
David Rosca
2ad0974e1e frontends/vdpau: Fix creating deinterlace filter for interleaved buffers
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12755
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33927>
(cherry picked from commit e56b906df9)
2025-03-15 09:49:02 +01:00
David Rosca
94ff2a8ddd Revert "frontends/vdpau: Alloc interlaced surface for interlaced pics"
This is not needed now that deinterlace can handle non-interlaced
buffers. Also this forces the buffer as interlaced which doesn't work
on radeonsi anymore.

This reverts commit 0ee4506c3a.

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33927>
(cherry picked from commit 6b91f13d5d)
2025-03-15 09:49:02 +01:00
David Rosca
3ad2c24988 gallium/vl: Fix video buffer supported format check
It needs to check all plane formats.

Fixes: c3ceec6cd8 ("vdpau: Refactor query for video surface formats.")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33927>
(cherry picked from commit 244cfac143)
2025-03-15 09:49:02 +01:00
Samuel Pitoiset
a4e6a8fb96 ac,radv: add a workaround for a hw bug with primitive restart on GFX10-GFX10.3
At least NAVI10, NAVI21 and NAVI24 are affected by what looks
like a hardware bug when primitive restart is changed and no context
registers are written between draws. It seems the hardware doesn't
consider primitive restart at all in this situation.

Adding SQ_NON_EVENT(0) as suggested by Marek seems to fix it reliably
without introducing any overhead. It's basically a NOP packet that adds
a small delay.

Fixes new VKCTS coverage dEQP-VK.transform_feedback.primitive_restart.*.
Also fixes this old vkd3d-proton issue.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7258
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33929>
(cherry picked from commit 0bc9d59c2e)
2025-03-15 09:49:02 +01:00
Yiwei Zhang
c9112e3050 venus: fix to ignore dstSet for push descriptor
Per push descriptor spec:

Each element of pDescriptorWrites is interpreted as in
VkWriteDescriptorSet, except the dstSet member is ignored.
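
One way a layer that forwards these writes could honor this is to drop
whatever the application left in dstSet before the writes are interpreted.
The sketch below is an assumption for illustration, not the venus change:
struct write_descriptor_set is a trimmed, hypothetical stand-in for
VkWriteDescriptorSet, and sanitize_push_writes() is a made-up helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for VkWriteDescriptorSet. */
struct write_descriptor_set {
   uint64_t dst_set; /* may hold garbage for push descriptors */
   uint32_t dst_binding;
   uint32_t descriptor_count;
};

/* For push descriptors the dst_set member must not be interpreted, so
 * replace it with a null handle and leave everything else untouched. */
static void
sanitize_push_writes(struct write_descriptor_set *writes, size_t count)
{
   for (size_t i = 0; i < count; i++)
      writes[i].dst_set = 0;
}
```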

Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33948>
(cherry picked from commit c7bc90eaec)
2025-03-15 09:49:02 +01:00
Eric Engestrom
45e99616f3 .pick_status.json: Mark 551770ccf8 as denominated 2025-03-15 09:49:02 +01:00
Timothy Arceri
4fdb2e99a8 util/u_idalloc: fix util_idalloc_sparse_alloc_range()
If the allocation didn't fit within the segment the loop incorrectly
freed ids of a range of different segments due to the loop redeclaring
i.
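
The hazard can be illustrated with a small range allocator. This is a
hypothetical sketch, not the util_idalloc code: alloc_range() marks count
consecutive slots and, on failure, rolls back only the slots it marked
itself. The rollback loop uses its own index name so it cannot shadow the
outer loop variable that holds the rollback bound.

```c
#include <stdbool.h>

/* Mark [base, base + count) as used. On failure, undo the partial
 * allocation and return false. */
static bool
alloc_range(bool *used, unsigned capacity, unsigned base, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      if (base + i >= capacity || used[base + i]) {
         /* Roll back exactly [base, base + i). Declaring another
          * "unsigned i" for this loop would hide the outer i and lose
          * the bound. */
         for (unsigned j = 0; j < i; j++)
            used[base + j] = false;
         return false;
      }
      used[base + i] = true;
   }

   return true;
}
```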

Fixes: d4085aaf56 ("util: add util_idalloc_sparse, solving the excessive virtual memory usage")

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33934>
(cherry picked from commit 25e008c639)
2025-03-15 09:49:02 +01:00
Alyssa Rosenzweig
4da0e7ffaf nir/lower_helper_writes: fix stores after discard
We need to use nir_is_helper_invocation instead of
nir_load_helper_invocation, to correctly predicate stores after demote.

Identified in a Piglit on AGX a year ago but I forgot to upstream this.

Fixes: 586da7b329 ("nir: Add nir_lower_helper_writes pass")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33939>
(cherry picked from commit bc6b527b52)
2025-03-15 09:49:02 +01:00
Alyssa Rosenzweig
cd9dae9931 pan/mdg: call nir_lower_is_helper_invocation
needed to avoid regression from the next patch.

backported because the next patch is too

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33939>
(cherry picked from commit e90ccf91a3)
2025-03-15 09:49:02 +01:00
Mel Henning
f2a8804927 nvk: Don't zero imported memory
This fixes eg.
dEQP-VK.drm_format_modifiers.export_import_fmt_features2.a8b8g8r8_uint_pack32
with NVK_DEBUG=zero_memory

Fixes: 0399999dec ("nvk: Support dma-buf import")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33947>
(cherry picked from commit cab80223fd)
2025-03-15 09:49:02 +01:00
Faith Ekstrand
6187b8e4c0 zink: Check queue families when binding image resources
We check for image layouts and feedback loops when we bind image
resources but not queue families.  If the resource isn't on the graphics
queue, we need to add it to need_barriers so we can transition it back
to our queue.

Fixes: d4f8ad27f2 ("zink: handle implicit sync for dmabufs")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33952>
(cherry picked from commit 18d206d67c)
2025-03-15 09:49:02 +01:00
Faith Ekstrand
5c0fd3e20d zink: Set needs_barrier after transitioning to QUEUE_FAMILY_FOREIGN
Otherwise, we'll transition to QUEUE_FAMILY_FOREIGN and then forget that
we left it on the foreign queue and never transition back the next time
we use the resource.  This was kind-of okay with Wayland compositors
because they always re-import the BO so it's always fresh and they pick
up on the queue transfer the first time.  X11, on the other hand, does
not re-import BOs so they get stuck in this weird QUEUE_FAMILY_FOREIGN
limbo until something happens to randomly trigger a layout transition
check and then we find it and do the transition.  We should mark them as
needing a barrier the moment we transition to QUEUE_FAMILY_FOREIGN.

Fixes: d4f8ad27f2 ("zink: handle implicit sync for dmabufs")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33952>
(cherry picked from commit 396ece1ad8)
2025-03-15 09:49:02 +01:00
Yiwei Zhang
a67d8b0a6b lavapipe: fix accel struct device query copy
This change:
1. use vulkan flags instead of pipe query flags
2. set the avail bit when requested

Fixes: a26f96ed3d ("lavapipe: Handle accel struct queries in handle_copy_query_pool_results")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33951>
(cherry picked from commit e538a38017)
2025-03-15 09:49:02 +01:00
Yiwei Zhang
26b33e2e4d lavapipe: set availability bit for accel struct host queries
Fixes: 897ccbd180 ("lavapipe: Implement VK_KHR_acceleration_structure")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33951>
(cherry picked from commit bc190cab2d)
2025-03-15 09:49:02 +01:00
Rebecca Mckeever
ff107b6123 panvk: Add STORAGE_IMAGE_BIT feature for formats supporting sampled images
All formats that support sampled images should also be suitable for
storage images.

Fixes: d970fe2e ("panfrost: Add a Vulkan driver for Midgard/Bifrost GPUs")

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33459>
(cherry picked from commit 27037efcfd)
2025-03-15 09:49:02 +01:00
Erik Faye-Lund
f6ccb29a68 docs/features: add missing panvk feature
I forgot to document this feature when I added it, whoops!

Fixes: ac05c2a2b8 ("panvk: expose subgroup operations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33918>
(cherry picked from commit 1a1412e66e)
2025-03-15 09:49:02 +01:00
Georg Lehmann
3abbecb10d radv: enable invariant geom for DOOM(2016)
Moving alu reordered some fmuls and since we prefer the closest fmul for ffma,
this causes precision to mismatch between depth write and depth test.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12016

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33933>
(cherry picked from commit 7b1f1a107e)
2025-03-15 09:49:01 +01:00
Marek Olšák
de3dd2afdf Revert "ac/nir: clamp vertex color outputs in the right place"
This reverts commit b3fc49686e.

It was a rebase failure.

Fixes: b3fc49686e

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482>
(cherry picked from commit 177c9b173e)
2025-03-15 09:49:01 +01:00
Yiwei Zhang
49b2aad9eb venus: fix a memory corruption in query records recycle
The free list must be re-initialized. Found the bug while running:
dEQP-VK.ray_tracing_pipeline.acceleration_structures.device_compability_khr.gpu_built.top
where it invokes VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT to purge
the cmd pool resources, and the next alloc still gets cache hit with the
"empty" list.

Fixes: e2c4bafccc ("venus: free query batches for VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33908>
(cherry picked from commit 6868212774)
2025-03-15 09:49:01 +01:00
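The venus fix above hinges on re-initializing a free list after its entries have been released. The following is a minimal sketch of that idea, not the venus code: `struct list_node`, `struct record_pool`, and `pool_purge` are hypothetical names, and the list is a generic circular doubly linked list.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Generic circular doubly linked list node (hypothetical, for illustration). */
struct list_node {
   struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
   head->prev = head;
   head->next = head;
}

static bool list_is_empty(const struct list_node *head)
{
   return head->next == head;
}

static void list_add_tail(struct list_node *head, struct list_node *n)
{
   n->prev = head->prev;
   n->next = head;
   head->prev->next = n;
   head->prev = n;
}

/* Hypothetical pool that caches recycled records on a free list. */
struct record_pool {
   struct list_node free_list;
};

/* Release every cached record, then reset the head. */
static void pool_purge(struct record_pool *pool)
{
   struct list_node *n = pool->free_list.next;
   while (n != &pool->free_list) {
      struct list_node *next = n->next;
      free(n);
      n = next;
   }
   /* Without this, the head still points into freed memory and the next
    * allocation would see a non-empty list of dangling records. */
   list_init(&pool->free_list);
}
```

After `pool_purge()` the list reports empty again, so a later allocation falls back to a fresh allocation instead of reusing a freed record.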
José Roberto de Souza
9a34afca0f intel/common: Retry GEM_CONTEXT_CREATE when PXP have not finished initialization
If PXP initialization is not completed and the application requested a
protected context, GEM_CONTEXT_CREATE will wait up to 250ms for
PXP to finish initialization, but if that does not happen it will
return an error and set errno to EIO.
This patch adds the missing retry handling.
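As a rough illustration of the retry handling described here, the sketch below retries a creation call while it fails with EIO. It does not reproduce the Mesa code: `create_fn`, `create_with_retry`, the `max_attempts` bound, and the `fake_create` stub are all assumptions made for the example, and the real ioctl is replaced by a callback.

```c
#include <errno.h>

/* Hypothetical stand-in for the context-create call: returns 0 on success,
 * or -1 with errno set on failure. */
typedef int (*create_fn)(void *data);

/* Retry while the call fails with EIO, up to max_attempts tries.
 * Any other error is returned immediately. */
static int create_with_retry(create_fn create, void *data, int max_attempts)
{
   int ret = -1;
   for (int i = 0; i < max_attempts; i++) {
      ret = create(data);
      if (ret == 0 || errno != EIO)
         break;
   }
   return ret;
}

/* Demonstration stub: fails with EIO until the counter reaches zero. */
static int fake_create(void *data)
{
   int *remaining_failures = data;
   if (*remaining_failures > 0) {
      (*remaining_failures)--;
      errno = EIO;
      return -1;
   }
   return 0;
}
```

With a counter of 2 and five allowed attempts, `create_with_retry(fake_create, &counter, 5)` succeeds on the third try. With more pending failures than attempts, the EIO error is passed back to the caller.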

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30723>
(cherry picked from commit 008ac818ba)
2025-03-15 09:49:01 +01:00
Karol Herbst
aff18d4898 rusticl/program: fix building kernels
We ended up with duplicates, but also rebuilt the same kernel over and
over again for multi dev builds.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33892>
(cherry picked from commit ce60f47e96)
2025-03-15 09:49:01 +01:00
Karol Herbst
b5ec24f356 rusticl/program: rework build_nirs so it only touches devices we care about
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33892>
(cherry picked from commit 57a7e86aa9)
2025-03-15 09:49:01 +01:00
Karol Herbst
4d5d4ead44 rusticl/program: loop over all devices inside Program::build
We want to build the kernels once and atm we are doing it several times
for each device.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33892>
(cherry picked from commit 241279ac2c)
2025-03-15 09:49:01 +01:00
Karol Herbst
9247e110cd rusticl/program: pass options by reference
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33892>
(cherry picked from commit e434ce1559)
2025-03-15 09:49:01 +01:00
Karol Herbst
88f749d049 rusticl/program: implement CL_INVALID_PROGRAM_EXECUTABLE check in clGetProgramInfo
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33892>
(cherry picked from commit b2f3933c8d)
2025-03-15 09:49:01 +01:00
Rob Clark
68520f2279 freedreno: Wait for imported syncobj fences to be available
Waiting on a fence created from an imported syncobj needs to wait for the
fence_fd to become available.

Fixes piglit tests added in https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/992

Fixes the following issue for freedreno: #12650

Cc: mesa-stable
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33724>
(cherry picked from commit ee787b64ed)
2025-03-15 09:49:01 +01:00
Rob Clark
19b840362d tc: Add missing tc_set_driver_thread()
Cc: mesa-stable
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33724>
(cherry picked from commit fac2c4af1b)
2025-03-15 09:49:01 +01:00
Job Noorman
a476298ae5 ir3: fix false dependencies of rpt instructions
When merging multiple instructions into one rpt instruction, the false
deps of the rpt instruction should be the union of the false deps of its
parts.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 4c4366179b ("ir3: add post-RA pass to merge repeat groups into rptN instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32454>
(cherry picked from commit 0f6ec14925)
2025-03-15 09:49:01 +01:00
Rhys Perry
6999073da9 aco: insert dependency waits in certain situations
This seems to fix some artifacts, but we're not sure why, so it might not
be a correct or optimal solution.

fossil-db (navi31):
Totals from 28424 (35.81% of 79377) affected shaders:
Instrs: 30112910 -> 30348977 (+0.78%); split: -0.00%, +0.78%
CodeSize: 159542980 -> 160485336 (+0.59%); split: -0.00%, +0.59%
Latency: 221438396 -> 221500856 (+0.03%); split: -0.00%, +0.03%
InvThroughput: 38154231 -> 38159984 (+0.02%); split: -0.00%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33853>
(cherry picked from commit 0ec174afd5)
2025-03-15 09:49:01 +01:00
Faith Ekstrand
58d30fed2f zink: Use pipe_box helpers for damage calculations
The old code got the accumulation a bit wrong.  For one thing, it always
accumulates with whatever was there instead of resetting to empty each
time.  For another, it sets width with y and height with x when it writes
back to the resource.  This is also all too complicated because it
converts between pipe_box, u_rect, and VkRect2D on every iteration.

Instead, there are helpers in util/box.h which will do most of this work
for us and they're correct.  Let's just use them to get rid of the bugs
and make everything simpler and more obvious at the same time.
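The accumulation described above can be sketched as follows. This is an illustration under assumptions, not the zink code and not the util/box.h API: `struct box2d`, `box_union`, and `accumulate_damage` are hypothetical names. The two points from the commit message are that the accumulator starts empty on every call and that x/width and y/height are never mixed.

```c
/* Hypothetical 2D box; an empty box has non-positive width or height. */
struct box2d {
   int x, y, width, height;
};

/* Smallest box that contains both inputs. An empty input is ignored. */
static struct box2d box_union(struct box2d a, struct box2d b)
{
   if (a.width <= 0 || a.height <= 0)
      return b;
   if (b.width <= 0 || b.height <= 0)
      return a;

   int x0 = a.x < b.x ? a.x : b.x;
   int y0 = a.y < b.y ? a.y : b.y;
   int ax1 = a.x + a.width, bx1 = b.x + b.width;
   int ay1 = a.y + a.height, by1 = b.y + b.height;
   int x1 = ax1 > bx1 ? ax1 : bx1;
   int y1 = ay1 > by1 ? ay1 : by1;

   return (struct box2d){ x0, y0, x1 - x0, y1 - y0 };
}

/* Accumulate damage rectangles, starting from an empty box each time
 * rather than from whatever a previous call left behind. */
static struct box2d accumulate_damage(const struct box2d *rects, int count)
{
   struct box2d damage = { 0, 0, 0, 0 };
   for (int i = 0; i < count; i++)
      damage = box_union(damage, rects[i]);
   return damage;
}
```

For the rectangles (0, 0, 10, 10) and (20, 5, 10, 20), the accumulated damage is the box at (0, 0) with width 30 and height 25.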

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12194
Fixes: 3d38c9597f ("zink: hook up KHR_partial_update")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33855>
(cherry picked from commit 11939a70df)
2025-03-15 09:49:00 +01:00
Faith Ekstrand
a8a5e94ddf util/box: Add a intersect_2d helper
Fixes: 3d38c9597f ("zink: hook up KHR_partial_update")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33855>
(cherry picked from commit 8cf921a742)
2025-03-15 09:49:00 +01:00
Mary Guillemard
6b92f95f6e pan/bi: Ensure we select b0 with halfswizzle in va_lower_constants
In case of constant lowering with halfswizzle sources, we were selecting
h01, causing an invalid instruction error to be raised later.

This can only be hit by conversion instructions and shouldn't be seen in
the wild (as this should be eliminated before entering the backend).

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 7d07fb9a67 ("pan/va: Handle 8-bit lane when lowering constants")
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33867>
(cherry picked from commit 2f1ce296d0)
2025-03-15 09:49:00 +01:00
Mary Guillemard
54fb1e47d5 pan/bi: Fix out of range access in bi_instr_replicates
For replicates, we were checking equivalence between two sources on some
instructions, but some of them only had one source, causing an
out-of-bounds access and a check against unrelated data.

Instead we now always return true for those instructions.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: f7d44a46cd ("pan/bi: Optimize replication")
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33867>
(cherry picked from commit 8948b74955)
2025-03-15 09:49:00 +01:00
Eric Engestrom
2b381ba435 .pick_status.json: Update to 61feea6954 2025-03-15 09:48:51 +01:00
Eric Engestrom
d09833d705 .pick_status.json: Mark 534436f863 as denominated 2025-03-05 22:49:42 +01:00
Eric Engestrom
a7da6ebcdd .pick_status.json: Mark 61b0955308 as denominated 2025-03-05 22:49:42 +01:00
Eric Engestrom
d2e943ad17 docs: add sha sum for 25.0.1 2025-03-05 22:25:44 +01:00
Eric Engestrom
c185b4a7b0 VERSION: bump for 25.0.1 2025-03-05 22:05:32 +01:00
Eric Engestrom
0634473336 docs: add release notes for 25.0.1 2025-03-05 22:05:32 +01:00
Guilherme Gallo
7e12252613 ci/lava: Add U-Boot action timeout for rockchip DUTs
Add a specific timeout for the U-Boot action in LAVA job definitions for
rockchip devices. This ensures sufficient time for U-Boot to download
the kernel and set up early network, preventing potential job failures
due to timeout constraints.

This behavior started with LAVA version 2025.02.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33839>
(cherry picked from commit 1dbebd2619)
2025-03-05 21:26:48 +01:00
Guilherme Gallo
989f2d8b34 ci/lava: Propagate errors in SSH tests
The `lava_ssh_test_case` wrapper was missing the `set -e` shell option,
which made the LAVA system interpret the job as succeeding, because the
`container` namespace was exiting normally, even though the `dut`
namespace was failing.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33839>
(cherry picked from commit 1169f704d3)
2025-03-05 21:26:48 +01:00
Guilherme Gallo
9347962591 ci/lava: Drop the repeating quotes on lava-test-case
LAVA was recently patched [1] with a fix on how parameters are parsed in
`lava-test-case`, so we don't need to repeat quotes to send the
arguments properly to it.

[1] 18c9cf7976

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33839>
(cherry picked from commit 02a86b3284)
2025-03-05 21:26:48 +01:00
Faith Ekstrand
a591864851 egl/wayland: Pass the original wl_surface to kopper
The Vulkan WSI code creates its own proxies so there's no benefit to
passing the proxy in.  It only screws things up.

Fixes: 8ade5588e3 ("zink: add kopper api")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33860>
(cherry picked from commit 99b5970eb2)
2025-03-05 21:26:48 +01:00
Faith Ekstrand
4e0f86e99f egl/dri2: Rework get_wl_surface_proxy()
Instead, just make it a helper for getting the wl_surface from the
wl_egl_window.  We'll want this in the next commit.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33860>
(cherry picked from commit fddff0d1b8)
2025-03-05 21:26:48 +01:00
Mike Blumenkrantz
0feebb60ad mesa: avoid creating incomplete surfaces when multiview goes out of range
some drivers can't handle this, and it can't be used anyway, so don't bother

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33632>
(cherry picked from commit 3f7b0c3951)
2025-03-05 21:26:48 +01:00
Mike Blumenkrantz
3a2e0fb0f7 gallium: fix pipe_framebuffer_state::view_mask
this is the mask of the number of views, not the actual views being
selected

llvmpipe previously had this wrong, though I don't understand how
vkcts didn't cover it

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33632>
(cherry picked from commit 2b37f23314)
2025-03-05 21:26:48 +01:00
Mike Blumenkrantz
30455e071c llvmpipe: pass layer count to rast clear
this otherwise passes the fb layer, which is not quite right when
using multiview with view indexing

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33632>
(cherry picked from commit 5ef60aef63)
2025-03-05 21:26:47 +01:00
David Rosca
33a7918337 radeonsi/vcn: Set all pic params for H264 encode references
Fixes encoding B-frames with I-frame as L1 reference.

Cc: mesa-stable
Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33845>
(cherry picked from commit d92781508b)
2025-03-05 12:08:20 +01:00
Eric Engestrom
0d325602f1 .pick_status.json: Update to 45e771f4fb 2025-03-05 12:06:20 +01:00
Yiwei Zhang
c9a177aca2 venus: relax the requirement for sync2
The current requirement for sync is only to support WSI, and it is not
necessarily needed at all per the comment added. Will drop it later.

Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33829>
(cherry picked from commit c35b52638c)
2025-03-04 20:43:43 +01:00
Eric Engestrom
4bc19462b9 .pick_status.json: Mark 5461ed5808 as denominated 2025-03-04 20:41:40 +01:00
David Rosca
63f2fa10cc frontends/va: Set AV1 max_width/height to surface size
Ideally this would be passed in pic params as the values are
in sequence header, but using the surface size also works.
Also add sanity checks for frame size.

Fixes decoding av1-1-b8-22-svc-L2T1 and av1-1-b8-22-svc-L2T2.

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33737>
(cherry picked from commit d0414ef7fb)
2025-03-04 20:37:43 +01:00
Patrick Lerda
4ec5c2fb59 r600: fix emit_image_size() range base compatibility
This change fixes a regression introduced with 8b5d41cacb.
Indeed, lookup_resid was not updated.

This change was tested on palm and cayman. Here are the tests fixed:
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-float: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-int: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-uint: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-float: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-int: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-uint: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-float: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-int: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-uint: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-float: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-int: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-uint: fail pass

Fixes: 8b5d41cacb ("r600/sfn: Use range_base for atomics and images")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33352>
(cherry picked from commit fd874bdd0c)
2025-03-04 20:26:19 +01:00
Lars-Ivar Hesselberg Simonsen
28d34f30e6 panvk: Use RUN_COMPUTE over RUN_COMPUTE_INDIRECT
RUN_COMPUTE_INDIRECT has been found to cause intermittent hangs, so
this change replaces it with RUN_COMPUTE and a set TASK_AXIS_X.

While this task axis might be suboptimal, the performance cost is
somewhat offset by RUN_COMPUTE not being an emulated command.

Fixes: 2ffc05d8d2 ("panvk: Add support for CmdDispatchIndirect")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33841>
(cherry picked from commit fe31e7843d)
2025-03-04 20:26:18 +01:00
Lars-Ivar Hesselberg Simonsen
af767e1e3e panfrost: Use RUN_COMPUTE over RUN_COMPUTE_INDIRECT
RUN_COMPUTE_INDIRECT has been found to cause intermittent hangs, so
this change replaces it with RUN_COMPUTE and a set TASK_AXIS_X.

While this task axis might be suboptimal, the performance cost is
somewhat offset by RUN_COMPUTE not being an emulated command.

Fixes: 447075eeee ("panfrost: Add support for the CSF job frontend")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33841>
(cherry picked from commit 6bf9ad2610)
2025-03-04 20:26:15 +01:00
Tapani Pälli
915075bf66 iris: remove dead code that cannot get hit anymore
As of recent changes, MESA_SHADER_GEOMETRY is handled by the if ladder.

CID: 1643918
Fixes: c33ebf09f5 ("iris: fix handling of GL_*_VERTEX_CONVENTION")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33842>
(cherry picked from commit d0b8d7d46c)
2025-03-04 20:24:44 +01:00
Patrick Lerda
56d066e062 r600: fix the indirect draw 8-bits path
This change fixes the indirect draw 8-bits path which does
a conversion to 16-bits. This change is implemented to process
the parameters the same way as the other indirect draw paths.

This change was tested on palm and cayman. Here are the tests fixed:
deqp-gles31/functional/draw_indirect/draw_elements_indirect/indices/index_byte: fail pass
deqp-gles31/functional/draw_indirect/random/35: fail pass
deqp-gles31/functional/draw_indirect/random/45: fail pass
khr-gl40/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl41/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl42/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl43/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl44/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl45/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass

Fixes: d80701df8a ("r600g: Implement GL_ARB_draw_indirect for EG/CM")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32802>
(cherry picked from commit 9aea08e1db)
2025-03-04 20:24:40 +01:00
Faith Ekstrand
3f7abae2fc zink: Don't present to Wayland surfaces asynchronously
Wayland EGL has a driver invariant which requires that any `wl_surface`
(or wp_linux_drm_syncobj_surface_v1) calls happen inside the client's
call to eglSwapBuffers().  Submitting surface messages after
eglSwapBuffers() returns causes serialization issues with the Wayland
surface protocol and can lead to the compositor booting the app.

Fixes: 8ade5588e3 ("zink: add kopper api")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12736
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33859>
(cherry picked from commit b92117d9bb)
2025-03-04 20:24:39 +01:00
Marek Olšák
d8b47159b7 mesa: allocate GLmatrix aligned to 16 bytes
The declaration has:

typedef struct {
   alignas(16) GLfloat m[16];   /**< 16 matrix elements (16-byte aligned) */
   alignas(16) GLfloat inv[16]; /**< 16-element inverse (16-byte aligned) */
...
} GLmatrix;

We should honor that.
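To illustrate the point, here is a minimal sketch of allocating such a structure with its declared alignment. It assumes a C11 toolchain that provides `aligned_alloc`, and it is not the allocator Mesa uses: `struct matrix` and `matrix_alloc` are hypothetical stand-ins for the declaration quoted above.

```c
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a matrix type with 16-byte aligned members. */
struct matrix {
   alignas(16) float m[16];
   alignas(16) float inv[16];
};

/* A plain malloc() only guarantees the platform's default alignment, which
 * may be smaller than 16 bytes. aligned_alloc() takes the alignment
 * explicitly; the size must be a multiple of it, which holds here because
 * sizeof() already includes the padding required by alignas(16). */
static struct matrix *matrix_alloc(void)
{
   return aligned_alloc(alignof(struct matrix), sizeof(struct matrix));
}
```

The returned pointer is released with `free()` as usual.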

Fixes: 3175b63a0d - mesa: don't allocate matrices with malloc
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10237

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33856>
(cherry picked from commit 7655826243)
2025-03-04 20:24:08 +01:00
Caio Oliveira
390317a99e brw: Fix size in assembler when compacting
Calculation was wrongly walking uncompacted instructions, even if we had
some compacted in the middle, generating invalid size.  Since we are
here just drop the instruction count, since in practice the caller will
have to walk the instruction stream anyway.
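A size calculation that accounts for compaction has to advance by each instruction's actual length. The sketch below assumes full instructions occupy 16 bytes and compacted ones 8 bytes, and it uses a hypothetical flag bit in the first dword to tell them apart. `stream_size` and `HYPOTHETICAL_COMPACT_BIT` are illustrative names, not the brw assembler's API.

```c
#include <stddef.h>
#include <stdint.h>

#define INSTR_SIZE_FULL    16u
#define INSTR_SIZE_COMPACT 8u

/* Hypothetical marker for a compacted instruction, for illustration only. */
#define HYPOTHETICAL_COMPACT_BIT (1u << 29)

/* Walk num_instrs instructions and return the stream size in bytes.
 * Each step advances by the size of the instruction actually found at
 * the current offset, so a mix of full and compacted ones is handled. */
static size_t stream_size(const uint32_t *dwords, unsigned num_instrs)
{
   size_t offset = 0;
   for (unsigned i = 0; i < num_instrs; i++) {
      uint32_t first = dwords[offset / 4];
      offset += (first & HYPOTHETICAL_COMPACT_BIT) ? INSTR_SIZE_COMPACT
                                                   : INSTR_SIZE_FULL;
   }
   return offset;
}
```

A stream of one full, one compacted, and one full instruction comes out at 40 bytes, where assuming 16 bytes per instruction would have reported 48.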

Fixes: 6267585778 ("intel/brw: Also return the size of the assembled shader")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33532>
(cherry picked from commit dd1ca1588d)
2025-03-04 20:24:05 +01:00
Samuel Pitoiset
5200d13a0f radv: fix re-emitting fragment output state when resetting gfx pipeline state
When switching from pipeline to shader objects.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33840>
(cherry picked from commit 7f6e28db26)
2025-03-04 20:24:03 +01:00
Gert Wollny
9842f90fcc r600/sfn: gather info and set lowering 64 bit after nir_lower_io
After nir_lower_io we need to gather the info about 64 bit usage
to be up-to-date when deciding whether the remaining 64 bit IO ops
should be lowered.

Before 89dad5618d ("gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER")
the info was eventually updated to include the use of 64 bit values
also if only some IO was using this so that SFN was handling the code
correctly. With the above patch this no longer seems to always be the
case, so we have to take care of it.

Fixes: 89dad5618d ("gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32774>
(cherry picked from commit 6da19eafd5)
2025-03-04 20:24:03 +01:00
Mary Guillemard
41f982ddac pan/bi: Disallow FAU special page 3 and WARP_ID on message instructions
This is a constraint that applies to Valhall and later: instructions
should not use FAU special page 3 or WARP_ID if running
on the message unit.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: fd1906afea ("pan/va: Add FAU validation")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33843>
(cherry picked from commit ef0c7382c7)
2025-03-04 20:24:02 +01:00
Eric Engestrom
43b9f114cb .pick_status.json: Update to fbc55afbdf 2025-03-04 20:23:55 +01:00
Konstantin Seurer
08ae198bda llvmpipe: Skip draw_mesh if the ms did not write gl_Position
There is nothing to be done and the code will hit "assert(pos != -1);"
otherwise.

cc: mesa-stable

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12684
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33812>
(cherry picked from commit 4348253db5)
2025-03-03 17:25:25 +01:00
Patrick Lerda
ebca2fafa8 r600: fix evergreen_emit_vertex_buffers() related cl regression
For instance, this issue is triggered with "piglit/bin/cl-custom-buffer-flags":
Segmentation fault

Fixes: 81889f4d5c ("r600: ensure that the last vertex is always processed on evergreen")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33351>
(cherry picked from commit ee1cb894d6)
2025-03-03 17:25:24 +01:00
Emmanuel Gil Peyrot
4607eb7eae panvk: Initialize out array with the correct length
This avoids reading past the buffer’s end in the client afterward, because the
drmFormatModifierCount hasn’t been changed from what the client passed, if it
wasn’t zero at first.

GTK triggers that bug by setting it to the length of the static array (see this
bug[0] though), but other Vulkan programs might have the same issue if they
don’t first query the count before allocating the array.
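The count handling at issue follows the usual two-call query pattern, which can be sketched generically as below. This is not the panvk code and uses no Vulkan types: `query_modifiers` and its parameters are hypothetical. What matters is that the count is written back to the number of entries actually stored, even when the caller passed a larger capacity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-call query.
 * - out == NULL: only report how many entries are available.
 * - out != NULL: *count is the caller's capacity on input; write at most
 *   that many entries and set *count to the number actually written. */
static void query_modifiers(const uint64_t *available, uint32_t num_available,
                            uint32_t *count, uint64_t *out)
{
   if (out == NULL) {
      *count = num_available;
      return;
   }

   uint32_t n = *count < num_available ? *count : num_available;
   for (uint32_t i = 0; i < n; i++)
      out[i] = available[i];

   /* Leaving *count at the caller's capacity would make the caller read
    * entries that were never written. */
   *count = n;
}
```

A caller that passes a fixed array of 8 entries when only 2 are available gets `*count == 2` back and reads only the 2 valid entries.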

This has been tested on a Radxa ROCK 5B board running a Mali-G610 GPU.

[0] https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/8222

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 252ddaf51b ("panvk: fix VkDrmFormatModifierPropertiesListEXT query")
Fixes: https://gitlab.freedesktop.org/mstoeckl/waypipe/-/issues/127
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33657>
(cherry picked from commit b4a82110ce)
2025-03-03 17:25:23 +01:00
Hyunjun Ko
0ea91330c3 anv: Do not support the tiling of DRM modifier if DECODE_DST
Fixes: 04709e4f ("anv: fix video profile lists")

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33784>
(cherry picked from commit f7ff9b240d)
2025-03-03 17:25:22 +01:00
Mike Blumenkrantz
eff71795d0 zink: clamp UBO sizes instead of asserting
this is a nice idea, but there are apps/games that do not respect
hardware capabilities and yolo-bind fixed size buffers

fixes Ballionaire (2667120) launch on non-desktop drivers

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33819>
(cherry picked from commit b04eaa8589)
2025-03-03 17:25:18 +01:00
Job Noorman
6090162961 ir3/ra: prevent reusing parent interval of reloaded sources
We would set the `src` flag on the interval of reloaded sources.
However, the interval might be merged with its parent when inserted and
the parent wouldn't have this flag set. This caused the parent interval
to potentially be reused to reload later sources. Fix this by setting
the `src` flag on the top-level interval after insertion.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33810>
(cherry picked from commit 2d540b8074)
2025-03-03 17:25:17 +01:00
Kevin Chuang
f912436dc9 anv/bvh: Fix copy shader handling sparse buffer
Fixes: 692b5fa9f2 ("anv: Add shader to copy acceleration structures")

This commit fixes the future test "sparse_binding_structures" for
"header_bottom_address" for ray tracing pipeline.

Even on 48-bit ray tracing (Xe1/2), the software-defined part
instance_leaf_part1.bvh_ptr has to be in canonical form for copy.comp
to dereference a BVH, which means we have to preserve the upper 16 bits.
This is especially relevant in cases where the acceleration structure buffer
is located high, such as sparse buffer.
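For reference, putting a 48-bit virtual address into canonical form means copying bit 47 into bits 48..63. The sketch below shows that sign extension using only unsigned arithmetic. `canonical_48` is a hypothetical helper name, not a function from the anv sources.

```c
#include <stdint.h>

/* Sign-extend a 48-bit address to 64 bits (canonical form).
 * Addresses with bit 47 set get their upper 16 bits set to 1; all others
 * keep their upper 16 bits at 0. */
static uint64_t canonical_48(uint64_t addr)
{
   const uint64_t mask = (UINT64_C(1) << 48) - 1;
   const uint64_t sign = UINT64_C(1) << 47;

   addr &= mask;
   /* Flip the sign bit and subtract it again; the unsigned wrap-around
    * fills the upper bits when bit 47 was set. */
   return (addr ^ sign) - sign;
}
```

A high address such as 0x0000800000001000 becomes 0xFFFF800000001000. Code that masks off or ignores the upper 16 bits loses exactly that information.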

Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
(cherry picked from commit 87ff7b061f)
2025-03-03 17:25:16 +01:00
Kevin Chuang
614dd4999c anv/bvh: Fix encoder handling sparse buffer
Fixes: 2fe57947e3 ("anv: Implement encode shader to fit in ANV BVH")

This commit resolves the failures in the future tests
"sparse_binding_structures" for rayquery. Sparse buffers' heaps are
located high, and since it's in canonical form, the higher 16bits are
all set to 1. However, the existing encoder did not expect any non-zero
values at the higher 16bits. As a result, the instance flags got
corrupted, causing most triangle tests to fail.

Thanks to Paulo for providing insights about sparse buffer properties.

Co-developed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
(cherry picked from commit b9a980ea73)
2025-03-03 17:25:14 +01:00
Eric Engestrom
7b51aa8e3f .pick_status.json: Update to 4348253db5 2025-03-03 17:25:08 +01:00
Benjamin Lee
6248bc98c2 panfrost/va: remove swizzle mod from LDEXP
This instruction does not support swizzles. This information is not used
for anything, but will be if we use the instruction tables for
bi_lower_swizzle.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 316486dd9f ("pan/va: Add initial ISA.xml for Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
(cherry picked from commit 2a70665df7)
2025-02-28 22:17:35 +01:00
Benjamin Lee
f3ee6ed43c panfrost: fix condition in bi_nir_is_replicated
The original implementation of this returned false when the src was
replicated, and true when it was not.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 21bdee7bcc ("pan/bi: Switch to lower_bool_to_bitsize")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
(cherry picked from commit 810351ad03)
2025-02-28 22:17:35 +01:00
Benjamin Lee
91c473e49a panfrost: fix large int32->float16 conversions
On vulkan, truncating to S/U16 before converting is not valid, because
out-of-range conversions are specified to be correctly rounded. IEEE 754
requires that out-of-range values round to ±inf with RTNE and ±F16_MAX
with RTZ.

On gl, truncating is valid for U16->F16, because out-of-range int->float
conversions are undefined behavior. For S16->F16, it is not valid
because S16_MAX < F16_MAX, so some in-range values will be truncated as
well.

Instead, just handle S/U16->F16 as S/U16->F32->F16.
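The range argument can be checked with a little arithmetic. The sketch below uses 65504 as the largest finite half-precision value and emulates a two's complement truncation to signed 16 bits. The helper names are hypothetical, and no half-float type is used, only the range comparison.

```c
#include <stdint.h>

#define F16_MAX 65504.0f /* largest finite half-precision value */

/* Emulate truncating a 32-bit integer to signed 16 bits (wrap-around). */
static int32_t truncate_to_s16(int32_t v)
{
   int32_t low = v & 0xFFFF;
   return low >= 0x8000 ? low - 0x10000 : low;
}

/* Returns 1 when v lies within the finite float16 range but is changed by
 * the S16 truncation, i.e. when truncating first gives a wrong result for
 * a conversion that is in range. */
static int truncation_breaks_in_range_value(int32_t v)
{
   float f = (float)v;
   int in_f16_range = f >= -F16_MAX && f <= F16_MAX;
   return in_f16_range && truncate_to_s16(v) != v;
}
```

For example, 40000 is below F16_MAX but above S16_MAX (32767), so truncating first turns it into -25536. Converting through float32 avoids that.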

Fixes dEQP-VK.spirv_assembly.instruction.compute.convertstof.int32_to_float16_*
when shaderFloat16 is enabled in panvk.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: be74b84e6f ("pan/bi: Fill in some more conversions")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
(cherry picked from commit a33cd3def2)
2025-02-28 22:17:35 +01:00
Eric Engestrom
919e3443e8 .pick_status.json: Mark b85c94fc89 as denominated 2025-02-28 22:17:35 +01:00
Daniel Schürmann
553ab18656 aco/assembler: Fix short jumps over chained branches
If we insert

   <code>
   s_branch 1
   s_branch Target

at the end of some block, and later hide an additional chained branch
after the existing one, then we have to update the 's_branch 1' to
also jump over the newly added branch.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Closes: #12673
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
(cherry picked from commit 6659db285a)
2025-02-28 22:17:35 +01:00
Lionel Landwerlin
4a08708ca2 vulkan/runtime: ensure robustness state is fully initialized
This is part of the hashing key :

==25753== Uninitialised byte(s) found during client check request
==25753==    at 0x93D29AE: blob_write_bytes (blob.c:164)
==25753==    by 0x93A62C6: vk_pipeline_precomp_shader_serialize (vk_pipeline.c:722)
==25753==    by 0x93AC55E: vk_pipeline_cache_add_object (vk_pipeline_cache.c:433)
==25753==    by 0x93A691B: vk_pipeline_precompile_shader (vk_pipeline.c:875)
==25753==    by 0x93A8FB9: vk_create_graphics_pipeline (vk_pipeline.c:1715)
==25753==    by 0x93A9799: vk_common_CreateGraphicsPipelines (vk_pipeline.c:1860)
==25753==  Address 0xf1adf82 is 82 bytes inside a block of size 152 alloc'd
==25753==    at 0x64FA858: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==25753==    by 0x99AAC38: vk_default_alloc (vk_alloc.c:26)
==25753==    by 0x93A403B: vk_alloc (vk_alloc.h:48)
==25753==    by 0x93A406B: vk_zalloc (vk_alloc.h:56)
==25753==    by 0x93A60A0: vk_pipeline_precomp_shader_create (vk_pipeline.c:680)
==25753==    by 0x93A689D: vk_pipeline_precompile_shader (vk_pipeline.c:866)
==25753==    by 0x93A8FB9: vk_create_graphics_pipeline (vk_pipeline.c:1715)
==25753==    by 0x93A9799: vk_common_CreateGraphicsPipelines (vk_pipeline.c:1860)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33792>
(cherry picked from commit 4dba1ad93f)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
c795725649 nvk: Only support compute shader derivatives on Turing+
Fixes: e0e7d8d910 ("nvk: Advertise VK_NV/KHR_compute_shader_derivatives")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit 8de37b142e)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
eff601577a nvk: Only support deviceGeneratedCommandsMultiDrawIndirectCount on Turing+
Indirect draws on Maxwell involve patching pushbufs together and doing
that isn't possible with device generated commands.

Fixes: 83b220f833 ("nvk: Advertise VK_EXT_device_generated_commands")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit bd04fdcb2b)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
29ae40e1aa nvk: Handle pre-Turing dispatch indirect commands
The QMD layout is a bit different.

Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit 7e12ba8709)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
95d0ecd6e5 nak/qmd: Add a nak_get_qmd_cbuf_desc_layout() helper
Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit c540e5e2cc)
2025-02-28 22:17:35 +01:00
Paulo Zanoni
bac3b56d51 brw: extend the NOP+WHILE workaround
It turns out that we need to add a NOP not only in between two
consecutive WHILE instructions, but also after every control flow
instruction that immediately precedes a WHILE.
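
A minimal sketch of the extended rule, as a Python pass over a list of opcode names. It is not the brw implementation; the CONTROL_FLOW set is an assumption made for the example.

```python
# Assumed set of control flow opcodes, chosen only for this illustration.
CONTROL_FLOW = {"IF", "ELSE", "ENDIF", "DO", "WHILE", "BREAK", "CONTINUE"}


def insert_nops_before_while(insts):
    """Return a copy of `insts` with a NOP placed before every WHILE that
    directly follows another control flow instruction."""
    out = []
    for inst in insts:
        if inst == "WHILE" and out and out[-1] in CONTROL_FLOW:
            out.append("NOP")
        out.append(inst)
    return out
```

Two consecutive WHILE instructions still get a NOP between them, and so does an ENDIF followed by a WHILE; a WHILE that follows an ALU instruction is left alone.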

v2: Rebase after the renames.

Fixes: 5ca883505e ("brw: add a NOP in between WHILE instructions on LNL")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33021>
(cherry picked from commit fd10764cff)
2025-02-28 22:17:35 +01:00
Karol Herbst
62747d6bdd intel/brw, lp: enable lower_pack_64_4x16
The compiler won't be able to emit pack_64_4x16, so we should prevent
nir_opt_algebraic from optimizing to it. This fixes an infinite optimization
loop inside brw_nir_optimize:

nir_copy_prop
    16x4     %77 = @load_global (%80)
    32    %61995 = pack_32_2x16_split %77.x, %77.y
    32    %61998 = pack_32_2x16_split %77.z, %77.w
    64    %61999 = pack_64_2x32_split %61995, %61998
    64       %76 = iadd %100, %79
                   @store_global (%61999, %76)

nir_opt_algebraic
    16x4     %77 = @load_global (%80)
    32    %61995 = pack_32_2x16_split %77.x, %77.y
    32    %61998 = pack_32_2x16_split %77.z, %77.w
    16x4  %62000 = vec4 %77.x, %77.y, %77.z, %77.w
    64    %62001 = pack_64_4x16 %62000
    64       %76 = iadd %100, %79
                   @store_global (%62001, %76)

nir_lower_pack
    16x4     %77 = @load_global (%80)
    16x4  %62000 = vec4 %77.x, %77.y, %77.z, %77.w
    16    %62002 = mov %62000.y
    16    %62003 = mov %62000.x
    32    %62004 = pack_32_2x16_split %62003, %62002
    16    %62005 = mov %62000.w
    16    %62006 = mov %62000.z
    32    %62007 = pack_32_2x16_split %62006, %62005
    64    %62008 = pack_64_2x32_split %62004, %62007
    64       %76 = iadd %100, %79
                   @store_global (%62008, %76)

// brw_nir_optimize loops here

nir_copy_prop
    16x4     %77 = @load_global (%80)
    32    %62004 = pack_32_2x16_split %77.x, %77.y
    32    %62007 = pack_32_2x16_split %77.z, %77.w
    64    %62008 = pack_64_2x32_split %62004, %62007
    64       %76 = iadd %100, %79
                   @store_global (%62008, %76)

llvmpipe has a similar issue inside lp_build_opt_nir
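
The ping-pong shown in the dumps can be modelled in a few lines of Python. This is a toy, not NIR: the IR is reduced to a single string, the two functions only mimic the relevant rewrites, and the iteration cap exists so the sketch terminates.

```python
SPLIT = "pack_64_2x32_split(pack_32_2x16_split, pack_32_2x16_split)"
FUSED = "pack_64_4x16(vec4)"


def opt_algebraic(ir, lower_pack_64_4x16):
    """Fuse the split packs unless the backend asked for the op to be lowered."""
    if not lower_pack_64_4x16 and ir == SPLIT:
        return FUSED, True
    return ir, False


def lower_pack(ir):
    """Lower the fused pack back into the split form the backend can emit."""
    if ir == FUSED:
        return SPLIT, True
    return ir, False


def optimize(ir, lower_pack_64_4x16, max_iterations=8):
    """Run both passes until neither reports progress.

    Returns (ir, iterations), with iterations set to None when the loop
    never settles within max_iterations.
    """
    for iteration in range(1, max_iterations + 1):
        ir, fused = opt_algebraic(ir, lower_pack_64_4x16)
        ir, lowered = lower_pack(ir)
        if not (fused or lowered):
            return ir, iteration
    return ir, None
```

Without the option each pass undoes the other and always reports progress, so the loop never converges; with it set, the first iteration already reports no progress.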

Fixes: b1bc691b0f ("nir/algebraic: add and improve pack/unpack patterns")
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33347>
(cherry picked from commit dad5ee1039)
2025-02-28 22:17:35 +01:00
Yiwei Zhang
3370a327d7 venus: fix image format cache miss with AHB usage query
should skip updating cache key instead of marking as a miss

Fixes: e48645250c ("venus: image format properties cache")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33757>
(cherry picked from commit fde5cebec5)
2025-02-28 22:17:35 +01:00
Mike Blumenkrantz
ce3806b8ee zink: always fully unwrap contexts
threaded_context_unwrap_sync() can be called safely on non-threaded
contexts

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33742>
(cherry picked from commit f9fe08740a)
2025-02-28 22:17:35 +01:00
Yogesh Mohan Marimuthu
13b2f1e72d winsys/amdgpu: same_queue variable should be set if there is only one queue
Fixes: 45fa34284f ("winsys/amdgpu: don't add fence dependency of other queues for userq")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33661>
(cherry picked from commit 659a41293b)
2025-02-28 22:17:35 +01:00
Tapani Pälli
f8e7fecd7e iris: wait for imported fences to be available in iris_fence_await
This ensures the shared fence is available before we submit (and fail)
a batch with it. This fixes the following issue on the iris driver:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/12650

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33662>
(cherry picked from commit 41a7b58214)
2025-02-28 22:17:35 +01:00
Lionel Landwerlin
3630721dc8 anv: fix missing 3DSTATE_PS:Kernel0MaximumPolysperThread programming
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 815d2e3e8b ("anv: move 3DSTATE_PS to partial packing")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33712>
(cherry picked from commit 91f36ba5b6)
2025-02-28 22:17:35 +01:00
Benjamin Lee
16dfadd3e0 panfrost: remove NIR_PASS_V usage for noperspective lowering
The rest of the NIR_PASS_V usage in panfrost was dropped in
34beb93635, but this one was added in an
MR that was merged after.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 081438ad39 ("panfrost: add nir pass to lower noperspective varyings")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33728>
(cherry picked from commit 3b5d5c072a)
2025-02-28 22:17:35 +01:00
Dylan Baker
db51d8f8ac iris: fix handling of GL_*_VERTEX_CONVENTION
By actually setting the state packets according to the program data.
Also ensure that we correctly flag that the program may be dirty when
the geometry shader state changes

Fixes piglit tests: `spec@!opengl 3.2@gl-3.2-adj-prims * pv-first`

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33658>
(cherry picked from commit c33ebf09f5)
2025-02-28 22:17:35 +01:00
Dylan Baker
11faa02ec4 iris: Correctly set NOS for geometry shader state changes
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33658>
(cherry picked from commit 0477ee660f)
2025-02-28 22:17:34 +01:00
Hans-Kristian Arntzen
1b6da4ed52 radv: Always set 0 dispatch offset for indirect CS.
Fixes severe glitching in Avowed.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33732>
(cherry picked from commit 13a3f9a972)
2025-02-28 22:17:34 +01:00
Samuel Pitoiset
20bb982788 radv: fix missing SQTT barriers for fbfetch color/depth decompressions
SQTT layout transitions need to be inside SQTT barrier. Otherwise, this
throws an assertion in RADV and might also crash when the capture is
opened with RGP.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12664
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33719>
(cherry picked from commit 67c150bf9e)
2025-02-28 22:17:34 +01:00
Peyton Lee
539f0d88be radeonsi/vpe: check reduction ratio
Check that the reduction ratio is within the hardware capability.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33528>
(cherry picked from commit e85a6b6a63)
2025-02-28 22:17:34 +01:00
Faith Ekstrand
1f2143eea6 nvk: Do not set INVALIDATE_SKED_CACHES pre-MaxwellB
The other two uses of this are behind guards but we forgot this one.

Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33716>
(cherry picked from commit 58218c7349)
2025-02-27 18:37:33 +01:00
Faith Ekstrand
7013ebec5d nvk: Don't bind a fragment shading rate image pre-Turing
Fixes: 75bcb656d9 ("nvk: Add support for binding fragment shading rate images")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33716>
(cherry picked from commit c145147871)
2025-02-27 18:37:32 +01:00
Natalie Vock
ea47f98811 radv/rt: Don't allocate the traversal shader in a capture/replay range
We never write the traversal shader address out to shader group handles,
so this is not necessary. On the flipside, it can cause conflicts if the
traversal shader is allocated in a range occupied by a replayed shader.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33711>
(cherry picked from commit 14b902c825)
2025-02-27 18:37:32 +01:00
Georg Lehmann
cb09b3f624 aco/insert_exec: fix continue_or_break on gfx6-7
s_cmp_lg_u64 is gfx8+

Fixes: 115ff5f95b ("aco/insert_exec_mask: don't restore exec in continue_or_break blocks")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33715>
(cherry picked from commit c249556bf4)
2025-02-27 18:37:31 +01:00
Rhys Perry
36e1923284 ac/nir: fix tess factor optimization when workgroup barriers are reduced
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: b49eab68a8 ("ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12632
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33645>
(cherry picked from commit 2a3dce1b59)
2025-02-27 18:37:30 +01:00
Daniel Schürmann
f9c3499918 aco/ssa_elimination: insert parallelcopies for p_phi immediately before branch
Totals from 2499 (3.15% of 79377) affected shaders: (Navi31)
Instrs: 6011729 -> 6011761 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31573216 -> 31574236 (+0.00%); split: -0.00%, +0.00%
Latency: 83364734 -> 83365781 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 13545643 -> 13545783 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 302678df91)
2025-02-27 18:37:30 +01:00
Daniel Schürmann
4118fef567 aco/insert_exec_mask: don't restore exec in continue_or_break blocks
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 115ff5f95b)
2025-02-27 18:37:29 +01:00
Daniel Schürmann
1bb39be75e aco/insert_exec_mask: Don't immediately set exec to zero in break/continue blocks
Instead, only indicate that exec should be zero and do
so in the successive helper block. This allows inserting
the parallelcopies from logical phis directly before the
branch in break and continue blocks.

Totals from 56 (0.07% of 79377) affected shaders: (Navi31)
Latency: 2472367 -> 2472422 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 253053 -> 253055 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 7f7c1d463a)
2025-02-27 18:37:28 +01:00
Karol Herbst
33a7ae1f0a rusticl/platform: advertise all extensions supported by all devices
There is a spec issue about this to clarify this behavior, but the current
wording can be interpreted that the platform always lists all extensions
supported by all drivers.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33667>
(cherry picked from commit 0fd70ee9de)
2025-02-27 18:37:27 +01:00
Dave Airlie
1ce4feb1c0 vulkan/wsi/x11: don't use update_region for damage if not created
If the update region was not created (the X case without MIT-SHM),
don't use it in the damage call that sets the region.

Fixes: bbdf7e45b1 ("wsi/x11: Hook up KHR_incremental_present")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33592>
(cherry picked from commit c49423ca2c)
2025-02-27 18:37:26 +01:00
Eric Engestrom
a2fd6237cb .pick_status.json: Update to 55c476efed 2025-02-27 18:37:25 +01:00
Mike Blumenkrantz
7b2a0da25f zink: wait on tc fence before checking for fd semaphore
this forces sync with pending flushes

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33652>
(cherry picked from commit f7002369fa)
2025-02-21 17:07:29 +01:00
Daniel Schürmann
c4d64f9f83 aco/scheduler: always respect min_waves on GFX10+
It could theoretically happen that for large workgroups,
the scheduler used more registers than allowed.

No fossil changes.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
(cherry picked from commit 676b39d31f)
2025-02-21 17:07:28 +01:00
Erik Faye-Lund
33f1bbed67 mesa/main: wire up glapi bits for EXT_multi_draw_indirect
Turns out we were missing the glapi bits, making it impossible to get
the function pointers for this extension. Whoops?!

[daniels: Squashed in a618 SkQP fails, presumably caused by these not
          being skipped anymore.]

Fixes: 9f5af68995 ("mesa/main: expose `EXT_multi_draw_indirect`")
Reviewed-by: Antonino Maniscalco <antomani103@gmail.com>
Tested-by: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33546>
(cherry picked from commit fde6aeb886)
2025-02-21 17:07:26 +01:00
Faith Ekstrand
275a14e3c8 zink: Use persistent semaphores for PIPE_FD_TYPE_SYNCOBJ
These are persistent objects that you can use to signal and wait on.
We need to import without VK_SEMAPHORE_IMPORT_TEMPORARY_BIT and we can't
throw away the Vulkan semaphore after each submit.

Fixes: 32597e116d ("zink: implement GL semaphores")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33549>
(cherry picked from commit 651864151f)
2025-02-21 17:06:57 +01:00
Faith Ekstrand
3055ca6ff6 zink: Use the correct array size for signal_values[]
When the size of the signals[] array was changed to 3, the
signal_values[] array was not updated accordingly.  If we have a
signal_semaphore and are presenting at the same time, this can lead to
an array overflow and the driver will read some random stack value as
the signal value.  This is causing chromium to lock up when running
WebGL.

Fixes: 7f56fd9655 ("zink: it's kopperin' time")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33549>
(cherry picked from commit 1ffa782227)
2025-02-21 17:06:48 +01:00
Karol Herbst
ce12f4c6f8 rusticl/mem: set num_samples and num_mip_levels to 0 when importing from GL
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33653>
(cherry picked from commit e0b62d7e2e)
2025-02-21 17:06:47 +01:00
Faith Ekstrand
ff0f49e0ba nak: Only use suld.constant on Ampere+
Turing doesn't support it so we'll use suld.weak instead.  While we're
here, get rid of an accidental copy+paste condition.

Fixes: ffdc0d8e98 ("nak: Use suld.constant when ACCESS_CAN_REORDER is set")
Reviewed-by: Mel Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33649>
(cherry picked from commit 13f7ea7b3d)
2025-02-21 17:06:39 +01:00
Roland Scheidegger
43851b6850 llvmpipe: Fix alpha-to-coverage without dithering
Implementing alpha-to-coverage dithering broke the non-dithering case.
(Discovered by accident, not really a big deal since it's almost always
enabled and can only be disabled by using a Nvidia GL extension, and
can't be disabled with Vulkan.)

Fixes: ad4635d6ef
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33586>
(cherry picked from commit 61911b6a4b)
2025-02-21 17:06:22 +01:00
Juan A. Suarez Romero
651b27cffa broadcom/simulator: use string copy instead of memcpy
Using memcpy with the max size generates a global-buffer-overflow, as
the performance counter strings are smaller than the max size.

Instead, use a string copy function to get a copy.

This was detected with address sanitizer enabled and running vulkaninfo.

Fixes: 3e8b2fe053 ("broadcom/simulator: Add DRM_IOCTL_V3D_GET_COUNTER to simulator")
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33627>
(cherry picked from commit 2d91798561)
2025-02-21 17:05:49 +01:00
Juan A. Suarez Romero
85fb4b4d9b v3dv: duplicate key for texel_buffer cache
We can't use the local variable key to insert in the hashtable, as the
key needs to be persistent for future searches.

This makes a copy of the key in the pipeline, which is kept persistent
in the hashtable.

This fixes a stack-buffer-overflow.

Backport-to: 25.0
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33630>
(cherry picked from commit eb8017ca68)
2025-02-21 17:04:56 +01:00
Pierre-Eric Pelloux-Prayer
ace414493d mesa/st: call _mesa_glthread_finish before _mesa_make_current
_mesa_make_current will use st_flush(ctx) to execute pending
commands before switching to the new context.

Since we can't have multiple threads using a pipe_context at
the same time, we must finish glthread to avoid having the
unmarshalling thread executing at the same time.

It's fixing random crashes where a thread would do:
  st_destroy_context ->
      _mesa_make_current ->
          st_glFlush(save_ctx) ->
            tc_execute_batch
While there's a glthread unmarshalling thread that's still
adding commands to TC.

Fixes: 08d97aadd1 ("st/mesa: fix texture deletion context mix-up issues (v2)")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33552>
(cherry picked from commit f062c83f3a)
2025-02-21 17:04:55 +01:00
Pierre-Eric Pelloux-Prayer
d2c5c71775 tc: add missing TC_SENTINEL for TC_END_BATCH
Fixes: c2983d93da ("gallium/u_threaded: use TC_END_BATCH to terminate the loop")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33552>
(cherry picked from commit a893a87625)
2025-02-21 17:04:53 +01:00
Samuel Pitoiset
5b8b81618e radv/video: fix adding the query pool BO to the cmdbuf list
Video queries work differently but the BO still needs to be added to the
cmdbuf list.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33620>
(cherry picked from commit 5ba10cc57f)
2025-02-21 17:04:51 +01:00
Iago Toral Quiroga
5e76850ce3 pan/va: fix FAU validation
Validation checked that an instruction accessing FAU RAM touched only
one 64-bit slot, and that an instruction accessing a FAU special value
touched only one. However, it did not check whether both RAM and a
special value were used, which is only allowed in messaging
instructions other than ATEST and BLEND.
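
The three conditions can be written down as a small Python predicate. This is only a sketch of the rule as stated in this message, not the pan/va validator; the function name, its parameters, and the way an instruction is described are all assumptions.

```python
def fau_access_valid(ram_slots, special_values, is_message, opcode):
    """Check the FAU rules described above for a single instruction.

    ram_slots:      64-bit FAU RAM slots read by the instruction
    special_values: FAU special values read by the instruction
    is_message:     whether the instruction is a messaging instruction
    opcode:         opcode name, used only for the ATEST/BLEND exception
    """
    if len(set(ram_slots)) > 1:
        return False  # more than one 64-bit RAM slot
    if len(set(special_values)) > 1:
        return False  # more than one special value
    if ram_slots and special_values:
        # Mixing RAM and special is reserved for messaging instructions,
        # with ATEST and BLEND excluded.
        return is_message and opcode not in ("ATEST", "BLEND")
    return True
```

The last branch is the check that was missing: an ALU instruction reading one RAM slot and one special value passed the first two tests and was wrongly accepted.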

Fixes Piglit:
spec/ati_fragment_shader/ati_fragment_shader-render-ops/mov c0.r

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Fixes: fd1906afea ("pan/va: Add FAU validation")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33608>
(cherry picked from commit e504825813)
2025-02-21 17:04:49 +01:00
Lorenzo Rossi
7b38cf8b5e nvk: Fix MSAA sparse residency lowering crash
Previously deqp tests with *.multisampled_image_sparse_residency.* would
crash with "Unknown image intrinsic" because
nir_intrinsic_bindless_image_sparse_load was not handled in the lowering
code.

This commit handles MSAA sparse residency lowering as with other cases.

Signed-off-by: Lorenzo Rossi <snowycoder@gmail.com>
Fixes: 7604697ec6 ("nvk: Implement shaderStorageImageMultisample")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33625>
(cherry picked from commit bce9e851c6)
2025-02-21 17:04:46 +01:00
James Hogan
b34120348d mesa: Handle getting GL_MAX_VIEWS_OVR
Add support for GL_OVR_multiview's GL_MAX_VIEWS_OVR which can be
accessed with glGetIntegerv().

MaxViews is accessed via the hash table set up by get_hash_params.py as
a constant (MAX_VIEWS_OVR) using GL_MAX_VIEWS_OVR.

v2: Add this patch (thanks to Mike's guidance)
v3: Drop unnecessary enum size element in OVR_multiview.XML
v4: Switch to CONST(MAX_VIEWS_OVR) instead of gl_constants::MaxViews
    (Marek's suggestion)

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: James Hogan <james@albanarts.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32992>
(cherry picked from commit be106bd6c6)
2025-02-21 17:04:45 +01:00
James Hogan
d24eac8915 mesa: OVR_multiview framebuffer attachment parameters
Implement the OVR_multiview framebuffer attachment parameters in
get_framebuffer_attachment_parameter():
- GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_NUM_VIEWS_OVR: This reads the
  attachment's NumViews.
- GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_BASE_VIEW_INDEX_OVR: This reads the
  attachment's Zoffset, but only if NumViews is non-zero.

This allows apitrace (PR 937[1]) to show the correct layers for
multiview framebuffer attachment surfaces, as well as to show this
information in the framebuffer attachments state.

[1]: https://github.com/apitrace/apitrace/pull/937

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: James Hogan <james@albanarts.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32992>
(cherry picked from commit a282a130fb)
2025-02-21 17:04:44 +01:00
James Hogan
280d2fee72 mesa: Check views don't exceed GL_MAX_ARRAY_TEXTURE_LAYERS
The OVR_multiview spec specifies the INVALID_VALUE error to be generated
by FramebufferTextureMultiviewOVR if:
"- <texture> is a two-dimensional array texture and <baseViewIndex> +
   <numViews> is larger than the value of MAX_ARRAY_TEXTURE_LAYERS."

Implement this in check_multiview_texture_target(), similar to the test
in check_layer().
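
The spec condition quoted above amounts to a single comparison. A sketch in Python, with a hypothetical function name and the limit passed in as a plain integer:

```python
def multiview_layers_in_range(base_view_index, num_views, max_array_texture_layers):
    """Return False when FramebufferTextureMultiviewOVR should raise
    INVALID_VALUE because the views run past the array layer limit."""
    return base_view_index + num_views <= max_array_texture_layers
```

For example, with a limit of 2048 layers, two views starting at index 2046 fit exactly, while two views starting at index 2047 do not.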

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: James Hogan <james@albanarts.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32992>
(cherry picked from commit 60509e187f)
2025-02-21 17:04:43 +01:00
James Hogan
b0a3bb0d6d mesa: Handle GL_FRAMEBUFFER_INCOMPLETE_VIEW_TARGETS_OVR
The OVR_multiview spec adds the following condition for framebuffer
completeness:
  "The number of views is the same for all populated attachments.
  { FRAMEBUFFER_INCOMPLETE_VIEW_TARGETS_OVR }"

So add a condition to _mesa_test_framebuffer_completeness to check that
all attachments have identical NumViews. This avoids an infinite
recursion between zink_clear() and zink_clear_depth_stencil() in the
event of an incomplete FBO.
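
As an illustration of the added condition, here is a Python sketch that takes the NumViews value of each populated attachment. The function name and the string return values are assumptions for the example, not Mesa's API.

```python
def check_view_targets(num_views_per_attachment):
    """Return a completeness status for the given populated attachments.

    num_views_per_attachment: NumViews of every populated attachment.
    """
    if len(set(num_views_per_attachment)) > 1:
        return "FRAMEBUFFER_INCOMPLETE_VIEW_TARGETS_OVR"
    return "FRAMEBUFFER_COMPLETE"
```

A color attachment with two views combined with a depth attachment with zero views is reported as incomplete, so the clear paths are never entered with mismatched attachments.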

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: James Hogan <james@albanarts.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32992>
(cherry picked from commit 7819d322c4)
2025-02-21 17:04:42 +01:00
James Hogan
97eb3f5cc2 mesa: Consider NumViews to reuse FBO attachments
NumViews needs considering along with the other attachment data when
reusing a multiview framebuffer texture attachment (i.e. shared depth
and stencil texture).

The depth and stencil attachments should match in all respects including
NumViews before reusing the existing one, and NumViews should also be
copied when reusing.

This avoids an infinite recursion between zink_clear() and
zink_clear_depth_stencil() in the case of reuse of a multiview
depth/stencil attachment.

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Signed-off-by: James Hogan <james@albanarts.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32992>
(cherry picked from commit 65f18c4787)
2025-02-21 17:04:41 +01:00
Eric Engestrom
5c84fff059 .pick_status.json: Update to b331713f20 2025-02-21 17:04:38 +01:00
Eric Engestrom
64552db2f8 docs: add sha sum for 25.0.0 2025-02-19 17:20:22 +01:00
Eric Engestrom
4fa244fddf VERSION: bump for 25.0.0 2025-02-19 15:57:11 +01:00
Eric Engestrom
45be2424ec docs: add release notes for 25.0.0 2025-02-19 15:57:10 +01:00
Pierre-Eric Pelloux-Prayer
e4831adc20 radeonsi: disable dcc when external shader stores are used
See comment.

Fixes: 666a6eb871 ("radeonsi/gfx12: disable display dcc for front buffer rendering")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12552
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33469>
(cherry picked from commit 6b20b06584)
2025-02-19 14:18:30 +01:00
Samuel Pitoiset
ef610a0d25 radv: fix adding the BO for unaligned SDMA copies to the cmdbuf list
It shouldn't be only added at creation time.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33600>
(cherry picked from commit efa23ef664)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
52e3f30992 nvk: Use suld.constant for EDB uniform texel buffers
In 2183bc73a6 ("nvk: Use suld for EDB uniform texel buffers"), we
started using suld instead of tld for EDB uniform texel buffers because
we needed it for correctness.  However, it's slow as mud.  Using
suld.constant seems to fix the performance regression.  I don't know if
it's quite tld performance, but it's close.

Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33612>
(cherry picked from commit eb27cbf25a)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
f18483d265 nak: Use suld.constant when ACCESS_CAN_REORDER is set
This is way faster than suld.sys, which is what we're using today.  So
far I haven't seen it matter for anything but texel buffers but it
likely helps some app somewhere.

Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33612>
(cherry picked from commit ffdc0d8e98)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
b90c99c3dc nvk: Align UBO/SSBO addresses down rather than up
This should never happen as the client should always give us aligned
addresses.  However, in the off chance that it does, aligning down is
probably safer than aligning up as it won't cause the top end of the
range to increase and potentially fault.
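
A quick worked example of the difference, in Python. The 0x40 alignment, the addresses, and the assumption that the size is kept unchanged are made up for illustration and are not taken from the driver.

```python
def align_down(addr, alignment):
    """Round addr down to a multiple of alignment (a power of two)."""
    return addr & ~(alignment - 1)


def align_up(addr, alignment):
    """Round addr up to a multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


# A misaligned buffer: base 0x1004, size 0x100, so it ends at 0x1104.
BASE, SIZE, ALIGNMENT = 0x1004, 0x100, 0x40
END = BASE + SIZE

end_after_down = align_down(BASE, ALIGNMENT) + SIZE  # 0x1100, inside the buffer
end_after_up = align_up(BASE, ALIGNMENT) + SIZE      # 0x1140, past the buffer
```

Aligning down moves the start to 0x1000 and keeps the end at or below 0x1104, whereas aligning up moves the start to 0x1040 and pushes the end to 0x1140, beyond what the client provided.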

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33610>
(cherry picked from commit 5762586c6d)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
1a911f3d75 nvk: Use suld for EDB uniform texel buffers
The tricks we play for texel buffers with VK_EXT_descriptor_buffer don't
work with tld with very large buffers.  suld, on the other hand, doesn't
seem to have these limitations.

Fixes: 3b94c5c22a ("nvk: Lower descriptors for VK_EXT_descriptor_buffer buffer views")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33610>
(cherry picked from commit 2183bc73a6)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
1d6206a82c nak: Handle sparse texops with unused color destinations
Fixes: b17f139281 ("nak: Wire up sparse residency for texture ops")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33610>
(cherry picked from commit 1c7a4c4f38)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
6482efdaba nvk: Allow sparse loads on EDB buffers
Fixes: 3b94c5c22a ("nvk: Lower descriptors for VK_EXT_descriptor_buffer buffer views")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33610>
(cherry picked from commit 0ec760af66)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
b948e3f3a6 nvk: Handle shader==NULL in nvk_cmd_upload_qmd()
We can theoretically hit this if CmdProcessGeneratedCommandsEXT is
called with a state command buffer that doesn't have compute shader set
if execute commands bind a shader.  We do, however, need to still call
nvk_cmd_upload_qmd() because it also uploads push constants and we need
those regardless of whether or not there's a shader bound.

Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33610>
(cherry picked from commit ca06a57702)
2025-02-19 14:18:30 +01:00
Faith Ekstrand
3445cf4f96 nvk: Pull shaders from the state command buffer in nvk_cmd_process_cmds()
Found by the VKD3D test suite.

Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33610>
(cherry picked from commit 39ae06e153)
2025-02-19 14:18:30 +01:00
Eric Engestrom
8154790767 .pick_status.json: Update to 6b20b06584 2025-02-19 14:18:27 +01:00
Samuel Pitoiset
a026515817 radv: add initial DCC support on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33255>
(cherry picked from commit 9af11bf306)
2025-02-19 13:16:03 +01:00
Samuel Pitoiset
ceaf6b2231 ac/gpu_info: add gfx12_supports_dcc_write_compress_disable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33255>
(cherry picked from commit 827cef7f7f)
2025-02-19 13:16:03 +01:00
Samuel Pitoiset
9b60c38646 ac,radv,radeonsi: add new GFX12_DCC_WRITE_COMPRESS_DISABLE tiling flag
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33301>
(cherry picked from commit 9993f3dd6a)
2025-02-19 13:16:03 +01:00
Georg Lehmann
f5bace5bf6 nir: fix frsq range analysis
Foz-DB Navi21:
Totals from 98 (0.12% of 79377) affected shaders:
Instrs: 157311 -> 157675 (+0.23%); split: -0.03%, +0.26%
CodeSize: 844296 -> 846648 (+0.28%); split: -0.00%, +0.28%
Latency: 1275467 -> 1276259 (+0.06%); split: -0.00%, +0.06%
InvThroughput: 266980 -> 267098 (+0.04%); split: -0.03%, +0.07%
Copies: 11094 -> 11093 (-0.01%)
PreVGPRs: 5945 -> 5977 (+0.54%)
VALU: 110585 -> 110953 (+0.33%); split: -0.04%, +0.38%
SALU: 18481 -> 18476 (-0.03%)

Cc: mesa-stable

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33557>
(cherry picked from commit 81b4629636)
2025-02-18 22:46:11 +01:00
Georg Lehmann
5c65587861 nir: fix range analysis for frcp
Foz-DB Navi21:
Totals from 448 (0.56% of 79377) affected shaders:
Instrs: 669306 -> 669318 (+0.00%); split: -0.00%, +0.00%
CodeSize: 3736580 -> 3738840 (+0.06%); split: -0.00%, +0.06%
Latency: 5860916 -> 5860961 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 1344094 -> 1344135 (+0.00%); split: -0.00%, +0.00%
VClause: 13878 -> 13879 (+0.01%)
Copies: 58538 -> 58532 (-0.01%)
VALU: 479807 -> 479820 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33557>
(cherry picked from commit 25300ac18a)
2025-02-18 22:46:09 +01:00
Paulo Zanoni
d8ffce96d2 brw: increase brw_reg::subnr size to 6 bits
Since Xe2, the registers are bigger and even the instruction
structures got updated to have 6 bits.

The way I detected this issue was when I tried to use
src/intel/executor to add the following instruction:

    add(8)          g6.8<1>UD      g4<8,8,1>UD    0x00000008UD    { align1 WE_all 1Q I@1 };

Executor would read this and end up emitting an add with dst being
g6<1>UD instead of what we wanted. It turns out that inside
brw_gram.y, at dstoperand and dstoperandex we do:

    $$.subnr = $$.subnr * brw_type_size_bytes($4);

which would overflow subnr back to 0.

The overflow doesn't seem to be a problem with code we emit directly
(unlike the code we parse, like above) due to the fact that we seem to
treat Xe2 registers as smaller all the way until we call phys_nr() and
phys_subnr() during code generation. The phys_subnr() function can
generate a value that would overflow reg.subnr, but this value is
never written back to reg.subnr, it's just returned as an unsigned
int.

Fixes: e9f63df2f2 ("intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33539>
(cherry picked from commit 927d7b322b)
2025-02-18 22:46:08 +01:00
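Note on d8ffce96d2: the wrap-around described in that message can be reproduced with plain integer arithmetic. This is a minimal sketch, not Mesa code. It assumes the field was 5 bits wide before the change (consistent with the value 32 wrapping to 0) and that a UD element is 4 bytes; `store_subnr` is a hypothetical helper that only models storing a value in an unsigned bitfield.

```python
def store_subnr(subnr_bytes: int, field_bits: int) -> int:
    """Model writing a byte offset into an unsigned bitfield of field_bits bits."""
    return subnr_bytes & ((1 << field_bits) - 1)


UD_SIZE_BYTES = 4      # assumed size of one UD element
subnr_elements = 8     # the ".8" in g6.8<1>UD

# Mirrors: $$.subnr = $$.subnr * brw_type_size_bytes($4);
subnr_bytes = subnr_elements * UD_SIZE_BYTES

narrow = store_subnr(subnr_bytes, 5)  # assumed old width: wraps around
wide = store_subnr(subnr_bytes, 6)    # new width: value is preserved

print(f"5-bit field: {narrow}, 6-bit field: {wide}")
```

With the narrower field the destination degrades to g6<1>UD, which matches the symptom reported above.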
Tapani Pälli
3194cae6d0 anv: apply cache flushes on pipeline select with gfx20
This fixes rendering artifacts seen with Hogwarts Legacy and Black
Myth Wukong. Assumption is that we can get rid of these flushes once
RESOURCE_BARRIER work lands but until then we need them.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12540
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12489
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33397>
(cherry picked from commit 765f3b78d5)
2025-02-18 22:46:07 +01:00
David Rosca
19e2eed688 radv/video: Move IB header from begin/end to encode_video
For decode this is also done in decode_video.

This breaks if app doesn't call vkCmdEncodeVideoKHR before end, eg:

  vkCmdBeginVideoCodingKHR
  vkCmdControlVideoCodingKHR
  vkCmdEndVideoCodingKHR

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33582>
(cherry picked from commit ebd8893710)
2025-02-18 22:46:05 +01:00
David Rosca
70bb670e9f radv/video: Fix setting balanced preset for HEVC encode with SAO enabled
FW disables SAO in speed preset, so we need to switch to balanced.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12615
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33582>
(cherry picked from commit 77ff18aa3b)
2025-02-18 22:46:03 +01:00
Samuel Pitoiset
27b7056835 radv: fix adding the VRS image BO to the cmdbuf list on GFX11
This might cause random faults.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33584>
(cherry picked from commit e0070bc68b)
2025-02-18 22:46:02 +01:00
Tapani Pälli
961a3fc760 anv: tighten condition for changing barrier layouts
The assertion (or the attempted layout change) causes a crash when
launching Steel Rats. Tighten the condition for the change so that it
only applies when the runtime has made changes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12602
Fixes: eed788213b ("anv: ensure consistent layout transitions in render passes")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33523>
(cherry picked from commit d8381415a6)
2025-02-18 22:46:01 +01:00
Faith Ekstrand
0cef98b71a nvk: Implement descriptorBufferPushDescriptors
The only thing we really need to do here is to make sure we don't try
to use the EDB path for push descriptors since those aren't really
descriptor buffers.

Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33589>
(cherry picked from commit 86e217e7df)
2025-02-18 22:45:55 +01:00
Eric Engestrom
b6ffd0cd80 .pick_status.json: Update to 56aac9fdec 2025-02-18 22:44:11 +01:00
Danylo Piliaiev
dc633a3560 tu: Handle mismatched mutability when resolving from GMEM
Apparently fast path cannot handle mismatched mutability and we
should use CP_BLIT which has SP_PS_2D_SRC_INFO.MUTABLEEN to signal
src mutability. Previously it was partially handled by
tu_attachment_store_mismatched_swap.

Fixes: a104a7ca1a ("tu: Handle non-identity GMEM swaps when resolving")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>
(cherry picked from commit 97f851e7c5)
2025-02-17 20:13:17 +01:00
Danylo Piliaiev
05f1528235 tu: Get correct src view when storing gmem attachment
Fixes: a104a7ca1a ("tu: Handle non-identity GMEM swaps when resolving")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>
(cherry picked from commit bdf0f61d4a)
2025-02-17 20:13:17 +01:00
Faith Ekstrand
eae4213ccb nvk: Respect VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_EXT
This is part of VK_EXT_mutable_descriptor_type but we never did anything
with it.  Since we use local memory for descriptor sets, copying from
them means reading VRAM through a WC map and it's pretty expensive.
Using malloc() for HOST_ONLY should be a nice perf boost for things
which give us the hint.

This massively improves the performance of Dragon Age: The Veilguard,
taking it from 7 FPS to 25 FPS on an RTX 4060.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12622
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33571>
(cherry picked from commit 607686f6bf)
2025-02-17 20:13:17 +01:00
Faith Ekstrand
a21604ce78 nvk: Rename nvk_descriptor_set::mapped_ptr
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33571>
(cherry picked from commit b8508726f0)
2025-02-17 20:13:17 +01:00
Yiwei Zhang
8ef1017e36 venus: fix maintenance5 props init and create flags2
More gaps were found in the prior maint5 support. This change properly
initializes the maint5 props and fixes the new
VkPipelineCreateFlags2CreateInfo integration.

Verified with dEQP-VK.*maintenance5*

Fixes: be6fece6e1 ("venus: enable VK_KHR_maintenance5")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33566>
(cherry picked from commit 8741be3365)
2025-02-17 20:13:17 +01:00
Eric Engestrom
af2b8d745f .pick_status.json: Update to 2361ed27f3 2025-02-17 20:13:08 +01:00
Konstantin Seurer
81fe589ccb gallivm: Remove loop limiting
This is not conformant and it can cause hard to debug issues or hide
existing bugs. Getting rid of this limit will allow lavapipe to use the
common bvh building framework since the ploc build shader has a loop
that waits to start the next phase.

cc: mesa-stable

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31426>
(cherry picked from commit ac0f643d4b)
2025-02-15 20:10:09 +01:00
Roland Scheidegger
e1f713bf63 llvmpipe: Fix overflow issues calculating loop iterations for aniso
iceil can return bogus (negative) values in case there's an overflow
(or a NaN). This would then take forever to run due to a couple billion
loop iterations.
Use unsigned minimum instead which will clamp iterations to max aniso
(not sure if that makes more sense than clamping negative values to 0,
probably doesn't really matter).

Fixes: 350a0fe632 ("llvmpipe: Use a simpler and faster AF implementation")

Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33537>
(cherry picked from commit 24076eb3f9)
2025-02-15 20:10:08 +01:00
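Note on e1f713bf63: the effect of an unsigned minimum on a bogus negative iteration count can be sketched as follows. This is an illustration only, not the gallivm code. `as_uint32` and `clamp_iterations_umin` are hypothetical names, and INT32_MIN is used as one example of what an overflowing float-to-int conversion may produce.

```python
def as_uint32(x: int) -> int:
    """Reinterpret a 32-bit signed value as unsigned (two's complement)."""
    return x & 0xFFFFFFFF


def clamp_iterations_umin(iterations_i32: int, max_aniso: int) -> int:
    """Unsigned minimum: negative inputs look huge and get clamped to max_aniso."""
    return min(as_uint32(iterations_i32), as_uint32(max_aniso))


INT32_MIN = -(2**31)  # example of a bogus result after overflow or NaN
MAX_ANISO = 16        # illustrative maximum anisotropy

print(clamp_iterations_umin(5, MAX_ANISO))          # in-range counts pass through
print(clamp_iterations_umin(INT32_MIN, MAX_ANISO))  # bogus count is clamped
```

A signed minimum would keep the negative value instead, so the choice of an unsigned comparison is what bounds the loop.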
Lorenzo Rossi
3106363a95 nvk: fix preprocess buffer alignment
Previously DGC alignment requirements declared by
getGeneratedCommandsMemoryRequirementsExt were not also reported by
getDeviceBufferMemoryRequirements for preprocess buffers.

This fixes 1554 dEQP-VK failures related to device-generated commands
that previously failed with "DGC alignment requirement larger than
preprocess buffer alignment requirement".

Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Reviewed-by: Faith Ekstrand <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33555>
(cherry picked from commit cc30e35306)
2025-02-15 20:06:24 +01:00
Eric Engestrom
85f4342382 .pick_status.json: Update to 06d8afff64 2025-02-15 20:05:34 +01:00
Samuel Pitoiset
7e54da043a radv/meta: disable conditional rendering for fill/update buffer operations
These commands shouldn't be affected by conditional rendering, similar
to the copy buffer operation.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33511>
(cherry picked from commit 5684c1687c)
2025-02-15 00:11:30 +01:00
Simon Ser
b0a094edfa gbm: fix get_back_bo() failure with gbm_surface and implicit modifiers
Before 361f362258 ("dri: Unify createImage and
createImageWithModifiers"), gbm_surface_create_with_modifiers() would
fail with ENOSYS on drivers missing explicit modifiers support. After
that commit, it succeeds and fails later when it tries to allocate a
new back buffer.

Restore the previous behavior.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 105fcb9cfd ("dri: revert INVALID modifier special-casing")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12283
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32726>
(cherry picked from commit 5a19323d0e)
2025-02-15 00:03:01 +01:00
Erik Faye-Lund
e074dcbbbb panvk: report passing the VK CTS
This will be needed in order to check off passing the VK CTS properly.

Please note, this does *not* mean that we are formally conformant, only
that we have passed the VK CTS at least once. Those are not the same
thing.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33254>
(cherry picked from commit 2653a3988f)
2025-02-15 00:02:59 +01:00
Eric R. Smith
3a9d9099d4 panfrost: fix backward propagation of values in loops
bi_opt_mod_prop_backward tries to propagate values backwards, but
stops checking for uses when it reaches the SSA definition. For
ordinary blocks that's fine, but for loops the definition can come
after a PHI that uses the value. This causes incorrect code to be
generated in shaderdb test `shaders/skia/2134.shader_test`. Fix this
by special casing PHI instructions, in a manner similar to what is done
in asahi/compiler/agx_optimizer.c.

This bug has been present a long time, so we want it back-ported to
stable.

Cc: mesa-stable
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33483>
(cherry picked from commit 18a14c4522)
2025-02-15 00:02:58 +01:00
Yiwei Zhang
5f2343889d venus: fix to handle pipeline flags2 from maint5
Fixes: be6fece6e1 ("venus: enable VK_KHR_maintenance5")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33520>
(cherry picked from commit a7fccbbf85)
2025-02-15 00:02:56 +01:00
Lionel Landwerlin
e2232c0be4 anv: ensure Wa_16012775297 interacts correctly with Wa_18020335297
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: dddd765553 ("anv: implement VF_STATISTICS emit for Wa_16012775297")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>
(cherry picked from commit 6b99bf76ca)
2025-02-15 00:02:54 +01:00
Lionel Landwerlin
399de9dd00 anv: disable VF statistics for memcpy
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>
(cherry picked from commit 462d8e3fab)
2025-02-15 00:02:53 +01:00
Eric R. Smith
b0891768d5 panfrost: fix YUV center information for 422
It turns out that the change from CENTER_Y to CENTER_X for
422 YUV didn't actually happen until generation 14 of the
hardware, not generation 10 as some documents claimed. This
fixes the failing piglit tests ext_image_dma_buf_import-sample_yuv
associated with 422 formats (which apparently we aren't running on CI).

Fixes: 23aa784c
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33516>
(cherry picked from commit c7fed8b053)
2025-02-15 00:02:52 +01:00
Eric Engestrom
df3ad61978 .pick_status.json: Update to a9b6a54a8c 2025-02-15 00:02:27 +01:00
Eric Engestrom
90e72c54d8 ci/yaml-toml-shell-py-test: run on direct push pipelines
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33412>
(cherry picked from commit 7b018945e8)
2025-02-12 19:52:38 +01:00
Eric Engestrom
b01077c27a ci/yaml-toml-shell-py-test: don't run on post-merge pipelines
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33412>
(cherry picked from commit c8ad134d46)
2025-02-12 19:52:32 +01:00
Eric Engestrom
58540dd004 ci: debian-testing-ubsan is used by tests
Fixes: 37ee035e42 ("ci/build: add ubsan build jobs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33509>
(cherry picked from commit e41438275e)
2025-02-12 19:52:27 +01:00
Eric Engestrom
7445240551 .pick_status.json: Update to e41438275e 2025-02-12 19:52:21 +01:00
Eric Engestrom
3a8abfa39b VERSION: bump for 25.0.0-rc3 2025-02-12 17:04:29 +01:00
Eric Engestrom
7b1e97928c .pick_status.json: Mark 13e987669c as denominated 2025-02-12 12:59:42 +01:00
Mel Henning
017ea57804 driconf: force_vk_vendor on Deep Rock Galactic+NVK
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33502>
(cherry picked from commit f887ae2f3c)
2025-02-12 12:05:39 +01:00
Eric Engestrom
7e549546d4 ci: run containers builds on staging branches
Fixes: 7152f343d6 ("ci: only trigger the CI for release managers when pushing to staging branch")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33468>
(cherry picked from commit b08f9a2dbd)
2025-02-12 12:05:32 +01:00
Mike Blumenkrantz
a917c1f0bf zink: never try to oom flush during unsync texture upload
this is very broken

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33484>
(cherry picked from commit 52dfe1e955)
2025-02-12 12:05:31 +01:00
Mike Blumenkrantz
7bd126c0e6 zink: only enable unsynchronized_texture_subdata with HIC
this is otherwise useless

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33484>
(cherry picked from commit 2304078261)
2025-02-12 12:05:31 +01:00
David Rosca
0c862e61e2 radeonsi/uvd: Set correct chroma format for H264 decode
Fixes decoding monochrome (chroma_format_idc = 0).

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33396>
(cherry picked from commit 441252e9e1)
2025-02-12 12:05:30 +01:00
David Rosca
7ee94ef063 radeonsi/vcn: Set correct chroma format for H264 decode
Fixes decoding monochrome (chroma_format_idc = 0).

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33396>
(cherry picked from commit 110d406302)
2025-02-12 12:05:29 +01:00
David Rosca
326ea58650 frontends/vdpau: Set H264 chroma_format_idc
We don't get the actual value from VDPAU, so hardcode to 4:2:0.

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33396>
(cherry picked from commit c28702c35a)
2025-02-12 12:05:28 +01:00
Lionel Landwerlin
cef16493a8 anv,driconf: Add sampler coordinate precision workaround for Dynasty Warriors
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12584
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33488>
(cherry picked from commit 4864c0a5fc)
2025-02-12 12:05:27 +01:00
Boris Brezillon
35063c7764 panvk: Initialize device virtual address space after the VM creation
Make sure we're not lacking a lock/heap destroy when we fail to
create the VM.

Fixes: 53fb1d99ca ("panvk: Transition to explicit VA assignment on v10+")
Reported-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33471>
(cherry picked from commit 4ae12cc6ff)
2025-02-12 12:05:26 +01:00
Boris Brezillon
015835fc59 panvk/csf: Don't free the resources twice when init_render_desc_ringbuf() fails
init_queue() calls cleanup_queue() if anything fails in the middle, which
means finish_render_desc_ringbuf() will be automatically called if
init_render_desc_ringbuf() failed. Get rid of the error path and
return directly instead. The one exception we have is the dev_addr
allocation, which needs to be explicitly freed if an error occurs between
util_vma_heap_alloc() and pan_kmod_vm_bind().

Reported-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5544d39f44 ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33471>
(cherry picked from commit 5f3c6a0f27)
2025-02-12 12:05:25 +01:00
Faith Ekstrand
6d3863b41b nvk: Fix scissor bounds
This code is old, copied from the old nouveau GL driver.  As of Pascal,
we have 32k images so we need 32k scissors as well.  Use the
max_image_dimension() helper instead of hard-coding it.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33478>
(cherry picked from commit 6f64962f27)
2025-02-12 12:05:23 +01:00
Eric Engestrom
6285eefd88 .pick_status.json: Update to 30a3d567c8 2025-02-12 12:05:20 +01:00
Patrick Lerda
9fd4deead3 r600: fix r600_init_shader_caps() has_atomics issue
Indeed, has_atomics is not yet initialized at the time of the
call of r600_init_shader_caps(). This change fixes this issue.

For instance, this issue is triggered with
"piglit/bin/clearbuffer-depth-cs-probe -auto -fbo":
clearbuffer-depth-cs-probe: ../src/gallium/drivers/r600/evergreen_state.c:5039: evergreen_emit_atomic_buffer_setup: Assertion `resource' failed.
Aborted

Fixes: 7cd606f01b ("r600: add r600_init_screen_caps")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33438>
(cherry picked from commit bb44052ee9)
2025-02-11 18:07:31 +01:00
David Rosca
b86196935f radeonsi/video: Avoid stream handle duplicates in PID namespace
Add current time when generating the stream handle initial value.

When running inside PID namespace there can be multiple processes
in the system that will share the same PID and with current code
this could result in the same stream handle being used at the same
time from different processes.

This can easily happen with Flatpak when running two instances of the
same application - both processes will have the same PID and we
will use the same stream handles.

For older UVDs kernel will reject the CS if we use duplicated handles.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12575
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33374>
(cherry picked from commit fdf747af3a)
2025-02-11 18:05:33 +01:00
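Note on b86196935f: a rough sketch of why mixing the current time into the initial handle value helps when two processes share a PID. The mixing function below (XOR, truncated to 32 bits) and the name `initial_stream_handle` are assumptions for illustration; the commit message only states that the current time is added.

```python
def initial_stream_handle(pid: int, time_us: int) -> int:
    """Hypothetical initial value: combine PID and a timestamp into 32 bits."""
    return (pid ^ time_us) & 0xFFFFFFFF


# Two sandboxed instances of the same app can both see the same PID
# inside their own PID namespace.
pid_in_namespace = 2

# PID only: both processes start from the same value and collide.
pid_only_a = pid_in_namespace & 0xFFFFFFFF
pid_only_b = pid_in_namespace & 0xFFFFFFFF

# PID plus start time: different start times give different initial values.
handle_a = initial_stream_handle(pid_in_namespace, 1_000_000)
handle_b = initial_stream_handle(pid_in_namespace, 1_000_500)

print(pid_only_a == pid_only_b, handle_a == handle_b)
```

Two processes that sample exactly the same timestamp would still collide in this sketch, so it reduces the likelihood of duplicates rather than ruling them out.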
Ian Romanick
2ea6b340ac brw/copy: Fix handling of offset in extract_imm
The offset is measured in bytes. Some of the code here acted as though
it were measured in src.type units. Also modify the assertion to check
that all extracted bits come from data in the immediate value.

Fixes: 580e1c592d ("intel/brw: Introduce a new SSA-based copy propagation pass")
Fixes: da395e6985 ("intel/brw: Fix extract_imm for subregion reads of 64-bit immediates")

Yes, I missed this error *twice* in code review.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33049>
(cherry picked from commit ac4b93571c)
2025-02-11 18:05:27 +01:00
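Note on 2ea6b340ac: the difference between a byte offset and an offset in type units is easy to see on a 64-bit immediate. The function below is a simplified stand-in for illustration, not the brw implementation; its name matches the commit title but its signature and the little-endian byte numbering are assumptions.

```python
def extract_imm(imm: int, imm_size_bytes: int, offset_bytes: int, size_bytes: int) -> int:
    """Extract size_bytes starting at offset_bytes (byte 0 is the least significant)."""
    assert offset_bytes >= 0 and size_bytes > 0
    # All extracted bits must come from data actually present in the immediate.
    assert offset_bytes + size_bytes <= imm_size_bytes
    mask = (1 << (size_bytes * 8)) - 1
    return (imm >> (offset_bytes * 8)) & mask


imm = 0x1122334455667788

# Correct: offset 2 means "2 bytes in", so a 2-byte read yields bytes 2..3.
correct = extract_imm(imm, 8, offset_bytes=2, size_bytes=2)

# Mistake being illustrated: treating the offset as "2 elements of 2 bytes",
# which reads bytes 4..5 instead.
wrong = extract_imm(imm, 8, offset_bytes=2 * 2, size_bytes=2)

print(hex(correct), hex(wrong))
```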
Yiwei Zhang
415338d3e1 venus: use dedicated allocation for ANB image memory import
On most platforms, dedicated allocation is preferred for the dma-buf
import done by Venus. In special cases, this is required but missed so
far.

Cc: mesa-stable

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33434>
(cherry picked from commit d92f9c3d51)
2025-02-11 18:05:27 +01:00
Yiwei Zhang
f66772f1b1 venus: enable VK_EXT_external_memory_acquire_unmodified if needed
When used internally, we have to conditionally enable it behind the app.

Fixes: 969cb02de7 ("venus: chain VkExternalMemoryAcquireUnmodifiedEXT for wsi ownership transfers")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33419>
(cherry picked from commit 1d668233ba)
2025-02-11 18:05:26 +01:00
Iago Toral Quiroga
93d004ab64 v3dv: fix crash on 32-bit builds
Command buffer private object destroy callbacks receive a 64-bit integer so
their signature should respect that to avoid alignment issues when passing
pointers. This is the same thing we were already doing for color pipelines,
but now for D/S pipelines too.

Fixes crash on 32-bit build with:
dEQP-VK.synchronization2.op.single_queue.fence.write_clear_attachments_read_copy_image_to_buffer.image_128x128_d16_unorm

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33463>
(cherry picked from commit daa48cbaef)
2025-02-11 18:05:24 +01:00
Eric Engestrom
33065515bc .pick_status.json: Update to 18f0807408 2025-02-11 18:05:18 +01:00
Qiang Yu
88cd974aae radeonsi: fix GravityMark corruption when use aco
aco may use smem load for ssbo when possible.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12518
Cc: mesa-stable
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33440>
(cherry picked from commit ee9edd4625)
2025-02-10 11:50:51 +01:00
Qiang Yu
a9f218a966 radeonsi: fix has_non_uniform_tex_access info
Fixes: f859436b55 ("radeonsi: add has_non_uniform_tex_access shader info")
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33440>
(cherry picked from commit c805ea6792)
2025-02-10 11:50:50 +01:00
Mel Henning
92e02eebea nak/opt_copy_prop: Force alu src for IAdd2X/IAdd3X
Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33420>
(cherry picked from commit 48edb9cec2)
2025-02-10 11:50:48 +01:00
Mel Henning
ea52e480cb nak/opt_copy_prop: Add force_alu_src_type
This is just a code cleanup - it shouldn't change any shaders.

Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33420>
(cherry picked from commit 2fa557d29d)
2025-02-10 11:50:47 +01:00
Mel Henning
2583fde8bc nak/opt_copy_prop: Fix IAdd3 overflow check
Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33420>
(cherry picked from commit a5b267980a)
2025-02-10 11:50:46 +01:00
Rebecca Mckeever
6378d22e68 panvk: Fix assertion in is_disjoint()
We were not correctly following VUID-VkImageCreateInfo-format-01577:

If format is not a multi-planar format, and flags does not
include VK_IMAGE_CREATE_ALIAS_BIT, flags must not contain
VK_IMAGE_CREATE_DISJOINT_BIT.

Fixes: 412c2863 ("panvk: Enable multiplane images and image views")

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
(cherry picked from commit 2ddd021bae)
2025-02-10 11:50:45 +01:00
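Note on 6378d22e68: the rule quoted from VUID-VkImageCreateInfo-format-01577 can be written as a small predicate. This is a sketch of the rule itself, not of the panvk assertion. `disjoint_flag_is_valid` is a hypothetical name, and the flag values are the ones I understand vulkan_core.h to define.

```python
VK_IMAGE_CREATE_DISJOINT_BIT = 0x00000200
VK_IMAGE_CREATE_ALIAS_BIT = 0x00000400


def disjoint_flag_is_valid(is_multi_planar: bool, flags: int) -> bool:
    """Return True if flags satisfy the DISJOINT_BIT rule for this format."""
    has_alias = bool(flags & VK_IMAGE_CREATE_ALIAS_BIT)
    has_disjoint = bool(flags & VK_IMAGE_CREATE_DISJOINT_BIT)
    if not is_multi_planar and not has_alias:
        # Single-plane format without ALIAS_BIT: DISJOINT_BIT is not allowed.
        return not has_disjoint
    return True


# Single-plane format, DISJOINT without ALIAS: invalid.
print(disjoint_flag_is_valid(False, VK_IMAGE_CREATE_DISJOINT_BIT))
# Single-plane format, DISJOINT together with ALIAS: allowed by the rule.
print(disjoint_flag_is_valid(
    False, VK_IMAGE_CREATE_DISJOINT_BIT | VK_IMAGE_CREATE_ALIAS_BIT))
```

The second case is the one an assertion of the form "disjoint implies multi-planar" would wrongly reject.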
Rebecca Mckeever
c3afddf561 panvk: Allow a 32-bit binding value in desc id key and use 64-bit keys
Since the binding value can be any 32-bit number, we cannot assume that
it is <= 27 bits. We need 64-bit keys to accommodate a 32-bit binding.

This will also provide more bits to store the subdesc id, which will be
needed for multiplane texture and sampler descriptors.

Fixes: 7bea6f86 ("panvk: Overhaul the Bifrost descriptor set implementation")

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
(cherry picked from commit 9c4b530c49)
2025-02-10 11:50:44 +01:00
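Note on c3afddf561: a sketch of why a 27-bit binding field is not enough and how a 64-bit key leaves room for a full 32-bit binding plus extra fields. The bit layout below is entirely hypothetical (the commit message does not give the real one); `desc_key` and `truncate_binding_27` are illustrative names.

```python
def truncate_binding_27(binding: int) -> int:
    """Model the old assumption: only the low 27 bits of the binding survive."""
    return binding & ((1 << 27) - 1)


def desc_key(binding: int, desc_type: int, subdesc: int) -> int:
    """Hypothetical 64-bit key: [63:48] subdesc, [47:32] type, [31:0] binding."""
    assert 0 <= binding <= 0xFFFFFFFF
    assert 0 <= desc_type < (1 << 16)
    assert 0 <= subdesc < (1 << 16)
    return (subdesc << 48) | (desc_type << 32) | binding


binding_a = 1
binding_b = (1 << 27) + 1  # a valid 32-bit binding number

# With 27 bits, two distinct bindings become indistinguishable.
collides = truncate_binding_27(binding_a) == truncate_binding_27(binding_b)

# With the full 32-bit binding stored in a 64-bit key, they stay distinct.
distinct = desc_key(binding_a, 0, 0) != desc_key(binding_b, 0, 0)

print(collides, distinct)
```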
Pavel Ondračka
90c4d44969 i915: rework shader compile failures reporting
Report compile errors from create_fs_state instead of finalize_nir.
The current way is broken, since nir_to_tgsi is called in finalize_nir,
however it can't handle lowered IO.

Fixes: dae57e184a
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12373
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33341>
(cherry picked from commit 4d4a3a6d6b)
2025-02-10 11:37:28 +01:00
Marek Olšák
ef741dad68 gallium,st/mesa: allow reporting compile failures from create_vs/fs/.._state
This adds a proper interface for reporting shader compile failures.
They are propagated to the GLSL linker.

Reporting errors from finalize_nir will be deprecated.

Fixes: dae57e184a
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33341>
(cherry picked from commit dc1b719e1f)
2025-02-10 11:37:18 +01:00
Martin Roukala (né Peres)
d7e6adfa2c turnip/ci: re-introduce the multiviewport flakes
This is a partial revert of 5f3cad0026, as the commit did not
actually fix the flakes it claimed to fix.

Fixes: 5f3cad0026 ("tu: Add missing assignment to shared_viewport")
Suggested-by: @Valentine (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33446#note_2770035)
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33446>
(cherry picked from commit f3b1f5ba2c)
2025-02-10 11:37:10 +01:00
Martin Roukala (né Peres)
a112f94c76 ci/b2c: fix the S3 artifact for amd64 manual vk/gl
Fixes: 5b291c7ce6 ("ci: Move r300/nine/nvk builds out of critical path")
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33446>
(cherry picked from commit c63041c0ed)
2025-02-10 11:37:05 +01:00
Rebecca Mckeever
8eb769a540 util/hash_table: Add _mesa_hash_table_u64_replace()
This function updates the data of a u64 hash_table entry and is safe to
use inside a hash_table_u64_foreach() loop.

Fixes: 7bea6f86 ("panvk: Overhaul the Bifrost descriptor set implementation")

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32563>
(cherry picked from commit 1d0f44739d)
2025-02-10 11:36:56 +01:00
Ian Romanick
20f09fc0d4 crocus: Add missing nir_metadata_preserve in crocus_lower_storage_image_derefs
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Closes: #12589
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33450>
(cherry picked from commit 40948b9715)
2025-02-10 11:36:53 +01:00
Ian Romanick
1c8f1e820f iris: Add missing nir_metadata_preserve in iris_lower_storage_image_derefs
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 26a54ae4b2 ("iris: lower storage image derefs")
Closes: #12589
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33450>
(cherry picked from commit f2a01be57e)
2025-02-10 11:36:53 +01:00
Eric R. Smith
eabe6ec941 panfrost: avoid potential divide by 0 calculating timer_resolution
On armhf integer divide by 0 can raise SIGFPE, whereas on aarch64
it just returns 0. This has become an issue because the recently
added panfrost_init_screen_caps always calls pan_gpu_time_to_ns to
calculate caps->timer_resolution, whereas before we only called it
when PIPE_CAP_TIMER_RESOLUTION was queried, and only OpenCL
does that (and not always).

Fixes: 205669e3a9 ("panfrost: add panfrost_init_screen_caps")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33435>
(cherry picked from commit e550a3cab0)
2025-02-10 11:36:51 +01:00
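The divide-by-zero hazard described in the panfrost commit above can be illustrated with a small guard. This is only a sketch under stated assumptions: `ticks_to_ns_safe` and its parameters are hypothetical names, not the driver's actual code, and returning 0 for an unknown frequency is one possible policy (it mirrors the aarch64 behaviour the commit message mentions).

```c
#include <stdint.h>

/* Hypothetical helper, not the driver's code: convert a tick count to
 * nanoseconds without ever dividing by zero. */
static uint64_t
ticks_to_ns_safe(uint64_t ticks, uint64_t frequency_hz)
{
   /* On armhf an integer divide by 0 can raise SIGFPE, so bail out early
    * when the frequency is not known. */
   if (frequency_hz == 0)
      return 0;

   /* Note: this multiplication can overflow for very large tick counts;
    * the sketch ignores that case. */
   return (ticks * UINT64_C(1000000000)) / frequency_hz;
}
```

A caller that computes the value unconditionally at init time, as the commit describes, stays safe on both architectures with a guard of this kind.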
Erik Faye-Lund
4736448bde panvk: correct number of read bytes for dynamic buffers
This function takes the number of bytes, not number of entries. This
should hopefully fix start-up issues on Citra.

While we're at it, fixup the alignment of the line that writes the
bytes.

Fixes: 27beadcbdb ("panvk: Extend the shader logic to support Valhall")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12539
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33429>
(cherry picked from commit 2ae97a4eb6)
2025-02-10 11:36:50 +01:00
David Rosca
19f1546fb3 ac/vcn_dec: Fix AV1 film grain on VCN5
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33376>
(cherry picked from commit 62b0f84981)
2025-02-10 11:36:50 +01:00
Karol Herbst
741763f32c rusticl/mem: do not apply offset with in copy_image_to_buffer
The offset already gets applied when mapping the destination buffer, so we
ended up applying it twice.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33426>
(cherry picked from commit a2a3be3baa)
2025-02-10 11:36:49 +01:00
Samuel Pitoiset
1cf778e011 radv: fix fetching draw vertex data from counter buffers with transform feedback
counterOffset was just ignored and nobody noticed (missing VKCTS
coverage).

VGT_STRMOUT_DRAW_OPAQUE_OFFSET will do the computation in hw for us.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33407>
(cherry picked from commit 8625decbcc)
2025-02-10 11:36:48 +01:00
Benjamin ROBIN
cd1ec4d20e util/disk_cache: Do not try to delete old cache if cache is disabled
Prevent the following warning when not running as a normal user:
Failed to create /home for shader cache (Permission denied)---disabling

disk_cache_delete_old_cache() first creates the cache directory using
disk_cache_generate_cache_dir(). In mkdir_if_needed(), the stat() of
"/home" fails with "Permission denied" under some circumstances when
using Firefox.

Fixes: #12168
Fixes: c3bc6991d2 ("util/disk_cache: Delete the old multifile cache if using the default.")

Signed-off-by: Benjamin ROBIN <dev@benjarobin.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32886>
(cherry picked from commit 622f7407d7)
2025-02-10 11:36:46 +01:00
Eric Engestrom
01b75c49a3 ci: only trigger the CI for release managers when pushing to staging branch
The release branch contains only what was on the staging branch first,
so testing it again is a waste of resources.

To do this, we split the rule into specifically "default branch" and
"staging branch", and "release branch" gets dropped by virtue of no
longer being caught by any rule.

Cc: mesa-stable
Reviewed-by: Martin Roukala <None>
Reviewed-by: Dylan Baker <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33411>
(cherry picked from commit 7152f343d6)
2025-02-10 11:36:36 +01:00
Eric Engestrom
fc5cbf4bce ci: don't run on tag pipelines
It's too late to run all the tests by then; the release has already been
made based on the staging pipeline results.

Cc: mesa-stable
Reviewed-by: Martin Roukala <None>
Reviewed-by: Dylan Baker <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33409>
(cherry picked from commit 31f0a9be3f)
2025-02-10 11:35:39 +01:00
Eric Engestrom
be81537a63 llvmpipe/tests: include math.h for INFINITY
This might be the cause of #12557, but we should do this regardless.

Fixes: d366520e85 ("gallivm: fix rsqrt failures")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33424>
(cherry picked from commit e4551ac69e)
2025-02-10 11:35:38 +01:00
Erik Faye-Lund
3d18ce09de pan/ci: add fail from llvm 19 upgrade
This was missed while testing the LLVM 19 upgrade, because the
panfrost-t860-cl:arm64 job doesn't run pre-merge.

Fixes: 101065642d ("ci/debian: Upgrade Debian images to LLVM 19")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33425>
(cherry picked from commit a6e0492da1)
2025-02-10 11:35:32 +01:00
Karmjit Mahil
c081723541 loader/wayland: Fix missing timespec.h include
`loader_wayland_dispatch()` also makes use of `timespec` so we
need `timespec.h`. Otherwise it fails to build due to
`timespec_sub_saturate()` missing.

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Fixes: 90effcceab ("wsi/wayland: refactor wayland dispatch")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12580
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33423>
(cherry picked from commit 54928d643e)
2025-02-10 11:35:25 +01:00
Eric Engestrom
495b369693 .pick_status.json: Update to ee9edd4625 2025-02-10 11:35:17 +01:00
Eric Engestrom
78411d5666 gfxstream: mark unused variables as such
It's unclear to me whether this is dead code that should be removed or
dead code that should be used, so I just marked it as unused to remove
a few thousand warnings when compiling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33384>
(cherry picked from commit 93a720f81a)
2025-02-06 10:18:23 +01:00
Eric Engestrom
93dcef8408 gfxstream: use range variable for its intended purpose
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33384>
(cherry picked from commit b2b37cb1de)
2025-02-06 10:18:19 +01:00
Eric Engestrom
79e67633c9 gfxstream: drop dead variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33384>
(cherry picked from commit 96c183c759)
2025-02-06 10:18:14 +01:00
Eric Engestrom
d3e9c337fd gfxstream: fix signedness of shifts
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33384>
(cherry picked from commit 74d0a8cdd6)
2025-02-06 10:18:08 +01:00
Samuel Pitoiset
765cdedcd0 radv: fix adding the BO to cmdbuf list when starting conditional rendering
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33403>
(cherry picked from commit 9b827556f5)
2025-02-06 10:15:10 +01:00
Martin Roukala (né Peres)
440f3359a7 zink/ci: use the debian-built-testing for nvk
Fixes: 5b291c7ce6 ("ci: Move r300/nine/nvk builds out of critical path")
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33401>
(cherry picked from commit a55613ce8d)
2025-02-06 10:15:08 +01:00
Mike Blumenkrantz
78577b19bc radv: fix error reporting for VkExternalMemoryTypeFlagBitsKHR
wrong type name is confusing

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33323>
(cherry picked from commit ca8a740e3b)
2025-02-06 10:15:03 +01:00
Job Noorman
62dbbe79ec ir3: fix emitting descriptor prefetches at end of preamble
The fix in e7ac1094f6 to emit preamble defs in the correct block would
move the cursor of the builder that is later used to insert descriptor
prefetches, emitting them at the wrong place. Fix this by resetting the
cursor before emitting the prefetches.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: e7ac1094f6 ("ir3: rematerialize preamble defs in block dominated by sources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33399>
(cherry picked from commit 8404e7428b)
2025-02-06 10:15:00 +01:00
Eric Engestrom
7368b3f409 .pick_status.json: Mark 5f54beb307 as denominated 2025-02-06 10:14:45 +01:00
Eric Engestrom
36b67f71d5 docs/android: drop libglapi.so now that it's gone
Fixes: 44bda7c258 ("dri: put shared-glapi into libgallium.*.so")
Reviewed-by: Antonio Ospite <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33388>
(cherry picked from commit 4bbbbb96be)
2025-02-06 10:14:38 +01:00
Eric Engestrom
05d2f1c24a .pick_status.json: Update to fdaf7c7b96 2025-02-06 10:14:19 +01:00
Samuel Pitoiset
c00d4230ba radv: fix caching on-demand meta shaders
This switches to disk_cache instead of our own mechanism, which only
stored meta shaders when the logical device was destroyed.

Meta shaders are still stored separately from the application shaders
because they are common to all applications on a given GPU/Mesa version.
The default cache is 32MiB which should be large enough.

This fixes massive stuttering in FF7 Rebirth but all apps are
technically affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33405>
2025-02-06 07:33:37 +00:00
Samuel Pitoiset
f0a4a71b3a vulkan/runtime: allow to use a different disk cache
Instead of using the default one provided by the physical device.
This will be used by RADV to store meta shaders to a separate single
cache file.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33405>
2025-02-06 07:33:37 +00:00
Samuel Pitoiset
30e0d3da66 util/disk_cache: add a new helper to create a disk cache
This will be used by RADV to store the meta shaders to a separate
cache directory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33405>
2025-02-06 07:33:37 +00:00
Samuel Pitoiset
03c3250e04 radv/meta: stop using string keys also for DGC and query objects
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33405>
2025-02-06 07:33:37 +00:00
Samuel Pitoiset
5443c23983 radv/meta: add missing pipeline lookups
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33405>
2025-02-06 07:33:36 +00:00
Konstantin Seurer
662bcc8717 radv/meta: Stop using strings for meta keys
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33405>
2025-02-06 07:33:36 +00:00
Eric Engestrom
1d051e5cb1 VERSION: bump for 25.0.0-rc2 2025-02-05 18:42:06 +01:00
Jung-uk Kim
b38918d1b4 FreeBSD: Disable support for "-mtls-dialect" for FreeBSD
Clang 19 supports "-mtls-dialect=" but FreeBSD does not support "-mtls-dialect=gnu2".
Skip auto-detection for FreeBSD.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31931>
(cherry picked from commit f9fc7392fa)
2025-02-05 16:09:27 +01:00
Mary Guillemard
32f0add871 panvk: Disallow unknown GPU models early in physical device init
We rely on the panfrost_model details throughout the codebase; if the
model is not known, this is a problem.

As a result, we now disallow anything that isn't known, as we already do
on Gallium.

Fixes: c95ef9e323 ("panvk: Fix NULL deref on model name when device isn't supported")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Suggested-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit e3b8d1da6d)
2025-02-05 16:08:35 +01:00
Mary Guillemard
687790670f pan/decode: Fix indirect branch calculation for 64-bit
The enum variant for u64 was actually 32-bit, making all 64-bit operations
wrong.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 7d0dc3d30c ("pan/decode: Add a helper to print CS binaries without interpreting them")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit 7bb6ebe938)
2025-02-05 16:08:35 +01:00
Mary Guillemard
56233d338b pan/bi: Use 2D dimension with TEX_FETCH with CUBE on Valhall
TEX_FETCH doesn't have the CUBE dimension; this was working on v9 and
v10 but fails on Avalon.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: ce52b6d359 ("pan/bi: Rework indices for tex on Valhall")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit 135aeddc9b)
2025-02-05 16:08:34 +01:00
Mary Guillemard
45f57e0047 pan/bi: Fix invalid CLPER encoding
This src1 expects lanes, isn't widened, and has a size of 8-bit (5-bit on
Valhall, 4-bit on Avalon).

We also now disallow swizzle lowering on it (even on Bifrost).

Fixes: 316486dd9f ("pan/va: Add initial ISA.xml for Valhall")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit b00c09b920)
2025-02-05 16:08:34 +01:00
Mary Guillemard
ee5713a418 pan/bi: Remove shift lanes invalid encodings
We were wrongly defining values that select more than one byte.

The swizzle used for H01 was working fine for v9 and v10, but it
generates an invalid encoding on Avalon.

Fix this by using the B00 variant, as we are only using 8-bit sources.

Fixes: f45654af59 ("pan/va: Add packing routines")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit 637cb0a993)
2025-02-05 16:08:33 +01:00
Mary Guillemard
f5e6b891fa pan/bi: Properly encode LEA_BUF_IMM
We were hardcoding table 61 and index 0 for IDVS based usage and this
could have been misused.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: f45654af59 ("pan/va: Add packing routines")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit fbd5d58e36)
2025-02-05 16:08:33 +01:00
Mary Guillemard
fa03018d28 panfrost: Fix PROGRESS_LOAD destination register
The offset of dest should be 40, not 48.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 486c341769 ("panfrost: Add architecture description XML for v10")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit 38a3cd8c76)
2025-02-05 16:08:32 +01:00
Mary Guillemard
442c29633d panfrost: Fix group priorities in drm-shim
Those were supposed to use BITFIELD_BIT.

Fixes: 2237cff1af ("panfrost: Report default value for GROUP_PRIORITIES_INFO in drm-shim")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33371>
(cherry picked from commit 05c2abcfea)
2025-02-05 16:08:32 +01:00
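The drm-shim commit above says the group priorities "were supposed to use BITFIELD_BIT". The sketch below shows why a level index and a bit mask differ. The enum and function names are hypothetical, the macro is a local stand-in so the example compiles on its own, and OR-ing raw enum values is only one plausible shape for such a bug, not necessarily what the original code did.

```c
#include <stdint.h>

/* Local stand-in for a bit-from-index macro, defined here so the sketch is
 * self-contained. */
#define BITFIELD_BIT(b) (1u << (b))

/* Hypothetical priority levels, for illustration only. */
enum group_priority {
   GROUP_PRIORITY_LOW = 0,
   GROUP_PRIORITY_MEDIUM = 1,
   GROUP_PRIORITY_HIGH = 2,
};

/* Correct pattern: one bit per allowed priority level. */
static uint32_t
allowed_priorities_mask(void)
{
   return BITFIELD_BIT(GROUP_PRIORITY_LOW) |
          BITFIELD_BIT(GROUP_PRIORITY_MEDIUM) |
          BITFIELD_BIT(GROUP_PRIORITY_HIGH);
}

/* Buggy pattern: OR-ing the raw level indices. The result has only bits 0
 * and 1 set, so read as a mask it no longer advertises level 2. */
static uint32_t
allowed_priorities_raw(void)
{
   return GROUP_PRIORITY_LOW | GROUP_PRIORITY_MEDIUM | GROUP_PRIORITY_HIGH;
}
```

With three levels the two functions return different values, which is the kind of mismatch a consumer expecting a bit mask would trip over.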
Erik Faye-Lund
fa31c1f713 pan/ci: add flaky tests to the flake-list
These have been switching between failing and passing recently. Not
really sure what's going on here, but we don't want the CI to flip
randomly between failing and passing, so let's mark them as flakes.

Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33381>
(cherry picked from commit 4d86a1c928)
2025-02-05 16:08:31 +01:00
Erik Faye-Lund
00472fd105 panvk/ci: add back incorrectly removed crash
Turns out, this was only fixed on G610, not on G52.

Fixes: f93a48e4e3 ("panfrost: fix hang by using MALI_PIXEL_KILL_WEAK_EARLY in color preload")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33381>
(cherry picked from commit 6f70425ef5)
2025-02-05 16:08:31 +01:00
Lionel Landwerlin
cb0d551424 brw: fixup scoreboarding for find_live_channels
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32895>
(cherry picked from commit c08b437db7)
2025-02-05 16:08:29 +01:00
Qiang Yu
ebe6878a6a gallium: fix ddebug and noop screen caps init
Fixes: a036231c09 ("gallium: add u_init_pipe_screen_caps")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33176>
(cherry picked from commit 2af8172b62)
2025-02-05 16:08:24 +01:00
Qiang Yu
59865a1b1e lavapipe: fix min_vertex_pipeline_param
Fixes: d91a549b67 ("lavapipe: check all vertex-stages")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33176>
(cherry picked from commit 0f656756ec)
2025-02-05 16:08:23 +01:00
Iago Toral Quiroga
1579ff453e v3dv: fix missing access bit flag when checking for texel buffer reads
VK_ACCESS_2_SHADER_READ_BIT matches all types of reads from shaders,
texel buffers too.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33367>
(cherry picked from commit a6dc8fa426)
2025-02-05 16:08:18 +01:00
Eric Engestrom
6c580e547d .pick_status.json: Mark 39969409f6 as denominated 2025-02-05 16:08:06 +01:00
Martin Roukala (né Peres)
729f1b1112 ci: fix the artifact name
This probably has no effect on anything other than human-visible names,
but let's fix it anyway.

Fixes: ef3091736c ("ci: use CI_PROJECT_NAME for artifacts name")
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32927>
(cherry picked from commit 978c0989eb)
2025-02-05 16:05:42 +01:00
Eric Engestrom
52439657be .pick_status.json: Update to e192d7d615 2025-02-05 16:05:35 +01:00
Pavel Ondračka
84f297e9d1 i915/ci: use debian-build-testing instead of debian-testing
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33345>
(cherry picked from commit f7e5daaedd)
2025-02-04 21:10:16 +01:00
Valentine Burley
82b697ed69 amd/ci: Revert to 6.6 kernel on Raven
There's been a high number of GPU resets on Raven that amdgpu couldn't
recover from, leading to jobs timing out.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33317>
(cherry picked from commit 5c44d70684)
2025-02-04 21:10:10 +01:00
Erik Faye-Lund
5b1fc670a7 panvk: fix line-rasterization of bifrost
Vulkan defines the line rasterization to *always* use perpendicular
rather than aligned line ends (unless otherwise specified by
VK_EXT_line_rasterization). So let's remove the code that conditionally
sets the bit, we always want the default value (0) here.

It might seem confusing because we kinda named this field wrong. It's
really about perpendicular vs aligned line ends. That's a cleanup we
might want to deal with later, but deleting the assignment is sufficient
to fix this issue. This is also what we do for v10.

This was probably just copied from the Gallium-driver, where this logic
is more or less correct.

Fixes: d970fe2e9d ("panfrost: Add a Vulkan driver for Midgard/Bifrost GPUs")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33373>
(cherry picked from commit 1d64095410)
2025-02-04 20:47:26 +01:00
Karol Herbst
a1d5a8ea97 rusticl/kernel: call nir_lower_variable_initializers earlier
Fixes spirv_new spirv14_nonwriteable_decoration

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33327>
(cherry picked from commit 2f4931353f)
2025-02-04 20:47:26 +01:00
James Hogan
8d50d42514 mesa: Fix FramebufferTextureMultiviewOVR num_views check
The check in check_multiview_texture_target() whether numViews <= 0 (as
required by the OVR_multiview spec) is never triggered since it is only
called by frame_buffer_texture() when numviews > 1, as numviews of 0 is
passed in by non multiview FramebufferTexture functions. Such cases are
incorrectly treated as non-multiview attachments.

Tweak frame_buffer_texture() to take an extra bool argument "multiview"
to distinguish between a multiview call with numviews=0, and a
non-multiview call.

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Signed-off-by: James Hogan <james@albanarts.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33346>
(cherry picked from commit 7f493b45ae)
2025-02-04 20:47:26 +01:00
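The num_views commit above separates "multiview call with numviews=0" from "non-multiview call" by passing an explicit flag. A minimal sketch of that idea follows. `num_views_is_valid` is a hypothetical helper, not a Mesa function, and it only covers the lower bound mentioned in the commit message; any upper limit on the view count is left out.

```c
#include <stdbool.h>

/* Hypothetical validation helper: the explicit "multiview" flag, rather than
 * the value of num_views, decides whether the check applies. */
static bool
num_views_is_valid(bool multiview, int num_views)
{
   if (!multiview)
      return true; /* non-multiview entry points do not use num_views */

   /* A multiview attachment needs at least one view. */
   return num_views >= 1;
}
```

Without the flag, a caller could not tell a multiview request for zero views apart from an ordinary attachment, which is the misclassification the commit describes.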
James Hogan
def5f68269 mesa: Fix multiview attachment completeness check
Fix the FBO attachment completeness test to ensure that multiview
attachments have all views referring to layers in range of the
underlying texture.

The OVR_multiview spec states:
  Add the following to the list of conditions required for framebuffer
  attachment completeness in section 9.4.1 (Framebuffer Attachment
  Completeness):

  "If <image> is a two-dimensional array and the attachment
  is multiview, all the selected layers, [<baseViewIndex>,
  <baseViewIndex> + <numViews>), are less than the layer count of the
  texture."

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Signed-off-by: James Hogan <james@albanarts.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33346>
(cherry picked from commit 39491da1b6)
2025-02-04 20:47:26 +01:00
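The completeness rule quoted in the commit above requires every layer in [baseViewIndex, baseViewIndex + numViews) to be below the texture's layer count. The check can be sketched as follows. The function and parameter names are illustrative assumptions, not Mesa's, and the comparison is written with a subtraction so that the sum cannot wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when all selected layers
 * [base_view_index, base_view_index + num_views) exist in a texture with
 * layer_count layers. */
static bool
multiview_layers_in_range(uint32_t base_view_index, uint32_t num_views,
                          uint32_t layer_count)
{
   if (num_views > layer_count)
      return false;

   /* Equivalent to base_view_index + num_views <= layer_count, but safe
    * against unsigned overflow. */
   return base_view_index <= layer_count - num_views;
}
```

For a 4-layer array texture, base index 2 with 2 views is in range, while base index 3 with 2 views is not.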
James Hogan
fdb7f38da0 glsl: Expose gl_ViewID_OVR back to GLSL 1.30
OVR_multiview requires OpenGL 3.0, so expose gl_ViewID_OVR builtin back
to GLSL 1.30 on OpenGL.

v2: Minor whitespace fix

Fixes: 328c29d600 ("mesa,glsl,gallium: add GL_OVR_multiview")
Signed-off-by: James Hogan <james@albanarts.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33346>
(cherry picked from commit b774b615d2)
2025-02-04 20:47:26 +01:00
Pavel Ondračka
7d0081b108 ci: fix debian-build-testing BUILDTYPE
Fixes: 5b291c7ce6
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33368>
(cherry picked from commit 60e1bc55bf)
2025-02-04 20:47:26 +01:00
Eric Engestrom
e0039516fc .pick_status.json: Update to e49df902b4 2025-02-04 20:47:26 +01:00
Rebecca Mckeever
76fdc6dada pan/texture: Only use plane_chroma_2p for chroma planes
In a 3-plane uncompressed YUV surface, only the chroma planes should use
MALI_PLANE_TYPE_CHROMA_2P plane_type or set secondary_pointer.

Fixes: 144f9324a3 ("panfrost: prepare v9+ to support YUV sampling")

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33104>
(cherry picked from commit 58bd1356cc)
2025-02-04 20:47:26 +01:00
Rebecca Mckeever
d91b19ac13 pan/format: Use HW version to determine siting for YUV 422 formats
On v10, only YUV 420 formats support center_y or center siting.

On previous HW versions, YUV 422 formats support center_y siting but not
center_x or center siting.

Fixes: 83c76cceaf ("panfrost: advertise YUV formats for valhall")

Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33104>
(cherry picked from commit 23aa784c05)
2025-02-04 20:47:26 +01:00
Mike Blumenkrantz
1ea9e1e364 zink: guard rebar check against fallback heap detection
if there is no heap with device-local and host-visible, then
rebar cannot exist. the previous detection did not account for
the rebar heap using the device-local fallback, which of course
would have the same size as the device-local heap and pass the threshold
check

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33359>
(cherry picked from commit 3064bfc312)
2025-02-04 20:47:26 +01:00
Ernst Persson
26ad2f9149 intel/vulkan: Add bvh build dependency
Fixes: 41baeb3810 ("anv: Implement acceleration structure API")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12558
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33333>
(cherry picked from commit c64871accc)
2025-02-04 20:47:26 +01:00
Karol Herbst
de28085f27 rusticl/queue: check device error status
If the underlying GPU context hit any execution errors (e.g. it times out
or something) we want to report it to the application as well.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32929>
(cherry picked from commit 3129fd8dcf)
2025-02-04 20:47:26 +01:00
Karol Herbst
0b7bee3e09 rusticl/mesa: add PipeContext::device_reset_status
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32929>
(cherry picked from commit 2c52ddd1a6)
2025-02-04 20:47:26 +01:00
Karol Herbst
3aa3ec625d rusticl/mem: set bind flags for gl imports
We have to tell the driver how we want to use the resource.

Fixes: 2645003bdc ("rusticl: Create CL mem objects from GL")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33325>
(cherry picked from commit 46454f01d3)
2025-02-04 20:47:26 +01:00
Boris Brezillon
f2f488ced5 pan/decode: Fix the blend_count mask
The blend count field is 4 bits not 3 bits.

Fixes: f2740ac69c ("pan/decode: Add support for decoding CSF")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33321>
(cherry picked from commit 438652654b)
2025-02-04 20:47:26 +01:00
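The blend_count commit above comes down to a mask that was one bit too narrow. The sketch below shows the effect of a 3-bit mask versus a 4-bit mask. It assumes, purely for illustration, that the field sits in the low bits of a 32-bit word; the commit message does not state the field's position, and the names here are hypothetical.

```c
#include <stdint.h>

/* Mask covering the low "width" bits; valid for widths below 32. */
#define FIELD_MASK(width) ((1u << (width)) - 1u)

/* Hypothetical extraction helper: keep only the low "width" bits of word. */
static uint32_t
extract_low_bits(uint32_t word, unsigned width)
{
   return word & FIELD_MASK(width);
}
```

A stored count of 10 (binary 1010) reads back as 2 through a 3-bit mask and as 10 through a 4-bit mask, so any count of 8 or more would be decoded incorrectly with the narrower mask.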
Boris Brezillon
6911634820 panvk: Don't clobber registers if the render pass was suspended
Commit 2d3c50d484 ("panvk: Fix barriers in secondary cmdbufs w/o rp's")
started resetting the render flags we were relying on to decide to
clobber registers or not. Introduce a new field to restore that check.

Fixes: 2d3c50d484 ("panvk: Fix barriers in secondary cmdbufs w/o rp's")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33321>
(cherry picked from commit 127af6f38a)
2025-02-04 20:47:26 +01:00
Eric Engestrom
85bd87de30 .pick_status.json: Mark 0ee5015da4 as denominated 2025-02-04 20:47:26 +01:00
Mike Blumenkrantz
ab687c3983 zink: also refcount needs_present from frontbuffer flush
Fixes: 4b0f2d1a2b ("zink: refcount needs_present resource")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33324>
(cherry picked from commit 41296aab47)
2025-02-04 20:47:26 +01:00
Lars-Ivar Hesselberg Simonsen
c96c123114 panvk: Set missing shader_modifies_coverage flag
The shader_modifies_coverage-flag is currently not set for PanVK. This
might lead to issues down the line, so ensure it's set correctly.

Fixes: 5544d39f44 ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33300>
(cherry picked from commit 375116a3a0)
2025-02-04 20:47:26 +01:00
Lars-Ivar Hesselberg Simonsen
056775eb40 Revert "panfrost: fix hang by using MALI_PIXEL_KILL_WEAK_EARLY in color preload"
This reverts commit f93a48e4e3.

Backport-to: 25.0
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33300>
(cherry picked from commit 2c855c1f4c)
2025-02-04 20:47:26 +01:00
Lars-Ivar Hesselberg Simonsen
fbf86a1c11 Revert "panfrost: remove is_blit flag"
This reverts commit 6d6a43518a.

Backport-to: 25.0
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33300>
(cherry picked from commit 41cb2e73c2)
2025-02-04 20:47:26 +01:00
Lars-Ivar Hesselberg Simonsen
8379aef572 panfrost: Do not evaluate_per_sample for non-MSAA
Enabling evaluate_per_sample in non-MSAA cases might cause issues and
hangs for subsequent ZS cases.

Therefore, only enable the flag when MSAA is active.

Fixes: 26d339ef8a ("panfrost: Generate Valhall Malloc IDVS jobs")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33300>
(cherry picked from commit 46256f3e39)
2025-02-04 20:47:26 +01:00
Hyunjun Ko
cd4ffc319f anv: Fix to set CDEF flter flag correctly for AV1 decoding
and relevant tiny clean-up.

Fixes: 8432b8b282 ("anv: add initial support for AV1 decoding")

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33316>
(cherry picked from commit 52d9edbf05)
2025-02-04 20:47:26 +01:00
Pierre-Eric Pelloux-Prayer
efdd9452fe radeonsi: update si_need_gfx_cs_space upper bound
radeon_emit_alt_hiz_logic can add 8 extra dw per draw.

Fixes: cdecbee922 ("radeonsi/gfx12: adjust HiZ/HiS logic")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33288>
(cherry picked from commit b3f2435994)
2025-02-04 20:47:26 +01:00
Mike Blumenkrantz
3be9a52a1a zink: emit SpvCapabilityDemoteToHelperInvocation for IsHelperInvocation
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31228>
(cherry picked from commit b4f3136fea)
2025-02-04 20:47:26 +01:00
Tim Keller
845a60dc35 dril: Check for null config in dril_target.c
fixes: 06d417af

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33336>
(cherry picked from commit 4ecd183c56)
2025-02-04 20:47:26 +01:00
Eric Engestrom
66b260fb4f .pick_status.json: Update to 5b856a741d 2025-02-04 20:47:26 +01:00
Eric Engestrom
f43f541c71 [25.0-only] hk: comment out dead variable
Removing a warning during compilation.
2025-02-04 20:47:26 +01:00
Eric Engestrom
001a665ca3 VERSION: bump for 25.0.0-rc1 2025-01-30 21:17:34 +01:00
663 changed files with 64061 additions and 6039 deletions

View File

@@ -30,11 +30,15 @@ workflow:
# do not duplicate pipelines on merge pipelines
- if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS && $CI_PIPELINE_SOURCE == "push"
when: never
# tag pipelines are disabled as it's too late to run all the tests by
# then, the release has been made based on the staging pipelines results
- if: $CI_COMMIT_TAG
when: never
# merge pipeline
- if: &is-merge-attempt $GITLAB_USER_LOGIN == "marge-bot" && $CI_PIPELINE_SOURCE == "merge_request_event"
variables:
MESA_CI_PERFORMANCE_ENABLED: 1
VALVE_INFRA_VANGOGH_JOB_PRIORITY: "" # Empty tags are ignored by gitlab
CI_TRON_JOB_PRIORITY_TAG: "" # Empty tags are ignored by gitlab
JOB_PRIORITY: 75
# fast-fail in merge pipelines: stop early if we get this many unexpected fails/crashes
DEQP_RUNNER_MAX_FAILS: 40
@@ -53,7 +57,11 @@ workflow:
# Note: 0 = infinity = gitlab's job `timeout:` applies, which is 1h
BUILD_JOB_TIMEOUT_OVERRIDE: 0
# pipeline for direct pushes that bypassed the CI
- if: &is-direct-push $CI_PROJECT_NAMESPACE == "mesa" && $CI_PIPELINE_SOURCE == "push" && $GITLAB_USER_LOGIN != "marge-bot"
- if: &is-direct-push $CI_PROJECT_NAMESPACE == "mesa" && $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
variables:
JOB_PRIORITY: 70
# pipeline for direct pushes from release maintainer
- if: &is-staging-push $CI_PROJECT_NAMESPACE == "mesa" && $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME =~ /^staging\//
variables:
JOB_PRIORITY: 70
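
Reviewer note: the two push rules above can be read as a simple partition of push pipelines in the `mesa` namespace. The sketch below is only a reading aid, assuming nothing beyond the conditions visible in this hunk. The function name and the returned labels are hypothetical; the compared variables and the `^staging/` pattern are taken from the diff. GitLab evaluates these rules itself, and other rules in the file are not modelled here.

```python
import re

# Pattern taken from the rule in the diff: $CI_COMMIT_REF_NAME =~ /^staging\//
STAGING_REF = re.compile(r"^staging/")


def classify_push(namespace, pipeline_source, ref_name, default_branch):
    """Return a hypothetical label for the push rule a pipeline would match.

    Returns None when neither of the two push rules shown in the hunk applies.
    """
    if namespace != "mesa" or pipeline_source != "push":
        return None
    if ref_name == default_branch:
        return "direct-push"   # pushes that bypassed the CI
    if STAGING_REF.search(ref_name):
        return "staging-push"  # pushes from the release maintainer
    return None
```

Both labels correspond to `JOB_PRIORITY: 70` in the hunk; a ref that merely contains `staging/` somewhere after its first character does not match, because the pattern is anchored at the start.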
@@ -102,7 +110,7 @@ variables:
# Avoid the wall of "Unsupported SPIR-V capability" warnings in CI job log, hiding away useful output
MESA_SPIRV_LOG_LEVEL: error
# Default priority for non-merge pipelines
VALVE_INFRA_VANGOGH_JOB_PRIORITY: priority:low
CI_TRON_JOB_PRIORITY_TAG: ci-tron:priority:low
JOB_PRIORITY: 50
DATA_STORAGE_PATH: data_storage
@@ -248,6 +256,9 @@ include:
# Build everything after someone bypassed the CI
- if: *is-direct-push
when: on_success
# Build everything when pushing to staging branches
- if: *is-staging-push
when: on_success
# Build everything in scheduled pipelines
- if: *is-scheduled-pipeline
when: on_success
@@ -258,7 +269,7 @@ include:
.ci-deqp-artifacts:
artifacts:
name: "{CI_PROJECT_NAME}_${CI_JOB_NAME}"
name: "${CI_PROJECT_NAME}_${CI_JOB_NAME}"
when: always
untracked: false
paths:
@@ -284,11 +295,11 @@ make git archive:
# Compactify the .git directory
- git gc --aggressive
# Download & cache the perfetto subproject as well.
- rm -rf subprojects/perfetto ; mkdir -p subprojects/perfetto && curl https://android.googlesource.com/platform/external/perfetto/+archive/$(grep 'revision =' subprojects/perfetto.wrap | cut -d ' ' -f3).tar.gz | tar zxf - -C subprojects/perfetto
- rm -rf subprojects/perfetto ; mkdir -p subprojects/perfetto && curl --fail https://android.googlesource.com/platform/external/perfetto/+archive/$(grep 'revision =' subprojects/perfetto.wrap | cut -d ' ' -f3).tar.gz | tar zxf - -C subprojects/perfetto
# compress the current folder
- tar -cvzf ../$CI_PROJECT_NAME.tar.gz .
- ci-fairy s3cp --token-file "${S3_JWT_FILE}" ../$CI_PROJECT_NAME.tar.gz https://$S3_HOST/git-cache/$CI_PROJECT_NAMESPACE/$CI_PROJECT_NAME/$CI_PROJECT_NAME.tar.gz
- s3_upload ../$CI_PROJECT_NAME.tar.gz "https://$S3_HOST/git-cache/$CI_PROJECT_NAMESPACE/$CI_PROJECT_NAME/"
# Sanity checks of MR settings and commit logs
sanity:

View File

@@ -16,7 +16,7 @@
# We don't want to download any previous job's artifacts
dependencies: []
artifacts:
name: "{CI_PROJECT_NAME}_${CI_JOB_NAME}"
name: "${CI_PROJECT_NAME}_${CI_JOB_NAME}"
when: always
paths:
- _build/meson-logs/*.txt
@@ -72,6 +72,8 @@
optional: true
- job: debian-testing-asan
optional: true
- job: debian-testing-ubsan
optional: true
- job: debian-build-testing
optional: true
- job: debian-arm32
@@ -238,7 +240,6 @@ debian-build-testing:
extends: .meson-build
stage: build-for-tests
variables:
BUILDTYPE: debug
UNWIND: "enabled"
DRI_LOADERS: >
-D glx=dri
@@ -255,7 +256,7 @@ debian-build-testing:
-D gallium-rusticl=false
GALLIUM_DRIVERS: "i915,iris,nouveau,r300,r600,freedreno,llvmpipe,softpipe,svga,v3d,vc4,virgl,etnaviv,panfrost,lima,zink,d3d12,asahi,crocus"
VULKAN_DRIVERS: "intel_hasvk,imagination-experimental,microsoft-experimental,nouveau,swrast"
BUILD_TYPE: "debugoptimized"
BUILDTYPE: "debugoptimized"
EXTRA_OPTION: >
-D spirv-to-dxil=true
-D osmesa=true
@@ -297,6 +298,8 @@ shader-db:
paths:
- shader-db
timeout: 15m
tags:
- kvm # FIXME: this is a hack, should not be needed
# Test a release build with -Werror so new warnings don't sneak in.
debian-release:

View File

@@ -230,7 +230,7 @@ cleanup
# upload artifacts
if [ -n "$S3_RESULTS_UPLOAD" ]; then
tar --zstd -cf results.tar.zst results/;
ci-fairy s3cp --token-file "${S3_JWT_FILE}" results.tar.zst https://"$S3_RESULTS_UPLOAD"/results.tar.zst;
s3_upload results.tar.zst https://"$S3_RESULTS_UPLOAD"/
fi
# We still need to echo the hwci: mesa message, as some scripts rely on it, such

View File

@@ -7,7 +7,7 @@ set -o xtrace
# network transfer, disk usage, and runtime on test jobs)
# shellcheck disable=SC2154 # arch is assigned in previous scripts
if curl -X HEAD -s "${ARTIFACTS_PREFIX}/${FDO_UPSTREAM_REPO}/${ARTIFACTS_SUFFIX}/${arch}/done"; then
if curl --fail -X HEAD -s "${ARTIFACTS_PREFIX}/${FDO_UPSTREAM_REPO}/${ARTIFACTS_SUFFIX}/${arch}/done"; then
ARTIFACTS_URL="${ARTIFACTS_PREFIX}/${FDO_UPSTREAM_REPO}/${ARTIFACTS_SUFFIX}/${arch}"
else
ARTIFACTS_URL="${ARTIFACTS_PREFIX}/${CI_PROJECT_PATH}/${ARTIFACTS_SUFFIX}/${arch}"

View File

@@ -110,7 +110,7 @@ tar --zstd -cf "${ANDROID_LLVM_ARTIFACT_NAME}.tar.zst" "$LLVM_INSTALL_PREFIX"
# version does not change, and delete it.
# The file is not deleted for non-CI because it can be useful in local runs.
if [ -n "$CI" ]; then
ci-fairy s3cp --token-file "${S3_JWT_FILE}" "${ANDROID_LLVM_ARTIFACT_NAME}.tar.zst" "https://${S3_HOST}/${S3_ANDROID_BUCKET}/${CI_PROJECT_PATH}/${ANDROID_LLVM_ARTIFACT_NAME}.tar.zst"
s3_upload "${ANDROID_LLVM_ARTIFACT_NAME}.tar.zst" "https://${S3_HOST}/${S3_ANDROID_BUCKET}/${CI_PROJECT_PATH}/"
rm "${ANDROID_LLVM_ARTIFACT_NAME}.tar.zst"
fi

View File

@@ -25,11 +25,10 @@ if [ "${SKIP_UPDATE_FLUSTER_VECTORS}" != 1 ]; then
# Build fluster vectors archive and upload it
tar --zstd -cf "vectors.tar.zst" fluster/resources/
ci-fairy s3cp --token-file "${S3_JWT_FILE}" "vectors.tar.zst" \
"https://${S3_PATH_FLUSTER}/vectors.tar.zst"
s3_upload vectors.tar.zst "https://${S3_PATH_FLUSTER}/"
touch /lava-files/done
ci-fairy s3cp --token-file "${S3_JWT_FILE}" /lava-files/done "https://${S3_PATH_FLUSTER}/done"
s3_upload /lava-files/done "https://${S3_PATH_FLUSTER}/"
# Don't include the vectors in the rootfs
rm -fr fluster/resources/*

View File

@@ -10,7 +10,7 @@ uncollapsed_section_start piglit "Building piglit"
# DEBIAN_TEST_VK_TAG
# KERNEL_ROOTFS_TAG
REV="631b72944f56e688f56a08d26c8a9f3988801a08"
REV="68658566da1c9cd6a378b5ca36999617e26440e7"
git clone https://gitlab.freedesktop.org/mesa/piglit.git --single-branch --no-checkout /piglit
pushd /piglit

View File

@@ -19,7 +19,7 @@ git clone \
pushd /va-utils
# Too old libva in Debian 11. TODO: when this PR gets in, refer to the patch.
curl -L https://github.com/intel/libva-utils/pull/329.patch | git am
curl --fail -L https://github.com/intel/libva-utils/pull/329.patch | git am
meson setup build -D tests=true -Dprefix=/va ${EXTRA_MESON_ARGS:-}
meson install -C build

View File

@@ -12,7 +12,7 @@ case "${FDO_DISTRIBUTION_VERSION%-*},${LLVM_VERSION}" in
esac
if [ "$NEED_LLVM_REPO" = "true" ]; then
curl -s https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
curl --fail -s https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
export LLVM_APT_REPO="deb [trusted=yes] https://apt.llvm.org/${FDO_DISTRIBUTION_VERSION%-*}/ llvm-toolchain-${FDO_DISTRIBUTION_VERSION%-*}-${LLVM_VERSION} main"
echo "$LLVM_APT_REPO" | tee /etc/apt/sources.list.d/llvm.list
fi

View File

@@ -444,8 +444,7 @@ popd
. .gitlab-ci/container/container_post_build.sh
ci-fairy s3cp --token-file "${S3_JWT_FILE}" /lava-files/"${ROOTFSTAR}" \
https://${S3_PATH}/"${ROOTFSTAR}"
s3_upload /lava-files/"${ROOTFSTAR}" "https://${S3_PATH}/"
touch /lava-files/done
ci-fairy s3cp --token-file "${S3_JWT_FILE}" /lava-files/done https://${S3_PATH}/done
s3_upload /lava-files/done "https://${S3_PATH}/"

View File

@@ -28,9 +28,9 @@ variables:
DEBIAN_X86_64_TEST_ANDROID_IMAGE_PATH: "debian/x86_64_test-android"
DEBIAN_TEST_ANDROID_TAG: "20250130-vvless"
DEBIAN_TEST_GL_TAG: "20250130-vvless"
DEBIAN_TEST_VK_TAG: "20250130-vvless"
KERNEL_ROOTFS_TAG: "20250130-vvless"
DEBIAN_TEST_GL_TAG: "20250327-piglit-250"
DEBIAN_TEST_VK_TAG: "20250327-piglit-250"
KERNEL_ROOTFS_TAG: "20250327-trace-250"
DEBIAN_PYUTILS_IMAGE: "debian/x86_64_pyutils"
DEBIAN_PYUTILS_TAG: "20250129-lavacli"

View File

@@ -52,7 +52,7 @@ cp artifacts/ci-common/init-*.sh results/job-rootfs-overlay/
cp "$SCRIPTS_DIR"/setup-test-env.sh results/job-rootfs-overlay/
tar zcf job-rootfs-overlay.tar.gz -C results/job-rootfs-overlay/ .
ci-fairy s3cp --token-file "${S3_JWT_FILE}" job-rootfs-overlay.tar.gz "https://${JOB_ROOTFS_OVERLAY_PATH}"
s3_upload job-rootfs-overlay.tar.gz "https://${JOB_ARTIFACTS_BASE}"
# Prepare env vars for upload.
section_switch variables "Environment variables passed through to device:"

View File

@@ -162,6 +162,16 @@ class LAVAJobDefinition:
"minutes": 5
* NUMBER_OF_ATTEMPTS_LAVA_BOOT,
},
"uboot-action": {
# For rockchip DUTs, U-Boot auto-login action downloads the kernel and
# setup early network. This takes 72 seconds on average.
# The LAVA action that wraps it is `uboot-commands`, but we can't set a
# timeout for it directly, it is overridden by one third of `uboot-action`
# timeout.
# So actually, this timeout is here to enforce that `uboot-commands`
# timeout to be 100 seconds (300 sec / 3), which is more than enough.
"minutes": 5
},
},
},
}
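
Reviewer note: the arithmetic in the `uboot-action` comment above can be checked with a short sketch. The one-third ratio is the comment's own claim about LAVA and is treated here as an assumption; the helper name is hypothetical.

```python
def uboot_commands_timeout_seconds(uboot_action_minutes):
    """Effective `uboot-commands` timeout, assuming it is one third of `uboot-action`."""
    return uboot_action_minutes * 60 // 3


# Average duration of the U-Boot step on rockchip DUTs, as stated in the comment.
AVERAGE_UBOOT_SECONDS = 72

timeout = uboot_commands_timeout_seconds(5)
print(timeout)                          # 100
print(timeout > AVERAGE_UBOOT_SECONDS)  # True
```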

View File

@@ -68,7 +68,7 @@ EOF
ping -c 5 -w 60 $(lava-target-ip)
lava_ssh_test_case() {
set -x
set -ex
local test_case="${1}"
shift
lava-test-case \"${test_case}\" --shell \\
@@ -170,7 +170,7 @@ def generate_docker_test(
# maintainers with monitoring
f"lava_ssh_test_case '{args.project_name}_{args.mesa_job_name}' "
# Changing directory to /, as the HWCI_SCRIPT expects that
"'\"cd / && /init-stage2.sh\"'",
"'cd / && /init-stage2.sh'",
]
return init_stages_test

View File

@@ -13,7 +13,6 @@ set -ex
export PAGER=cat # FIXME: export everywhere
INSTALL=$(realpath -s "$PWD"/install)
S3_ARGS="--token-file ${S3_JWT_FILE}"
export PIGLIT_REPLAY_DESCRIPTION_FILE="$INSTALL/$PIGLIT_TRACES_FILE"
@@ -120,7 +119,7 @@ replay_s3_upload_images() {
fi
__S3_PATH="$PIGLIT_REPLAY_REFERENCE_IMAGES_BASE"
__DESTINATION_FILE_PATH="${line##*-}"
if curl -L -s -I "https://${__S3_PATH}/${__DESTINATION_FILE_PATH}" | grep -q "content-type: application/octet-stream" 2>/dev/null; then
if curl --fail -L -s -I "https://${__S3_PATH}/${__DESTINATION_FILE_PATH}" | grep -q "content-type: application/octet-stream" 2>/dev/null; then
continue
fi
else
@@ -128,8 +127,7 @@ replay_s3_upload_images() {
__DESTINATION_FILE_PATH="$__S3_TRACES_PREFIX/${line##*-}"
fi
ci-fairy s3cp $S3_ARGS "$RESULTS_DIR/$__PREFIX/$line" \
"https://${__S3_PATH}/${__DESTINATION_FILE_PATH}"
s3_upload "$RESULTS_DIR/$__PREFIX/$line" "https://${__S3_PATH}/${__DESTINATION_FILE_PATH%/*}"
done
}
@@ -169,7 +167,9 @@ rm -rf replayer-db
if [ -n "$PIGLIT_REPLAY_ANGLE_TAG" ]; then
ARCH="amd64"
FILE="angle-bin-${ARCH}-${PIGLIT_REPLAY_ANGLE_TAG}.tar.zst"
ci-fairy s3cp $S3_ARGS "https://s3.freedesktop.org/mesa-tracie-private/${FILE}" "${FILE}"
curl --location --fail --retry-all-errors --retry 4 --retry-delay 60 \
--header "Authorization: Bearer $(cat "${S3_JWT_FILE}")" \
"https://s3.freedesktop.org/mesa-tracie-private/${FILE}" --output "${FILE}"
mkdir -p replayer-db/angle
tar --zstd -xf ${FILE} -C replayer-db/angle/
fi

View File

@@ -53,7 +53,7 @@ if [ -n "$S3_ARTIFACT_NAME" ]; then
# Pass needed files to the test stage
S3_ARTIFACT_TAR="$S3_ARTIFACT_NAME.tar.zst"
tar cv artifacts/ | zstd -o "${S3_ARTIFACT_TAR}"
ci-fairy s3cp --token-file "${S3_JWT_FILE}" "${S3_ARTIFACT_TAR}" "https://${PIPELINE_ARTIFACTS_BASE}/${S3_ARTIFACT_TAR}"
s3_upload "${S3_ARTIFACT_TAR}" "https://${PIPELINE_ARTIFACTS_BASE}/"
rm "${S3_ARTIFACT_TAR}"
fi

View File

@@ -84,7 +84,7 @@ if [ -n "$S3_ARTIFACT_NAME" ]; then
# Pass needed files to the test stage
S3_ARTIFACT_NAME="$S3_ARTIFACT_NAME.tar.zst"
zstd --quiet --threads ${FDO_CI_CONCURRENT:-0} artifacts/install.tar -o ${S3_ARTIFACT_NAME}
ci-fairy s3cp --token-file "${S3_JWT_FILE}" ${S3_ARTIFACT_NAME} https://${PIPELINE_ARTIFACTS_BASE}/${S3_ARTIFACT_NAME}
s3_upload "${S3_ARTIFACT_NAME}" "https://${PIPELINE_ARTIFACTS_BASE}/"
fi
section_end prepare-artifacts

View File

@@ -140,5 +140,21 @@ function trap_err {
export -f error
export -f trap_err
s3_upload() {
x_off
local file=$1 s3_folder_url=$2
if [ ! -f "$file" ] || [[ "$s3_folder_url" != https://* ]]
then
echo "s3_upload used incorrectly: first argument is the file, second argument is the s3 folder url"
exit 1
fi
curl --fail --retry-all-errors --retry 4 --retry-delay 60 \
--header "Authorization: Bearer $(cat "${S3_JWT_FILE}")" \
-X PUT --form file=@"$file" \
"$s3_folder_url"
x_restore
}
export -f s3_upload
set -E
trap 'trap_err $?' ERR
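
Reviewer note: the new `s3_upload` helper takes a file and a destination folder URL, where the replaced `ci-fairy s3cp` calls took a full destination URL. The Python sketch below mirrors its argument check and the curl invocation it builds, so the contract is easy to see. It is an illustration, not part of the change: the function name is hypothetical, and the token is passed in as a parameter here, whereas the shell helper reads it from `${S3_JWT_FILE}`.

```python
import os


def s3_upload_command(file_path, s3_folder_url, token):
    """Build the curl argument list that the shell helper would run.

    Mirrors the shell check: the file must exist and the URL must start
    with https://, otherwise the call is rejected.
    """
    if not os.path.isfile(file_path) or not s3_folder_url.startswith("https://"):
        raise ValueError(
            "first argument is the file, second argument is the s3 folder url"
        )
    return [
        "curl", "--fail", "--retry-all-errors", "--retry", "4", "--retry-delay", "60",
        "--header", f"Authorization: Bearer {token}",
        "-X", "PUT", "--form", f"file=@{file_path}",
        s3_folder_url,
    ]
```

Swapping the two arguments fails the check, since a URL is not an existing file and a local path does not start with `https://`.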

View File

@@ -64,7 +64,8 @@ yaml-toml-shell-py-test:
- !reference [.disable-farm-mr-rules, rules]
- !reference [.never-post-merge-rules, rules]
- !reference [.no_scheduled_pipelines-rules, rules]
- if: $GITLAB_USER_LOGIN == "marge-bot"
# merge pipeline
- if: $GITLAB_USER_LOGIN == "marge-bot" && $CI_PIPELINE_SOURCE == "merge_request_event"
changes: &lint_files
- .gitlab-ci/test/gitlab-ci.yml
- .gitlab-ci/**/*.sh
@@ -74,6 +75,14 @@ yaml-toml-shell-py-test:
- .gitlab-ci/tests/**/*
- bin/ci/**/*
when: on_success
# direct pushes that bypassed the CI
- if: $CI_PROJECT_NAMESPACE == "mesa" && $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
changes: *lint_files
when: on_success
# direct pushes from release manager
- if: $CI_PROJECT_NAMESPACE == "mesa" && $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_REF_NAME =~ /^staging\//
changes: *lint_files
when: on_success
- changes: *lint_files
when: manual
tags:
@@ -135,6 +144,8 @@ yaml-toml-shell-py-test:
artifacts:
paths:
- results/
tags:
- kvm
.b2c-vkd3d-proton-test:
variables:
@@ -143,7 +154,7 @@ yaml-toml-shell-py-test:
.piglit-traces-test:
artifacts:
when: on_failure
name: "{CI_PROJECT_NAME}_${CI_JOB_NAME}"
name: "${CI_PROJECT_NAME}_${CI_JOB_NAME}"
reports:
junit: results/junit.xml
paths:
@@ -177,7 +188,7 @@ yaml-toml-shell-py-test:
- ./install/fossilize-runner.sh
artifacts:
when: on_failure
name: "{CI_PROJECT_NAME}_${CI_JOB_NAME}"
name: "${CI_PROJECT_NAME}_${CI_JOB_NAME}"
paths:
- results/
@@ -205,7 +216,7 @@ yaml-toml-shell-py-test:
BM_ROOTFS: /rootfs-${DEBIAN_ARCH}
artifacts:
when: always
name: "{CI_PROJECT_NAME}_${CI_JOB_NAME}"
name: "${CI_PROJECT_NAME}_${CI_JOB_NAME}"
paths:
- results/
- serial*.txt
@@ -399,7 +410,7 @@ yaml-toml-shell-py-test:
artifacts:
when: always
name: "{CI_PROJECT_NAME}_${CI_JOB_NAME}"
name: "${CI_PROJECT_NAME}_${CI_JOB_NAME}"
paths:
- results
reports:
@@ -425,6 +436,8 @@ yaml-toml-shell-py-test:
extends:
- .use-debian/x86_64_test-vk
- .b2c-x86_64-test
variables:
S3_ARTIFACT_NAME: "debian-build-testing"
needs:
- debian/x86_64_test-vk
- debian-build-testing
@@ -443,6 +456,8 @@ yaml-toml-shell-py-test:
extends:
- .use-debian/x86_64_test-gl
- .b2c-x86_64-test
variables:
S3_ARTIFACT_NAME: "debian-build-testing"
needs:
- debian/x86_64_test-gl
- debian-build-testing

View File

@@ -16,6 +16,8 @@ timeouts:
minutes: 1
depthcharge-action:
minutes: 15
uboot-action:
minutes: 5
actions:
- deploy:
timeout:
@@ -43,7 +45,8 @@ actions:
steps:
- cat Image.gz my_dtb_filename.dtb > Image.gz+dtb
- mkbootimg --kernel Image.gz+dtb --cmdline "root=/dev/nfs rw nfsroot=$NFS_SERVER_IP:$NFS_ROOTFS,tcp,hard,v3
ip=dhcp init=/init rootwait usbcore.quirks=0bda:8153:k" --pagesize 4096 --base 0x80000000 -o boot.img
ip=dhcp init=/init rootwait usbcore.quirks=0bda:8153:k" --pagesize 4096
--base 0x80000000 -o boot.img
namespace: dut
- deploy:
timeout:
@@ -118,7 +121,7 @@ actions:
ping -c 5 -w 60 $(lava-target-ip)
lava_ssh_test_case() {
set -x
set -ex
local test_case="${1}"
shift
lava-test-case "${test_case}" --shell \
@@ -137,6 +140,6 @@ actions:
sed -i '/S3_RESULTS_UPLOAD/d' /set-job-env-vars.sh
EOF
- export SSH_PTY_ARGS=-tt
- lava_ssh_test_case 'test-project_dut' '"cd / && /init-stage2.sh"'
- lava_ssh_test_case 'test-project_dut' 'cd / && /init-stage2.sh'
docker:
image:

View File

@@ -16,6 +16,8 @@ timeouts:
minutes: 1
depthcharge-action:
minutes: 15
uboot-action:
minutes: 5
actions:
- deploy:
timeout:
@@ -42,7 +44,8 @@ actions:
steps:
- cat Image.gz my_dtb_filename.dtb > Image.gz+dtb
- mkbootimg --kernel Image.gz+dtb --cmdline "root=/dev/nfs rw nfsroot=$NFS_SERVER_IP:$NFS_ROOTFS,tcp,hard,v3
ip=dhcp init=/init rootwait usbcore.quirks=0bda:8153:k" --pagesize 4096 --base 0x80000000 -o boot.img
ip=dhcp init=/init rootwait usbcore.quirks=0bda:8153:k" --pagesize 4096
--base 0x80000000 -o boot.img
- deploy:
timeout:
minutes: 2

View File

@@ -16,6 +16,8 @@ timeouts:
minutes: 1
depthcharge-action:
minutes: 15
uboot-action:
minutes: 5
actions:
- deploy:
timeout:
@@ -90,7 +92,7 @@ actions:
ping -c 5 -w 60 $(lava-target-ip)
lava_ssh_test_case() {
set -x
set -ex
local test_case="${1}"
shift
lava-test-case "${test_case}" --shell \
@@ -109,6 +111,6 @@ actions:
sed -i '/S3_RESULTS_UPLOAD/d' /set-job-env-vars.sh
EOF
- export SSH_PTY_ARGS=-tt
- lava_ssh_test_case 'test-project_dut' '"cd / && /init-stage2.sh"'
- lava_ssh_test_case 'test-project_dut' 'cd / && /init-stage2.sh'
docker:
image:

View File

@@ -16,6 +16,8 @@ timeouts:
minutes: 1
depthcharge-action:
minutes: 15
uboot-action:
minutes: 5
actions:
- deploy:
timeout:

View File

@@ -211,7 +211,7 @@ def test_lava_job_definition(
job_dict = yaml.load(job_definition)
# Uncomment the following to update the expected YAML files
# yaml.dump(job_dict, Path(f"../../data/{mode}_force_uart={force_uart}_job_definition.yaml"))
# yaml.dump(job_dict, load_data_file(f"{mode}_force_uart={force_uart}_job_definition.yaml"))
# Check that the generated job definition matches the expected one
assert job_dict == expected_job_dict

49692
.pick_status.json Normal file

File diff suppressed because it is too large

View File

@@ -1 +1 @@
25.0.0-devel
25.0.7

View File

@@ -29,6 +29,7 @@ import subprocess
import typing
import attr
from packaging.version import Version
if typing.TYPE_CHECKING:
from .ui import UI
@@ -292,11 +293,13 @@ async def resolve_nomination(commit: 'Commit', version: str) -> 'Commit':
commit.nominated = True
return commit
if backport_to := IS_BACKPORT.search(out):
if version in backport_to.groups():
commit.nominated = True
commit.nomination_type = NominationType.BACKPORT
return commit
if backport_to := IS_BACKPORT.findall(out):
for match in backport_to:
if any(Version(version) >= Version(backport_version)
for backport_version in match if backport_version != ''):
commit.nominated = True
commit.nomination_type = NominationType.BACKPORT
return commit
if cc_to := IS_CC.search(out):
if cc_to.groups() == (None, None) or version in cc_to.groups():
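
Reviewer note: the hunk above changes `Backport-to:` handling from "the branch version is listed" to "the branch version is at least one of the listed versions", and scans every `Backport-to:` tag instead of only the first. The sketch below illustrates that behaviour under two stated assumptions: the regular expression is a hypothetical stand-in, since the real `IS_BACKPORT` pattern is not shown in this diff, and versions are compared as integer tuples in place of `packaging.version.Version` so that the example needs only the standard library.

```python
import re

# Hypothetical pattern (not copied from bin/pick/core.py): one or two
# versions per tag, separated by a comma and/or spaces on the same line.
BACKPORT_RE = re.compile(
    r"^[ \t]*backport-to:[ \t]*(\d+\.\d+)(?:[, \t]+(\d+\.\d+))?",
    flags=re.MULTILINE | re.IGNORECASE,
)


def version_key(version):
    """Stand-in for packaging.version.Version: '19.2' -> (19, 2)."""
    return tuple(int(part) for part in version.split("."))


def is_backport_nominated(message, branch_version):
    """True if any Backport-to version is less than or equal to the branch version."""
    for match in BACKPORT_RE.findall(message):
        # findall() yields '' for the optional second group when it is absent.
        if any(version_key(branch_version) >= version_key(listed)
               for listed in match if listed != ""):
            return True
    return False
```

With this sketch, a commit tagged `Backport-to: 16.2` is nominated for a 16.3 branch but not for a 16.1 branch, which matches the intent of the `test_backport_is_nominated_after` test added in this change.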

View File

@@ -252,9 +252,8 @@ class TestRE:
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
""")
backport_to = core.IS_BACKPORT.search(message)
assert backport_to is not None
assert backport_to.groups() == ('19.2', None)
backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.2', '')]
def test_multiple_release_space(self):
"""Tests commit with more than one branch specified"""
@@ -268,9 +267,8 @@ class TestRE:
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
""")
backport_to = core.IS_BACKPORT.search(message)
assert backport_to is not None
assert backport_to.groups() == ('19.1', '19.2')
backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.1', '19.2')]
def test_multiple_release_comma(self):
"""Tests commit with more than one branch specified"""
@@ -284,9 +282,20 @@ class TestRE:
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
""")
backport_to = core.IS_BACKPORT.search(message)
assert backport_to is not None
assert backport_to.groups() == ('19.1', '19.2')
backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.1', '19.2')]
def test_multiple_release_lines(self):
"""Tests commit with more than one branch specified in mulitple tags"""
message = textwrap.dedent("""\
commit title
Backport-to: 19.0
Backport-to: 19.1, 19.2
""")
backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.0', ''), ('19.1', '19.2')]
class TestResolveNomination:
@@ -386,6 +395,17 @@ class TestResolveNomination:
assert c.nominated
assert c.nomination_type is core.NominationType.BACKPORT
@pytest.mark.asyncio
async def test_backport_is_nominated_after(self):
s = self.FakeSubprocess(b'Backport-to: 16.2')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
await core.resolve_nomination(c, '16.3')
assert c.nominated
assert c.nomination_type is core.NominationType.BACKPORT
@pytest.mark.asyncio
async def test_backport_is_not_nominated(self):
s = self.FakeSubprocess(b'Backport-to: 16.2')

View File

@@ -1,2 +1,3 @@
attrs==23.1.0
packaging==25.0
urwid==2.1.2

View File

@@ -220,13 +220,11 @@ driver libraries into the source tree of Android and patch the binary names.
mkdir prebuilts/mesa/x86_64
mkdir prebuilts/mesa/x86
cp ${INSTALL_PREFIX_64}/lib/libEGL.so prebuilts/mesa/x86_64/
cp ${INSTALL_PREFIX_64}/lib/libglapi.so prebuilts/mesa/x86_64/
cp ${INSTALL_PREFIX_64}/lib/libgallium_dri.so prebuilts/mesa/x86_64/
cp ${INSTALL_PREFIX_64}/lib/libGLESv1_CM.so prebuilts/mesa/x86_64/
cp ${INSTALL_PREFIX_64}/lib/libGLESv2.so prebuilts/mesa/x86_64/
cp ${INSTALL_PREFIX_64}/lib/libvulkan_lvp.so prebuilts/mesa/x86_64/
cp ${INSTALL_PREFIX_32}/lib/libEGL.so prebuilts/mesa/x86
cp ${INSTALL_PREFIX_32}/lib/libglapi.so prebuilts/mesa/x86
cp ${INSTALL_PREFIX_32}/lib/libgallium_dri.so prebuilts/mesa/x86/
cp ${INSTALL_PREFIX_32}/lib/libGLESv1_CM.so prebuilts/mesa/x86
cp ${INSTALL_PREFIX_32}/lib/libGLESv2.so prebuilts/mesa/x86
@@ -246,24 +244,6 @@ the libraries in the build.
.. code-block::
cc_prebuilt_library_shared {
name: "libglapi",
arch: {
x86_64: {
srcs: ["x86_64/libglapi.so"],
},
x86: {
srcs: ["x86/libglapi.so"],
},
},
strip: {
none: true,
},
relative_install_path: "egl",
shared_libs: ["libc", "libdl", "liblog", "libm"],
vendor: true
}
cc_prebuilt_library_shared {
name: "libgallium_dri",
arch: {

View File

@@ -9,7 +9,8 @@ and `Mali-G610 <https://www.khronos.org/conformance/adopters/conformant-products
but **non-conformant** on other GPUs.
PanVK, the Vulkan implementation in the Panfrost driver stack, is currently
**non-conformant** on all GPUs.
**conformant** on `Mali-G610 <https://www.khronos.org/conformance/adopters/conformant-products#submission_906>`__,
but *non-conformant* on other GPUs.
The following hardware is currently supported:

View File

@@ -490,7 +490,6 @@ Vulkan 1.3 -- all DONE: anv, lvp, nvk, radv, tu, vn, v3dv
VK_KHR_maintenance4 DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn)
VK_KHR_shader_integer_dot_product DONE (anv, dzn, hasvk, lvp, nvk, radv, tu, v3dv, vn)
VK_KHR_shader_non_semantic_info DONE (anv, hasvk, nvk, panvk, radv, tu, v3dv, vn)
VK_KHR_shader_relaxed_extended_instruction DONE (anv, hasvk, nvk, panvk, radv, tu, v3dv)
VK_KHR_shader_terminate_invocation DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn)
VK_KHR_synchronization2 DONE (anv, dzn, hasvk, lvp, nvk, panvk, radv, tu, v3dv, vn)
VK_KHR_zero_initialize_workgroup_memory DONE (anv, hasvk, lvp, nvk, panvk, radv, tu, v3dv, vn)
@@ -522,7 +521,7 @@ Vulkan 1.4 -- all DONE: anv, lvp, nvk, radv/gfx8+, tu/a7xx+
VK_KHR_push_descriptor DONE (anv, hasvk, lvp, nvk, panvk, radv, tu, vn)
VK_KHR_shader_expect_assume DONE (anv, dzn, hasvk, lvp, nvk, panvk, pvr, radv, tu, v3dv, vn)
VK_KHR_shader_float_controls2 DONE (anv, lvp, nvk, radv, tu)
VK_KHR_shader_subgroup_rotate DONE (anv, lvp, nvk, radv, tu)
VK_KHR_shader_subgroup_rotate DONE (anv, lvp, nvk, panvk, radv, tu)
VK_KHR_vertex_attribute_divisor DONE (anv, lvp, nvk, panvk, radv, tu, v3dv)
VK_EXT_host_image_copy DONE (anv, lvp, nvk/Turing+, tu)
VK_EXT_pipeline_protected_access DONE (anv/gfx12+)
@@ -561,6 +560,7 @@ Khronos extensions that are not part of any Vulkan version:
VK_KHR_ray_tracing_position_fetch DONE (anv, radv/gfx10.3+)
VK_KHR_shader_clock DONE (anv, hasvk, lvp, nvk, radv, vn)
VK_KHR_shader_maximal_reconvergence DONE (anv, lvp, nvk, radv)
VK_KHR_shader_relaxed_extended_instruction DONE (anv, hasvk, nvk, panvk, radv, tu, v3dv)
VK_KHR_shader_subgroup_uniform_control_flow DONE (anv, hasvk, nvk, radv, tu)
VK_KHR_shader_quad_control DONE (anv, nvk, radv)
VK_KHR_shared_presentable_image not started
@@ -597,7 +597,7 @@ Khronos extensions that are not part of any Vulkan version:
VK_EXT_device_fault DONE (radv)
VK_EXT_device_generated_commands DONE (nvk/Turing+, radv/gfx8+)
VK_EXT_device_memory_report DONE (vn)
VK_EXT_direct_mode_display DONE (anv, lvp, nvk, radv, tu, v3dv)
VK_EXT_direct_mode_display DONE (anv, lvp, nvk, panvk, radv, tu, v3dv)
VK_EXT_discard_rectangles DONE (radv)
VK_EXT_display_control DONE (anv, hasvk, nvk, radv, tu)
VK_EXT_display_surface_counter DONE (anv, lvp, nvk, radv, tu)

View File

@@ -3,6 +3,14 @@ Release Notes
The release notes summarize what's new or changed in each Mesa release.
- :doc:`25.0.7 release notes <relnotes/25.0.7>`
- :doc:`25.0.6 release notes <relnotes/25.0.6>`
- :doc:`25.0.5 release notes <relnotes/25.0.5>`
- :doc:`25.0.4 release notes <relnotes/25.0.4>`
- :doc:`25.0.3 release notes <relnotes/25.0.3>`
- :doc:`25.0.2 release notes <relnotes/25.0.2>`
- :doc:`25.0.1 release notes <relnotes/25.0.1>`
- :doc:`25.0.0 release notes <relnotes/25.0.0>`
- :doc:`24.3.4 release notes <relnotes/24.3.4>`
- :doc:`24.3.3 release notes <relnotes/24.3.3>`
- :doc:`24.3.2 release notes <relnotes/24.3.2>`
@@ -442,6 +450,14 @@ The release notes summarize what's new or changed in each Mesa release.
:maxdepth: 1
:hidden:
25.0.7 <relnotes/25.0.7>
25.0.6 <relnotes/25.0.6>
25.0.5 <relnotes/25.0.5>
25.0.4 <relnotes/25.0.4>
25.0.3 <relnotes/25.0.3>
25.0.2 <relnotes/25.0.2>
25.0.1 <relnotes/25.0.1>
25.0.0 <relnotes/25.0.0>
24.3.4 <relnotes/24.3.4>
24.3.3 <relnotes/24.3.3>
24.3.2 <relnotes/24.3.2>

4610
docs/relnotes/25.0.0.rst Normal file

File diff suppressed because it is too large

251
docs/relnotes/25.0.1.rst Normal file
View File

@@ -0,0 +1,251 @@
Mesa 25.0.1 Release Notes / 2025-03-05
======================================
Mesa 25.0.1 is a bug fix release which fixes bugs found since the 25.0.0 release.
Mesa 25.0.1 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 25.0.1 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::
SHA256: 49eb55ba5acccae91deb566573a6a73144a0f39014be1982d78c21c5b6b0bb3f mesa-25.0.1.tar.xz
SHA512: 1ecb1b90c5f78de4c61f177888543778285731faccc6f78d266d4b437f7b422a78b705a6e9fc6c9eab62c08f2573db5dd725eaa9cc9e5bedcaa7d8cfe6b47a1f mesa-25.0.1.tar.xz
New features
------------
- None
Bug fixes
---------
- Zink: Kopper's present thread causes Wayland protocol races
- GLmatrix needs aligned malloc
- Lavapipe crashes if no Position is output in mesh shader
- [RADV/aco][regression][bisected] - Avowed (2457220) - GPU hangs near Watermill outside of Dawnshore
- radv/sqtt: assertion "layout transition marker should be only emitted inside a barrier marker"
- [radv] Glitchy ground geometry regression in Total War Warhammer III on RX 7600
Changes
-------
Benjamin Lee (4):
- panfrost: remove NIR_PASS_V usage for noperspective lowering
- panfrost: fix large int32->float16 conversions
- panfrost: fix condition in bi_nir_is_replicated
- panfrost/va: remove swizzle mod from LDEXP
Caio Oliveira (1):
- brw: Fix size in assembler when compacting
Daniel Schürmann (5):
- aco/scheduler: always respect min_waves on GFX10+
- aco/insert_exec_mask: Don't immediately set exec to zero in break/continue blocks
- aco/insert_exec_mask: don't restore exec in continue_or_break blocks
- aco/ssa_elimination: insert parallelcopies for p_phi immediately before branch
- aco/assembler: Fix short jumps over chained branches
Dave Airlie (1):
- vulkan/wsi/x11: don't use update_region for damage if not created
David Rosca (2):
- frontends/va: Set AV1 max_width/height to surface size
- radeonsi/vcn: Set all pic params for H264 encode references
Dylan Baker (2):
- iris: Correctly set NOS for geometry shader state changes
- iris: fix handling of GL_*_VERTEX_CONVENTION
Emmanuel Gil Peyrot (1):
- panvk: Initialize out array with the correct length
Eric Engestrom (8):
- docs: add sha sum for 25.0.0
- .pick_status.json: Update to b331713f20148852370a4fae5c2830d46801eb3b
- .pick_status.json: Update to 55c476efed01121b3a64a58c304aae8ef9a79475
- .pick_status.json: Mark b85c94fc891fe9d73b3a032aea8a6a71b8e6173b as denominated
- .pick_status.json: Update to 4348253db5232b7be4db0a0ff47b31d51bc8f534
- .pick_status.json: Update to fbc55afbdfc93a82c69f1cd6a1f4abbed96cfd19
- .pick_status.json: Mark 5461ed5808421a8ffb79bdaa1449265f3e8f40a5 as denominated
- .pick_status.json: Update to 45e771f4fbe4245b252c6360e55776080f0bf458
Erik Faye-Lund (1):
- mesa/main: wire up glapi bits for EXT_multi_draw_indirect
Faith Ekstrand (12):
- nak: Only use suld.constant on Ampere+
- zink: Use the correct array size for signal_values[]
- zink: Use persistent semaphores for PIPE_FD_TYPE_SYNCOBJ
- nvk: Don't bind a fragment shading rate image pre-Turing
- nvk: Do not set INVALIDATE_SKED_CACHES pre-MaxwellB
- nak/qmd: Add a nak_get_qmd_cbuf_desc_layout() helper
- nvk: Handle pre-Turing dispatch indirect commands
- nvk: Only support deviceGeneratedCommandsMultiDrawIndirectCount on Turing+
- nvk: Only support compute shader derivatives on Turing+
- zink: Don't present to Wayland surfaces asynchronously
- egl/dri2: Rework get_wl_surface_proxy()
- egl/wayland: Pass the original wl_surface to kopper
Georg Lehmann (1):
- aco/insert_exec: fix continue_or_break on gfx6-7
Gert Wollny (1):
- r600/sfn: gather info and set lowering 64 bit after nir_lower_io
Guilherme Gallo (3):
- ci/lava: Drop the repeating quotes on lava-test-case
- ci/lava: Propagate errors in SSH tests
- ci/lava: Add U-Boot action timeout for rockchip DUTs
Hans-Kristian Arntzen (1):
- radv: Always set 0 dispatch offset for indirect CS.
Hyunjun Ko (1):
- anv: Do not support the tiling of DRM modifier if DECODE_DST
Iago Toral Quiroga (1):
- pan/va: fix FAU validation
James Hogan (5):
- mesa: Consider NumViews to reuse FBO attachments
- mesa: Handle GL_FRAMEBUFFER_INCOMPLETE_VIEW_TARGETS_OVR
- mesa: Check views don't exceed GL_MAX_ARRAY_TEXTURE_LAYERS
- mesa: OVR_multiview framebuffer attachment parameters
- mesa: Handle getting GL_MAX_VIEWS_OVR
Job Noorman (1):
- ir3/ra: prevent reusing parent interval of reloaded sources
Juan A. Suarez Romero (2):
- v3dv: duplicate key for texel_buffer cache
- broadcom/simulator: use string copy instead of memcpy
Karol Herbst (3):
- rusticl/mem: set num_samples and num_mip_levels to 0 when importing from GL
- rusticl/platform: advertise all extensions supported by all devices
- intel/brw, lp: enable lower_pack_64_4x16
Kevin Chuang (2):
- anv/bvh: Fix encoder handling sparse buffer
- anv/bvh: Fix copy shader handling sparse buffer
Konstantin Seurer (1):
- llvmpipe: Skip draw_mesh if the ms did not write gl_Position
Lars-Ivar Hesselberg Simonsen (2):
- panfrost: Use RUN_COMPUTE over RUN_COMPUTE_INDIRECT
- panvk: Use RUN_COMPUTE over RUN_COMPUTE_INDIRECT
Lionel Landwerlin (2):
- anv: fix missing 3DSTATE_PS:Kernel0MaximumPolysperThread programming
- vulkan/runtime: ensure robustness state is fully initialized
Lorenzo Rossi (1):
- nvk: Fix MSAA sparse residency lowering crash
Marek Olšák (1):
- mesa: allocate GLmatrix aligned to 16 bytes
Mary Guillemard (1):
- pan/bi: Disallow FAU special page 3 and WARP_ID on message instructions
Mike Blumenkrantz (6):
- zink: wait on tc fence before checking for fd semaphore
- zink: always fully unwrap contexts
- zink: clamp UBO sizes instead of asserting
- llvmpipe: pass layer count to rast clear
- gallium: fix pipe_framebuffer_state::view_mask
- mesa: avoid creating incomplete surfaces when multiview goes out of range
Natalie Vock (1):
- radv/rt: Don't allocate the traversal shader in a capture/replay range
Patrick Lerda (3):
- r600: fix evergreen_emit_vertex_buffers() related cl regression
- r600: fix the indirect draw 8-bits path
- r600: fix emit_image_size() range base compatibility
Paulo Zanoni (1):
- brw: extend the NOP+WHILE workaround
Peyton Lee (1):
- radeonsi/vpe: check reduction ratio
Pierre-Eric Pelloux-Prayer (2):
- tc: add missing TC_SENTINEL for TC_END_BATCH
- mesa/st: call _mesa_glthread_finish before _mesa_make_current
Rhys Perry (1):
- ac/nir: fix tess factor optimization when workgroup barriers are reduced
Roland Scheidegger (1):
- llvmpipe: Fix alpha-to-coverage without dithering
Samuel Pitoiset (3):
- radv/video: fix adding the query pool BO to the cmdbuf list
- radv: fix missing SQTT barriers for fbfetch color/depth decompressions
- radv: fix re-emitting fragment output state when resetting gfx pipeline state
Tapani Pälli (2):
- iris: wait for imported fences to be available in iris_fence_await
- iris: remove dead code that cannot get hit anymore
Yiwei Zhang (2):
- venus: fix image format cache miss with AHB usage query
- venus: relax the requirement for sync2
Yogesh Mohan Marimuthu (1):
- winsys/amdgpu: same_queue variable should be set if there is only one queue

234
docs/relnotes/25.0.2.rst Normal file

@@ -0,0 +1,234 @@
Mesa 25.0.2 Release Notes / 2025-03-20
======================================
Mesa 25.0.2 is a bug fix release which fixes bugs found since the 25.0.1 release.
Mesa 25.0.2 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
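To illustrate the paragraph above: an application that needs a given OpenGL version can compare it against what the driver actually reports. The following is a minimal sketch, not part of Mesa. The helper names ``parse_gl_version`` and ``supports`` are hypothetical, and it assumes the string was already obtained from glGetString(GL_VERSION) and starts with "major.minor", optionally prefixed with "OpenGL ES".

```python
import re

# Matches the leading "major.minor" of a GL_VERSION string. OpenGL ES
# strings carry an "OpenGL ES" (or "OpenGL ES-XX") prefix before it.
_GL_VERSION_RE = re.compile(r"(?:OpenGL ES(?:-[A-Z]+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a GL_VERSION-style string."""
    match = _GL_VERSION_RE.match(version_string)
    if match is None:
        raise ValueError("unrecognized GL_VERSION string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required):
    """True if the reported version is at least `required`, a (major, minor) tuple."""
    # Tuples compare element by element, so (4, 6) >= (4, 5) holds.
    return parse_gl_version(version_string) >= required
```

For example, with an illustrative string such as "4.5 (Compatibility Profile) Mesa 25.0.2" (made up for this sketch, not captured from a real driver), ``supports(..., (4, 6))`` would be false even though Mesa as a whole implements OpenGL 4.6.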
Mesa 25.0.2 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
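The apiVersion property mentioned above is a packed 32-bit integer rather than a string. The sketch below shows how such a value can be decoded. It assumes the standard Vulkan layout (variant in the top 3 bits, then a 7-bit major, 10-bit minor and 12-bit patch field). The function names are hypothetical and mirror what the VK_MAKE_API_VERSION and VK_API_VERSION_* macros do in C.

```python
def make_api_version(variant, major, minor, patch):
    """Pack version fields the way VK_MAKE_API_VERSION does."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_api_version(value):
    """Unpack a packed apiVersion into (variant, major, minor, patch)."""
    variant = value >> 29
    major = (value >> 22) & 0x7F    # 7-bit major field
    minor = (value >> 12) & 0x3FF   # 10-bit minor field
    patch = value & 0xFFF           # 12-bit patch field
    return variant, major, minor, patch
```

A driver that exposes Vulkan 1.4 would report a value whose decoded major and minor are 1 and 4, while a driver limited to Vulkan 1.3 would decode to 1 and 3, regardless of the Mesa release it ships in.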
SHA checksums
-------------
::

    SHA256: adf904d083b308df95898600ffed435f4b5c600d95fb6ec6d4c45638627fdc97 mesa-25.0.2.tar.xz
    SHA512: 2de8e8b514619d9ad5f407f5e1ff04fff8039d66b5f32257c2e8ca3d9f3b190269066aeba0779d6e0b2a2c0739237382fc6a98ea8563ed97801a809c96163386 mesa-25.0.2.tar.xz
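The published checksums can be compared against a downloaded tarball. The following is a small sketch of one way to do that with Python's standard hashlib module. The helper names are hypothetical, and the location of the tarball on disk is an assumption left to the caller.

```python
import hashlib


def sha256_of_chunks(chunks):
    """Return the hex SHA-256 digest of an iterable of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path, chunk_size=1 << 20):
    """Yield the file at `path` in pieces so large tarballs are not loaded whole."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def sha256_of_file(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    return sha256_of_chunks(read_chunks(path))


def digest_matches(actual_hex, published_hex):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return actual_hex.strip().lower() == published_hex.strip().lower()
```

For this release, ``digest_matches(sha256_of_file("mesa-25.0.2.tar.xz"), "adf904d083b308df95898600ffed435f4b5c600d95fb6ec6d4c45638627fdc97")`` should return True for an intact download. The same pattern works for the SHA512 line by swapping in ``hashlib.sha512``.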
New features
------------
- None
Bug fixes
---------
- RADV: logic used to avoid running on CDNA is faulty
- [LNL/BMG] Assassin's Creed Valhalla trace replay hang
- X11 + Zink on NVK flickers older frames in Firefox based browsers
- Vulkan conformanceVersion is reported as 0.0.0.0 in Mesa 25.0.0
- VRAM Abnormal use on mesa 25.0
- [radv][regression] Multiple games detect the wrong amount of vram
- Resident Evil 2 Remake flickers
- OpConstantNull not supported for OpTypeCooperativeMatrixKHR
- v3dv: vkcube-wayland crashes on raspberry pi 5 kernel 6.12 and latest mesa
- GMSH Visualization Fails with radeonsi:can't compile a main shader part, Fedora 41 AMD 7900xt
- AMD VDPAU deinterlacing SIGSEGV
- radv: vkd3d-proton test_primitive_restart_list_topology_stream_output randomly fails on NAVI2X
- Mesa 24.1 introduced a Vulkan problem with DOOM 2016 on AMD 780M GPU
- nouveau & zink+nvk: Flashing in Firefox and Thunderbird on Hyprland
Changes
-------
Aaron Ruby (1):
- gfxstream: Downgrade log severity when enabling params in LinuxVirtGpu
Alyssa Rosenzweig (2):
- pan/mdg: call nir_lower_is_helper_invocation
- nir/lower_helper_writes: fix stores after discard
Ashley Smith (1):
- panfrost: Reset syncobj after use to avoid kernel warnings
Bas Nieuwenhuizen (1):
- radv: Move support check out of winsys.
Dave Airlie (1):
- radv/video: don't try and send events on UVD devices.
David Rosca (4):
- gallium/vl: Fix video buffer supported format check
- Revert "frontends/vdpau: Alloc interlaced surface for interlaced pics"
- frontends/vdpau: Fix creating deinterlace filter for interleaved buffers
- gallium/vl: Return YUV plane order for single plane formats
Eric Engestrom (6):
- docs: add sha sum for 25.0.1
- .pick_status.json: Mark 61b0955308d720a6fa065e7a414d16999f7ffd03 as denominated
- .pick_status.json: Mark 534436f8635e63a30e4d7af4837dad35cfa361ad as denominated
- .pick_status.json: Update to 61feea6954a7526836ccbd30c657e6afc11fb4f5
- .pick_status.json: Mark 551770ccf8bdb1e5fa45ddac854535edf2b31a22 as denominated
- meson: announce that clover is deprecated (slated for removal)
Erik Faye-Lund (2):
- docs/features: add missing panvk feature
- panvk: correct VkPhysicalDeviceProperties::deviceName
Faith Ekstrand (9):
- util/box: Add a intersect_2d helper
- zink: Use pipe_box helpers for damage calculations
- zink: Set needs_barrier after transitioning to QUEUE_FAMILY_FOREIGN
- zink: Check queue families when binding image resources
- nvk: Allow rendering to linear images with unaligned strides
- nil: Relax alignment requirements for linear images
- vtn: Support cooperative matrices in OpConstantNull
- egl/x11: Re-order an if statement
- egl/kopper: Update the EGLSurface size after kopperSwapBuffers()
Ganesh Belgur Ramachandra (1):
- amd: use 128B compression for scanout images when drm.minor <63
Georg Lehmann (3):
- radv: enable invariant geom for DOOM(2016)
- aco/gfx11.5: remove vinterp ddx/ddy path
- aco/ra: disallow vcc definitions for pseudo scalar trans instrs
Ivan A. Melnikov (1):
- gallium/radeon: Make sure radeonsi PCI IDs are also included
Job Noorman (2):
- ir3: fix false dependencies of rpt instructions
- ir3: keep inputs at start block when creating empty preamble
John Anthony (1):
- panvk: Avoid division by zero for vkCmdCopyQueryPoolResults
José Roberto de Souza (1):
- intel/common: Retry GEM_CONTEXT_CREATE when PXP have not finished initialization
Karol Herbst (6):
- rusticl/program: implement CL_INVALID_PROGRAM_EXECUTABLE check in clGetProgramInfo
- rusticl/program: pass options by reference
- rusticl/program: loop over all devices inside Program::build
- rusticl/program: rework build_nirs so it only touches devices we care about
- rusticl/program: fix building kernels
- nir/serialize: fix decoding of is_return and is_uniform
Lionel Landwerlin (3):
- anv: fix non page aligned descriptor bindings on <Gfx12.0
- brw: fix spilling for Xe2+
- brw: ensure VUE header writes in HS/DS/GS stages
Lucas Stach (2):
- etnaviv: rs: fix slow/fast clear transitions
- etnaviv: fix ETNA_MESA_DEBUG=no_early_z
Marek Olšák (1):
- Revert "ac/nir: clamp vertex color outputs in the right place"
Mary Guillemard (2):
- pan/bi: Fix out of range access in bi_instr_replicates
- pan/bi: Ensure we select b0 with halfswizzle in va_lower_constants
Matt Turner (1):
- glsl: Add missing break
Maíra Canal (1):
- v3dv: don't overwrite the primary fd if it's already set
Mel Henning (1):
- nvk: Don't zero imported memory
Mike Blumenkrantz (1):
- zink: fix refcounting of zink_surface objects
Natalie Vock (2):
- radv/rt: Guard leaf encoding by leaf node count
- radv/rt: Flush L2 after writing internal node offset on GFX12
Patrick Lerda (2):
- r600: fix cayman main non-deterministic behavior problem
- r600: update the software fp64 support
Pierre-Eric Pelloux-Prayer (1):
- st/mesa: fix nir_load_per_vertex_input parameter
Rebecca Mckeever (1):
- panvk: Add STORAGE_IMAGE_BIT feature for formats supporting sampled images
Rhys Perry (1):
- aco: insert dependency waits in certain situations
Rob Clark (2):
- tc: Add missing tc_set_driver_thread()
- freedreno: Wait for imported syncobj fences to be available
Samuel Pitoiset (6):
- ac,radv: add a workaround for a hw bug with primitive restart on GFX10-GFX10.3
- radv: fix a GPU hang with inherited rendering and HiZ/HiS on GFX1201
- radv/amdgpu: fix device deduplication
- radv: update conformance version
- aco: do not apply OMOD/CLAMP for pseudo scalar trans instrs
- radv: emit a dummy PS state for noop FS on GFX12
Seán de Búrca (1):
- rusticl/mem: don't create svm_pointers slice from null raw pointer
Sviatoslav Peleshko (2):
- anv: Add full subgroups workaround for the shaders that use shared memory
- drirc: Apply assume_full_subgroups_with_shared_memory to Resident Evil 2
Timothy Arceri (1):
- util/u_idalloc: fix util_idalloc_sparse_alloc_range()
Yiwei Zhang (4):
- venus: fix a memory corruption in query records recycle
- lavapipe: set availability bit for accel struct host queries
- lavapipe: fix accel struct device query copy
- venus: fix to ignore dstSet for push descriptor

231
docs/relnotes/25.0.3.rst Normal file

@@ -0,0 +1,231 @@
Mesa 25.0.3 Release Notes / 2025-04-02
======================================
Mesa 25.0.3 is a bug fix release which fixes bugs found since the 25.0.2 release.
Mesa 25.0.3 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 25.0.3 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::

    SHA256: 5ff426ed6ce0588fd96d18975bdff451ae2ab2fe98b5d1528842ee71ec66711b mesa-25.0.3.tar.xz
    SHA512: a8ddfa3ac31869e82a49d14aaab0659d0496ae77db3f32aa0d5d28de8e1e4cace9fa652451a050fbc79281e8461cd70e86ad464aa387533387187fbcb604aaab mesa-25.0.3.tar.xz
New features
------------
- None
Bug fixes
---------
- [RADV][RDNA3][Phoenix3][APU] NARAKA: BLADEPOINT (1203220) gpu hang reproducible (ice/water regression mesa 24.1 bisected SAMPLE_MASK_TRACKER_WATERMARK=15) random (maybe other apps/games)
- GPU hangs running Octopath Traveler II with 780M
- GPU crash on Radeon 780M with Tales of Arise
- brw: Hit unreachable nir_op_fsign case that brw_nir_lower_fsign missed
- The Last of Us - shadows flickering on gfx1201 without nohiz flag
- anv: Dark pattern overlayed on objects in Eve Online DX11 mode on BMG
- Mesa 25 removes VA-API encoding for R9 390
- Video stuttering / anv: extend implicit fencing support
- anv, bmg: Visual issues in AC Origins, Odyssey and Fenyx Rising when dxvk doesn't export PointSize
- [ANV][LNL] - A Game About Digging A Hole (3244220) - Title throws an assertion failure on launch.
- anv/video: Timestamps are exposed in video encode queue, but it crashes
- Getting a crash with manually built llvmpipe (OpenGL)
- [RadeonSI] Blender assetshelf icons are borken in mesa >= 25.0.0
- radeonsi regression after 24.3.4
- misc OpenGL CTS failures
- glBindVertexBuffer regression due to ID reuse
Changes
-------
Caio Oliveira (1):
- brw: Fix decoding of 3-src destination stride in EU validation
Connor Abbott (3):
- tu: Fix GMEM offset for multisample layered separate stencil
- tu: Fix size of frag_size_ir3 and frag_offset_ir3 driver params
- tu: Fix reported FDM fragment size with multiview
Daniel Schürmann (1):
- aco: don't assume that demote doesn't cause an empty exec mask
Daniel Stone (1):
- ci: Re-enable trace jobs with updated Piglit
Dave Airlie (2):
- gallivm: check for avx512vbmi and tell LLVM the correct answer.
- nak: add reads after setting writes
David Rosca (5):
- radeonsi/vce: Support old VCE firmware
- gallium/vl: Fix rotation with scaling for compute shaders
- gallium/vl: Fix mirror with rotation for compute shaders
- frontends/va: Don't ignore rotation and mirror for conversions to RGB
- radv: Add radv_format_description to remap 10/12bit formats to 16bit
Eric Engestrom (11):
- docs: add sha sum for 25.0.2
- .pick_status.json: Update to 85983e060ccca163ff5c4aad51c7082b7ae8c4a0
- ci/piglit: drop usage of s3cp for a simple download
- ci: always abort if the curl download fails
- ci: replace broken s3cp command with a simple curl call
- ci: run shader-db & zink-lvp on kvm runners
- pick-ui: fix parsing of multiple \`backport-to:` lines
- .pick_status.json: Update to e3433489f81a75c278ff70cc5700cd028447bf76
- [25.0 only] update ci expectations
- .pick_status.json: Update to b60d816d6ee35cc1bfa2d2f6aed59104a09ec11d
- .pick_status.json: Update to 0d2ebca39fd2a68bfb64dc2196e442e25dc90334
Eric R. Smith (1):
- panfrost: consider xfb shader when calculating thread local storage size
Erik Faye-Lund (3):
- panfrost: avoid accidental aliasing
- panvk: check for texture-compression support
- mesa/main: fix regression in extension-checking
Faith Ekstrand (10):
- nak: Insert the annotation in the right spot in assign_regs
- nak: Always copy sources when handling vec/pack/mov ops
- nak: Fix a SM check for OpPCnt
- nvk: Free owned_gart_mem correctly
- nvk: Fix a Volta check
- nouveau/mme/fermi: Don't allow STATE and EMIT on the same op
- nvk: Use the right sample mask for 8x/4pass on Maxwell A
- vulkan/wsi: Signal buffer memory object when blitting
- nvk: Use max_image_dimension for maxFramebufferWidth/Height
- nvk: Disable 32k images on Pascal A
Hyunjun Ko (1):
- vulkan/video: Do byte-alignment when building a h264 slice header
Ian Romanick (1):
- brw/nir: Lower fsign again after last call to brw_nir_optimize
Job Noorman (1):
- ir3/legalize: take wrmask into account for delay updates
Jordan Justen (2):
- intel/dev: Add BMG PCI IDs (0xe210, 0xe215, 0xe216)
- intel/dev: Add BMG 0xe211 PCI ID
Lionel Landwerlin (4):
- anv: fix end of pipe timestamp query writes
- anv: disable replication when we don't have both VS/FS stages
- brw: always write the VUE header
- anv: limit implict write with drirc
Lucas Stach (1):
- kmsro: look for graphics capable screen as renderonly device
Natalie Vock (2):
- radv/rt: Flush CP writes from the common BVH framework with INV_L2 on GFX12
- vulkan/bvh: Move first PLOC task_count fetch inside PHASE
Paulo Zanoni (1):
- drirc/anv: DiggingGame.exe needs force_vk_vendor=-1
Pierre-Eric Pelloux-Prayer (2):
- ac/nir: fix nir_metadata value of ac_nir_lower_image_opcodes
- radeonsi: use composed swizzle in cdna_emu_make_image_descriptor
Rebecca Mckeever (1):
- panvk: Remove lower_tg4_broadcom_swizzle from panvk_preprocess_nir()
Rhys Perry (1):
- aco/ra: fix free register counting when moving variables
Robert Mader (3):
- llvmpipe: Take offset into account when importing dmabufs
- llvmpipe: Free dummy_dmabuf on shutdown
- gallivm: Re-add check for passmgr before disposing it
Samuel Pitoiset (8):
- radv: fix creating pipeline binary from the traversal shader
- radv: fix bpe for the stencil aspect of depth/stencil copies on transfer queue
- radv: fix compresed depth/stencil copies on transfer queue
- radv/meta: fix color<->depth/stencil image copies
- radv: do not trigger FCE or FMASK decompress on compute queue
- ac/surface: fix selecting preferred alignments for HiZ/HiS on GFX12
- Revert "radv: program SAMPLE_MASK_TRACKER_WATERMARK optimally for GFX11 APUs"
- Revert "radeonsi/gfx11: program SAMPLE_MASK_TRACKER_WATERMARK optimally for APUs"
Taras Pisetskyi (1):
- anv,driconf: Add sampler coordinate precision workaround for EVE Online
Timothy Arceri (9):
- mesa: fix reuse of deleted buffer object
- mesa: fix reuse of deleted texture object
- mesa: fix potential race condition in with TexObjects
- mesa: fix reuse of deleted sampler object
- mesa: fix potential race conditions in with FrameBuffers
- mesa: fix potential race condition in with RenderBuffers
- mesa: fix potential race condition in with ATIShaders
- mesa: fix potential race condition in with Programs
- nir: fix uniform cloning helper
Tomeu Vizoso (2):
- egl/surfaceless: Only choose drivers that expose the graphics capability
- kopper: Explicitly choose zink
Trigger Huang (1):
- radeonsi: Fix perfcounter start event in si_pc_emit_start
Valentine Burley (1):
- ci: Add missing kvm runner tags
Yiwei Zhang (6):
- docs: demote VK_KHR_shader_relaxed_extended_instruction
- venus: fix unexpected ring alive status expire upon owner thread switch
- venus: fix ahb usage caching
- venus: fix maint4 multi-planar memory requirements
- panvk/csf: rework cache flush reduction
- panvk: fix memory requirement query for aliased disjoint image
irql-notlessorequal (1):
- hasvk: Fix non-functioning version override.

256
docs/relnotes/25.0.4.rst Normal file

@@ -0,0 +1,256 @@
Mesa 25.0.4 Release Notes / 2025-04-17
======================================
Mesa 25.0.4 is a bug fix release which fixes bugs found since the 25.0.3 release.
Mesa 25.0.4 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 25.0.4 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::

    SHA256: 76293cf4372ca4e4e73fd6c36c567b917b608a4db9d11bd2e33068199a7df04d mesa-25.0.4.tar.xz
    SHA512: 562a97bd0374ff2a76f71c848df4fe542f1fc66c420a9101eb4bb1947d00eee4417d9c6f2d1be19638663753785c19384f8a6dc078c3187448ab79413d906152 mesa-25.0.4.tar.xz
New features
------------
- None
Bug fixes
---------
- RADV: Performance regression in Elden Ring on GFX8/Polaris
- RADV: Performance regression in Elden Ring on GFX8/Polaris
- Confidential issue #12324
- Confidential issue #12946
- The Last of Us Part I GPU hang on gfx1201
- brw: new Xe2 CTS failures
- [NVK] NAK assert in The Last of Us Part 2 shader
- [ANV][LNL] - Lost Records: Bloom & Rage (1902960) - Title hangs on launch and subsequently crashes to desktop.
- [BMG] Intel b580 battlemage: Fort Solis (Unreal Engine game) boots to menu, hangs while loading after hitting continue from the main menu
- [ANV][LNL] - NINJA GAIDEN 2 Black (3287520) - Environment assets are incorrectly rendered or missing.
- [ANV][LNL] - The Headliners (3059070) - Title hangs a few minutes after launch.
- anv, regression: Invisibly blinking cliffs & rocks in Satisfactory DX12 on BMG
- vk/overlay: output_file option failing
- [bisected, LNL] brw: 341e5117ecbc ("brw/nir: Treat load_const as convergent") regresses arb_gpu_shader5-interpolateAtOffset on LNL
- vulkan regression mesa 24.3.4 to 25.0.0.rc3 with broadcom
- radv: nir_opt_varyings.c:2766: deduplicate_outputs: Assertion \`list_index == 0' failed.
- vulkan/wsi: memory leak from wsi_CreateSwapchainKHR
Changes
-------
Aaron Ruby (2):
- gfxstream: Make the virtgpu device discovery for LinuxVirtGpu more robust
- gfxstream: Add common interfaces in the VirtGpuDevice to query DrmInfo and PciBusInfo
Alyssa Rosenzweig (4):
- nir/lower_blend: refactor logicop variables
- nir/lower_blend: disable logic ops for unsupported formats
- panfrost: invert and rename no_ubo_to_push flag
- panfrost: do not push "true" UBOs
Benjamin Lee (2):
- panvk/csf: fix uninitialized read in utrace_clone_init_builder
- panfrost/pps: fix omitting several counters
Benjamin Otte (1):
- lavapipe: Don't advertise support for multiplane drm formats
Boris Brezillon (2):
- vulkan/state: Fix input attachment map state initialization/copy
- vk/pass: Add input attachment location info
Caio Oliveira (1):
- nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset
Caterina Shablia (2):
- panfrost: don't overwrite push uniforms and sysvals UBO with user's UBO
- panfrost: update nr_uniform_buffers before dispatching XFB
Connor Abbott (1):
- tu: Fix layer_count with dynamic rendering + multiview
David Rosca (4):
- radeonsi/vcn: Disable AV1 unidir compound with rate control
- radv/video: Fix msg header total size
- radv/video: Fix encode session info for VCN3+
- radeonsi/vpe: Use float division to get scaling ratio
Eric Engestrom (7):
- docs: add sha sum for 25.0.3
- [25.0 only] update more ci expectations
- .pick_status.json: Update to 7c5389695bdf106acaab6ccc69535f25c1d7a8e6
- ci: rename ci-tron priority tag to avoid conflict with the generic fdo runners
- .pick_status.json: Update to 2f00daf67a7990da68dfc4a8e5f2019daecb7a59
- .pick_status.json: Update to 58321cf2e57279079bf742be1063ac2900ea2436
- .pick_status.json: Update to 555821ff93118d4a6ea441127cd0427a95743d47
Eric R. Smith (2):
- panfrost,lima: use index size in panfrost minmax_cache
- panfrost: fix transaction elimination crc valid calculation
Erik Faye-Lund (4):
- panfrost: fixup typo in 16x sample-pattern
- nir/lower_tex: use texture_mask instead of shifting on use
- panvk: set shared_addr_format
- panvk: claim official conformance on v10
Faith Ekstrand (3):
- nak: Allow predicates in nir_intrinsic_as_uniform
- nvk/nvkmd: Check the correct flag for the Kepler GART workaround
- nil: Multiply by array_stride_B instead of adding
Felix DeGrood (1):
- vk/overlay-layer: fix regression in non-control pathway
Georg Lehmann (2):
- spirv: clamp/sign-extend non 32bit ldexp exponents
- spirv: fix cooperative matrix by value function params
Gurchetan Singh (3):
- gfxstream: check device exists before using it
- gfxstream: refactor device initialization
- gfxstream: follow the semantics desired by distro VK loader
Ian Romanick (4):
- brw/algebraic: Constant folding for BROADCAST and SHUFFLE
- brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset
- brw/algebraic: Optimize derivative of convergent value
- brw/nir: Use offset() for all uses of offs in emit_pixel_interpolater_alu_at_offset
Jan Alexander Steffens (heftig) (1):
- gfxstream: Use proper log format for 32-bit Vulkan
Job Noorman (1):
- ir3/ra: assign interval offsets to new defs after shared RA
Jose Maria Casanova Crespo (1):
- v3dv: avoid TFU reading unmapped pages beyond the end of the buffers
Juan A. Suarez Romero (1):
- v3dv: don't check if DRM device is master
Kenneth Graunke (4):
- brw: Track the largest VGRF size in liveness analysis
- brw: Use live->max_vgrf_size in register coalescing
- brw: Use live->max_vgrf_size in pre-RA scheduling
- brw: Don't assert about MAX_VGRF_SIZE in brw_opt_split_virtual_grfs()
Lars-Ivar Hesselberg Simonsen (2):
- panvk: Add barrier for interleaved ZS copy cmds
- vk/sync: Fix execution only barriers
Lionel Landwerlin (3):
- brw: fix shuffle with scalar/uniform index
- anv: fix self dependency computation
- brw: fix Wa_22013689345 emission
Marek Olšák (5):
- radeonsi: work around a primitive restart bug on gfx10-10.3
- radeonsi: make si_shader_selector::main_shader_part_* an iterable union
- radeonsi: add ACO-specific main shader parts
- ac/surface: make gfx12_estimate_size reusable by gfx6
- ac/surface: select 3D tile mode without overallocating too much for gfx6-8
Mike Blumenkrantz (4):
- gallium/util: check nr_samples in pipe_surface_equal()
- tu: check for valid descriptor set when binding descriptors
- zink: don't set shared block stride without KHR_workgroup_memory_explicit_layout
- zink: stop setting ArrayStride on image arrays
Natalie Vock (1):
- aco: Make private_segment_buffer/scratch_offset per-resume
Patrick Lerda (9):
- r600: move stores to the end of shader when required
- r600: fix textures with swizzles limited to zero and one
- r600: fallback to util_blitter_draw_rectangle when required
- r600: fix pa_su_vtx_cntl rounding mode
- r600: fix points clipping
- i915: fix i915_set_vertex_buffers() related refcnt imbalance and remove redundancies
- i915: fix slab_create() related memory leaks
- i915: fix nir_to_tgsi() related memory leak
- i915: fix draw_create_fragment_shader() related memory leak
Pierre-Eric Pelloux-Prayer (1):
- winsys/amdgpu: disable VM_ALWAYS_VALID
Rob Clark (1):
- tu/vdrm: Fix userspace fence cmds
Ryan Mckeever (1):
- pan/format: Update format flags to follow HW spec
Samuel Pitoiset (4):
- radv: fix ignoring conditional rendering with vkCmdResolveImage()
- radv: determine if HiZ/HiS is enabled earlier on GFX12
- radv: add a workaround for buggy HiZ/HiS on GFX12
- radv: apply the workaround for buggy HiZ/HiS on GFX12 for DGC
Sviatoslav Peleshko (1):
- vulkan/wsi/headless: Remove unnecessary wsi_configure_image()
Tapani Pälli (3):
- compiler/glsl: check that bias is not used outside fragment stage
- mesa: clamp texbuf query size to MAX_TEXTURE_BUFFER_SIZE
- mesa: various fixes for ClearTexImage/ClearTexSubImage
Timothy Arceri (1):
- glsl: fix regression in ubo cloning
Timur Kristóf (4):
- nir/xfb: Preserve some xfb information when gathering from intrinsics.
- nir/opt_varyings: Fix assertion when deduplicating TCS outputs.
- radv: Use buffers_written mask when gathering XFB info.
- radv: Call nir_opt_undef too after nir_opt_varyings.

185
docs/relnotes/25.0.5.rst Normal file

@@ -0,0 +1,185 @@
Mesa 25.0.5 Release Notes / 2025-04-30
======================================
Mesa 25.0.5 is a bug fix release which fixes bugs found since the 25.0.4 release.
Mesa 25.0.5 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 25.0.5 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::

    SHA256: c0d245dea0aa4b49f74b3d474b16542e4a8799791cd33d676c69f650ad4378d0 mesa-25.0.5.tar.xz
    SHA512: d65e027829e3bef60bc0e3e71160e6b3721e797e2157c71dbeef0cd6e202f8f8098b3cd41159cd0e96e520eaf92ea49c2c9bb1af1a54867b6a7c551c2197c068 mesa-25.0.5.tar.xz
New features
------------
- None
Bug fixes
---------
- WWE 2k23 small "artifacts"
- Variable Rate Shading (VRS) produces very aliased results on RADV with an AMD gpu
- Vulkan issues after sleeping on 9070 XT
- ring gfx_0.0.0 timeout after waking from sleep - RX 9070
- radeonsi: CL conformance test \`vector_swizzle` fails since 177427877bb50ad7ba24abfa13e55a2684d804df
- Random mesa crashes in kwin_wayland on a 6600XT
- Patch to fix clinfo on rusticl
- radv/aco: Ghost of Tsushima hangs and causes gpu resets on RDNA 3 GPU
- mesa-vulkan-driver-git.x86_64 causes strange colored rectangle artifacts in Final Fantasy XIV
Changes
-------
Connor Abbott (1):
- tu: Fix flushing when using a staging buffer for copies
Danylo Piliaiev (1):
- tu,freedreno: Don't fallback to LINEAR with DRM_FORMAT_MOD_QCOM_COMPRESSED
David Rosca (1):
- radv: Use radv_format_to_pipe_format instead of vk_format_to_pipe_format
Dmitry Baryshkov (1):
- meson: disable SIMD blake optimisations on x32 host
Ella Stanforth (1):
- v3d/compiler: Fixup output types for all 8 outputs
Eric Engestrom (8):
- docs: add sha sum for 25.0.4
- .pick_status.json: Update to 5f3a3740dcc6d243f2ef14138fb1c09bcbb9b5fd
- pick-ui: make \`Backport-to: 25.0` backport to 25.0 \*and more recent release branches*
- aco: help clang 20 do some additions and subtractions
- .pick_status.json: Update to 091d52965f805d61dd3a8e091ac20869a794e632
- pick-ui: add missing dependency
- .pick_status.json: Update to 3493500abb78a4dc22aba14840bba5c777fde745
- .pick_status.json: Update to 5a55133ce7d5bb2419f2aa99c5296037afb7ba6a
Faith Ekstrand (2):
- nak/legalize: Take a RegFile in copy_alu_src_and_lower_fmod
- nak/sm70: Fix the bit74_75_ar_mod assert
Georg Lehmann (2):
- nir/opt_algebraic: disable fsat(a + 1.0) opt if a can be NaN
- aco: set opsel_hi to 1 for WMMA
Ian Romanick (4):
- brw/algebraic: Clear condition modifier on optimized SEL instruction
- brw/algebraic: Don't optimize float SEL.CMOD to MOV
- elk/algebraic: Clear condition modifier on optimized SEL instruction
- elk/algebraic: Don't optimize float SEL.CMOD to MOV
Janne Grunau (2):
- venus: Do not use instance pointer before NULL check
- venus: virtgpu: Require stable wire format
John Anthony (1):
- panvk: Enable VK_EXT_direct_mode_display
José Roberto de Souza (3):
- intel: Program XY_FAST_COLOR_BLT::Destination Mocs for gfx12
- intel: Fix the MOCS values in XY_FAST_COLOR_BLT for Xe2+
- intel: Fix the MOCS values in XY_BLOCK_COPY_BLT for Xe2+
Karol Herbst (2):
- rusticl/device: fix panic when disabling 3D image write support
- nir_lower_mem_access_bit_sizes: fix negative chunk offsets
Lionel Landwerlin (1):
- anv: use companion batch for operations with HIZ/STC_CCS destination
Loïc Minier (1):
- freedreno: check if GPU supported in fd_pipe_new2
Marek Olšák (1):
- radv: fix incorrect patch_outputs_read for TCS with dynamic state
Mary Guillemard (3):
- panvk: reset dyn_bufs map count to 0 in create_copy_table
- panvk: Take rasterization sample into account in indirect draw on v10+
- panvk: Take resource index in valhall_lower_get_ssbo_size
Mel Henning (3):
- nvk: SET_STATISTICS_COUNTER at start of meta_begin
- nvk: Override render enable for blits and resolves
- wsi/headless: Override finish_create
Mike Blumenkrantz (1):
- zink: verify that surface exists when adding implicit feedback loop
Olivia Lee (1):
- panfrost: allow promoting sysval UBO to push constants
Patrick Lerda (1):
- mesa_interface: fix legacy dri2 compatibility
Pierre-Eric Pelloux-Prayer (1):
- radeonsi: fix potential use after free in si_set_debug_callback
Rhys Perry (3):
- aco/gfx12: don't use second VALU for VOPD's OPX if there is a WaR
- aco: combine VALU lanemask hazard into VALUMaskWriteHazard
- aco/gfx11: create waitcnt for workgroup vmem barriers
Samuel Pitoiset (3):
- radv: only enable DCC for invisible VRAM on GFX12
- radv: fix re-emitting VRS state when rendering begins
- radv: set radv_disable_dcc=true for WWE 2k23
Tapani Pälli (2):
- iris: force reallocate on eglCreateImage with GFX >= 20
- iris: make sure to not mix compressed vs non-compressed
Tomeu Vizoso (1):
- etnaviv: Release screen->dummy_desc_reloc.bo
Yinjie Yao (2):
- gallium/pipe: Increase hevc max slice to 600
- frontends/va: Handle properly when decoding more slices than limit
Yiwei Zhang (1):
- venus: fix missing renderer destructions

182
docs/relnotes/25.0.6.rst Normal file

@@ -0,0 +1,182 @@
Mesa 25.0.6 Release Notes / 2025-05-14
======================================
Mesa 25.0.6 is a bug fix release which fixes bugs found since the 25.0.5 release.
Mesa 25.0.6 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 25.0.6 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::

    SHA256: 0d179e019e3441f5d957330d7abb3b0ef38e6782cc85a382608cd1a4a77fa2e1 mesa-25.0.6.tar.xz
    SHA512: 6a0abc8a5bbbb8ffdad7286fc5642f643b1f4183794425ba689c2c9f5c73a4131c8685074241deb1022631b4c1f1c505dbd848190ec60d5d6931e90dd9316e05 mesa-25.0.6.tar.xz
New features
------------
- None
Bug fixes
---------
- In SkinDeep, GL_LINES causes GL_INVALID_OPERATION with radeonsi and llvmpipe
- radv: UB and artifacts when copying a \`COMBINED_IMAGE_SAMPLER` with an immutable sampler
- RADV: Dynamic state multiple viewport corruption
- [drm:amdgpu_uvd_cs_pass2 [amdgpu]] \*ERROR* )Handle 0x48780001 already in use!
- glGetInternalformativ returns incorrect information for GL_STENCIL_INDEX8
- RadeonSI: Psychonauts rendering regression since !29895
- [r600g] Rejected CS when using dolphin's GPU texture decoder
- radeonsi: Assertion \`src_bit_size == bit_size' failed. when running without MESA_GLSL_DISABLE_IO_OPT=1
- radeonsi vdpau + Packed YUY2 = assert
- Indiana Jones and The Great Circle, Graphical corruption on 9070 XT.
- glPushAttrib/glPopAttrib broken with glColorMaterial and ligthing
- radv: Flickering in Kingdom Come: Deliverance II
- RADV regression causes severe glitches in Hunt Showdown 1896 on Polaris
- Z-Fighting in Tomb Raider IV - VI Remastered Linux
- RADV:RX 9070:Mesa-25.0.5 GTA 5 Enhanced GPU HANG
- [anv] VK_ERROR_DEVICE_LOST on Linux 6.13.8 while playing Dota 2 on Intel Graphics
Changes
-------
Connor Abbott (4):
- freedreno: Add compute_lb_size device info
- freedreno/a6xx: Define CONSTANTRAMMODE
- freedreno/a6xx, turnip: Set CONSTANTRAMMODE correctly
- ir3: Take LB restriction on constlen into account on a7xx
David Rosca (3):
- frontends/vdpau: Fix creating surfaces with 422 chroma
- ac/uvd: Add ac_uvd_alloc_stream_handle
- radv/video: Use ac_uvd_alloc_stream_handle
Eric Engestrom (4):
- docs: add sha sum for 25.0.5
- .pick_status.json: Update to e7a7d9ea2e2e48171fad131a7bfa7576e02ea4e0
- .pick_status.json: Mark eeffb4e674d10db9aefebeca91c2d87c1676b81e as denominated
- .pick_status.json: Mark 4b76d04f7f3348838239f184e68141df6409b67a as denominated
Faith Ekstrand (1):
- nak: Set lower_pack_64_4x16
Gurchetan Singh (1):
- gfxstream: make sure by default descriptor is negative
José Roberto de Souza (1):
- intel/tools: Fix batch buffer decoder
Karmjit Mahil (1):
- tu: Fix segfault in fail_submit KGSL path
Karol Herbst (4):
- r600: fix r600_buffer_from_user_memory for rusticl
- iris: parse global bindings for every gen
- iris/xe: fix compute shader start address
- iris/xe: take the grids variable_shared_mem into account
Konstantin Seurer (1):
- radv: Return VK_ERROR_INCOMPATIBLE_DRIVER for unsupported devices
Lars-Ivar Hesselberg Simonsen (4):
- pan/texture: Correctly handle slice stride for MSAA
- pan/texture: Set plane size to slice size
- pan/genxml/v10: Add minus1 mod for plane width/height
- pan/texture/v10+: Set width/height in the plane descs
Lionel Landwerlin (3):
- anv: force fragment shader execution when occlusion queries are active
- intel: fix null render target setup logic
- vulkan/runtime: fixup assert with link_geom_stages
Marek Olšák (2):
- nir/opt_vectorize_io: fix a failure when vectorizing different bit sizes
- nir: fix gathering color interp modes in nir_lower_color_inputs
Matthieu Oechslin (1):
- r600: Take dual source blending in account when creating target mask with RATs
Mel Henning (3):
- nak: Remove hfma2 src 1 modifiers
- nak: Add Src::is_unmodified() helper
- nak: Check that swizzles are none
Mike Blumenkrantz (2):
- egl: fix sw fallback rejection in non-sw EGL_PLATFORM=device
- zink: fix broken comparison for dummy pipe surface sizing
Natalie Vock (2):
- radv,driconf: Add radv_force_64k_sparse_alignment config
- driconf: Add workarounds for DOOM: The Dark Ages
Paul Gofman (1):
- radv/amdgpu: Fix hash key in radv_amdgpu_winsys_destroy().
Rhys Perry (3):
- aco: swap the correct v_mov_b32 if there are two of them
- ac/llvm: correctly split vector 8/16-bit stores
- ac/llvm: correctly set alignment of vector global load/store
Robert Mader (1):
- llvmpipe: Fix dmabuf import paths for DRM_FORMAT_YUYV variants
Sagar Ghuge (2):
- intel/compiler: Fix stackIDs on Xe2+
- anv: Fix untyped data port cache pipe control dump output
Samuel Pitoiset (7):
- radv: do not clear unwritten color attachments with dual-source blending
- radv: disable SINGLE clear codes to workaround a hw bug with DCC on GFX11
- radv: fix GPU hangs with image copies for ASTC/ETC2 formats on transfer queue
- radv: ignore radv_disable_dcc_stores on GFX12
- radv: fix SDMA copies for linear 96-bits formats
- radv: fix emitting dynamic viewports/scissors when the count is static
- radv: remove the optimization for equal immutable samplers
Tapani Pälli (1):
- mesa: add missing stencil formats to _mesa_is_stencil_format
Thomas H.P. Andersen (1):
- driconf: update X4 Foundations executable name
Timothy Arceri (3):
- util/driconf: add force_gl_depth_component_type_int workaround
- mesa: fix color material tracking
- mesa: relax EXT_texture_integer validation

199
docs/relnotes/25.0.7.rst Normal file

@@ -0,0 +1,199 @@
Mesa 25.0.7 Release Notes / 2025-05-28
======================================
Mesa 25.0.7 is a bug fix release which fixes bugs found since the 25.0.6 release.
Mesa 25.0.7 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 25.0.7 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------

::

    SHA256: 592272df3cf01e85e7db300c449df5061092574d099da275d19e97ef0510f8a6  mesa-25.0.7.tar.xz
    SHA512: 825bbd8bc5507de147488519786c0200afacf97dae621c80ead24b2c5dd55c5a442757ac8452698ae611e9344025465080795cf8f2dc4eb7ce07b5cc521b2b5c  mesa-25.0.7.tar.xz
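The published checksums can be compared against a downloaded tarball before unpacking it. The sketch below uses only the Python standard library; the helper names ``file_digest_hex`` and ``matches_published`` are hypothetical and not part of any Mesa tooling, and the tarball path is whatever the reader downloaded.

```python
import hashlib
import sys


def file_digest_hex(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True when the file's digest equals the published hex string."""
    return file_digest_hex(path, algorithm) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Usage: python verify.py mesa-25.0.7.tar.xz <published SHA256>
    if len(sys.argv) == 3:
        tarball, published = sys.argv[1], sys.argv[2]
        print("OK" if matches_published(tarball, published) else "MISMATCH")
```

Passing ``algorithm="sha512"`` checks the second published value in the same way.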
New features
------------
- None
Bug fixes
---------
- RX9070 hard crash with Mafia Definitive Edition
- RADV: Potential bug with vulkan fragment shader interpolation (on outputs from mesh shaders?)
- In the game "Foundation" a buildings areas of effect is missing
- ANV: Dota 2 May 22 2025 update crashing in vkCmdBindDescriptorSets with no validation error
- [RADV][GFX9] Recent Mesa-git broken on AMD Vega 64 with ring sdma0 timeouts when launching DXVK games
- Vulkan Video engages during playback of format which is not supported by my Fiji GPU
- ACO: IR Validation error "SDWA operand selection size" triggered by compute shader on VEGA20
- RADV: Gibberish output with llama.cpp (Vulkan compute) on Radeon VII (Vega20) with Mesa 25.1.0, works on 25.0.5
- Blending broken in game SkinDeep
- Radeon R5 (Mullins) H264 VA-API encoding acceleration doesn't work
- nvk: lib_rs_gen.py requires \`rustfmt`
- radv: vkd3d-proton test failure with predication + EXT_dgc
- mesa-25.0.4 aborts Xserver due to ACO "Unsupported opcode" v_lshlrev_b16
Changes
-------
Adam Jackson (2):
- vtn: (Silently) handle FunctionParameterAttributeNo{Capture,Write}
- vtn/opencl: Handle OpenCLstd_F{Min,Max}_common
Calder Young (2):
- iris: Fix accidental writes to global dirty bit instead of local
- iris: set dependency between SF_CL and CC states
Christian Gmeiner (1):
- zink: Fix NIR validation error in cubemap-to-array lowering
Dave Airlie (1):
- nvk: Fix compute class comparison in dispatch indirect
David Rosca (4):
- radeonsi/vce: Fix bitstream buffer size
- radeonsi/vce: Only send one task per IB
- radeonsi/vce: Fix output quality and performance in speed preset
- radv/video: Limit 10bit H265 decode support to stoney and newer
Ella Stanforth (1):
- v3d/compiler: Fix ub when using memcmp for texture comparisons.
Eric Engestrom (3):
- docs: add sha sum for 25.0.6
- .pick_status.json: Mark 29d7b90cfcb67ecc2ff3e422dd7b38898abb1bbe as denominated
- .pick_status.json: Update to 8965e60118fa17407c5bfcdca1fe2854ad2fb150
Erik Faye-Lund (1):
- mesa/main: remove non-existing function prototype
Faith Ekstrand (2):
- nvk: Allocate the correct VAB size on Kepler
- nouveau/mme: Don't install the HW tests
Georg Lehmann (2):
- radeonsi: always lower alu bit sizes
- aco: assume sram ecc is enabled on Vega20
Gurchetan Singh (1):
- gfxstream: get rid of logspam in virtualized case
Hans-Kristian Arntzen (1):
- radv: Consider that DGC might need shader reads of predicated data.
José Roberto de Souza (2):
- anv: Implement missing part of Wa_1604061319
- anv: Enable preemption due 3DPRIMITIVE in GFX 12
Karol Herbst (2):
- nir: fix use-after-free on function parameter names
- vtn: fix use-after-free on function parameter names
Lars-Ivar Hesselberg Simonsen (2):
- panvk/v9+: Set up limited texture descs for storage use
- panvk/v9+: Set up limited texture descs for storage use
LingMan (1):
- entaviv/isa: Silence warnings about non snake case names
Lionel Landwerlin (4):
- anv: enable preemption setting on command/batch correctly
- anv/brw: stop turning load_push_constants into load_uniform
- hasvk/elk: stop turning load_push_constants into load_uniform
- anv: don't use pipeline layout at descriptor bind
Marek Olšák (2):
- winsys/amdgpu: fix running out of 32bit address space with high FPS
- glsl: fix sampler and image type checking in lower_precision
Matt Turner (1):
- gallivm: Use \`llvm.roundeven` in lp_build_round()
Mel Henning (2):
- nouveau/headers: Run rustfmt after file is closed
- nouveau/headers: Ignore PermissionError in rustfmt
Mike Blumenkrantz (2):
- llvmpipe: disable conditional rendering mem for blits
- lavapipe: handle counterOffset in vkCmdDrawIndirectByteCountEXT
Natalie Vock (1):
- driconf: Fix DOOM: The Dark Ages workaround name in 25.0.x
Olivia Lee (1):
- util/u_printf: fix memory leak in u_printf_singleton_add_serialized
Patrick Lerda (1):
- r600: fix pop-free clipping
Paulo Zanoni (1):
- anv/trtt: don't avoid the TR-TT submission when there is stuff to signal
Qiang Yu (1):
- nir/opt_varyings: fix mesh shader miss promote varying to flat
Rhys Perry (1):
- aco/gfx115: consider point sample acceleration
Rob Clark (1):
- ci: Disable fd-farm
Samuel Pitoiset (5):
- radv: fix fetching conditional rendering state for DGC preprocess
- radv: fix conditional rendering with DGC and non native 32-bit predicate
- radv: fix missing texel scale for unaligned linear SDMA copies
- radv: fix capture/replay with sparse images and descriptor buffer
- radv: add radv_disable_hiz_his_gfx12 and enable for Mafia Definitive Edition
Timothy Arceri (7):
- st/mesa: fix _IntegerBuffers bitfield use
- mesa/st: fix _BlendForceAlphaToOneDraw bitfield use
- mesa/st: fix _IsRGBDraw bitfield use
- mesa: fix _FP32Buffers bitfield use
- mesa: update validation when draw buffer changes
- mesa: extend linear_as_nearest work around
- util: add workaround for the game Foundation

View File

@@ -1,40 +0,0 @@
cl_khr_depth_images in rusticl
Vulkan 1.4 on radv/gfx8+
VK_KHR_dedicated_allocation on panvk
VK_KHR_global_priority on panvk
VK_KHR_index_type_uint8 on panvk
VK_KHR_map_memory2 on panvk
VK_KHR_multiview on panvk/v10+
VK_KHR_shader_non_semantic_info on panvk
VK_KHR_shader_relaxed_extended_instruction on panvk
VK_KHR_vertex_attribute_divisor on panvk
VK_KHR_zero_initialize_workgroup_memory on panvk
VK_KHR_shader_draw_parameters on panvk
VK_KHR_shader_float16_int8 on panvk
VK_KHR_8bit_storage on panvk
VK_EXT_4444_formats on panvk
VK_EXT_global_priority on panvk
VK_EXT_global_priority_query on panvk
VK_EXT_host_query_reset on panvk
VK_EXT_image_robustness on panvk
VK_EXT_pipeline_robustness on panvk
VK_EXT_provoking_vertex on panvk
VK_EXT_queue_family_foreign on panvk
VK_EXT_sampler_filter_minmax on panvk
VK_EXT_scalar_block_layout on panvk
VK_EXT_tooling_info on panvk
depthClamp on panvk
depthBiasClamp on panvk
drawIndirectFirstInstance on panvk
fragmentStoresAndAtomics on panvk/v10+
sampleRateShading on panvk
occlusionQueryPrecise on panvk
shaderInt16 on panvk
shaderInt64 on panvk
imageCubeArray on panvk
VK_KHR_depth_clamp_zero_one on RADV
VK_KHR_maintenance8 on radv
VK_KHR_shader_subgroup_rotate on panvk/v10+
Vulkan 1.1 on panvk/v10+
VK_EXT_subgroup_size_control on panvk/v10+
initial GFX12 (RDNA4) support on RADV

View File

@@ -136,7 +136,9 @@ following example::
Backport-to: 21.0
Multiple ``Backport-to:`` lines are allowed.
This will backport the commit to the 21.0 branch, as well as any more recent
stable branch. Multiple ``Backport-to:`` lines are allowed, but only the
lowest number mentioned actually matters, so for clarity, please only use one.
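The "lowest number wins" rule can be illustrated with a short sketch. This is a hypothetical helper written for illustration, not the script Mesa's release tooling actually uses; it assumes branch names of the form ``MAJOR.MINOR`` and one trailer per line.

```python
import re

# Matches lines such as "Backport-to: 21.0" anywhere in a commit message.
_BACKPORT_RE = re.compile(r"^[ \t]*Backport-to:[ \t]*(\d+)\.(\d+)[ \t]*$", re.MULTILINE)


def effective_backport_target(commit_message: str):
    """Return the lowest (major, minor) named by Backport-to: trailers, or None."""
    versions = [(int(major), int(minor)) for major, minor in _BACKPORT_RE.findall(commit_message)]
    return min(versions) if versions else None


if __name__ == "__main__":
    message = (
        "radv: fix something\n"
        "\n"
        "Backport-to: 25.0\n"
        "Backport-to: 24.3\n"
    )
    # Only the lowest branch matters; newer stable branches are implied.
    print(effective_backport_target(message))
```

Versions are compared as integer tuples rather than strings, so that a branch such as 9.2 sorts below 10.0.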
The last option is deprecated and mostly here for historical reasons
dating back to when patch submission was done via emails: using a ``Cc:``

View File

@@ -652,13 +652,17 @@ struct drm_amdgpu_gem_userptr {
/* GFX12 and later: */
#define AMDGPU_TILING_GFX12_SWIZZLE_MODE_SHIFT 0
#define AMDGPU_TILING_GFX12_SWIZZLE_MODE_MASK 0x7
/* These are DCC recompression setting for memory management: */
/* These are DCC recompression settings for memory management: */
#define AMDGPU_TILING_GFX12_DCC_MAX_COMPRESSED_BLOCK_SHIFT 3
#define AMDGPU_TILING_GFX12_DCC_MAX_COMPRESSED_BLOCK_MASK 0x3 /* 0:64B, 1:128B, 2:256B */
#define AMDGPU_TILING_GFX12_DCC_NUMBER_TYPE_SHIFT 5
#define AMDGPU_TILING_GFX12_DCC_NUMBER_TYPE_MASK 0x7 /* CB_COLOR0_INFO.NUMBER_TYPE */
#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_SHIFT 8
#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_MASK 0x3f /* [0:4]:CB_COLOR0_INFO.FORMAT, [5]:MM */
/* When clearing the buffer or moving it from VRAM to GTT, don't compress and set DCC metadata
* to uncompressed. Set when parts of an allocation bypass DCC and read raw data. */
#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_SHIFT 14
#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_MASK 0x1
/* bit gap */
#define AMDGPU_TILING_GFX12_SCANOUT_SHIFT 63
#define AMDGPU_TILING_GFX12_SCANOUT_MASK 0x1
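/* Illustration only, not part of the header: a minimal Python sketch of how
 * the SHIFT/MASK pairs above combine into one 64-bit tiling word. The shift
 * and mask values are copied from the defines in this hunk; the dictionary
 * and the helper names tiling_set / tiling_get are assumptions made for this
 * example (the header itself expresses the same idea with C macros).

```python
# (shift, mask) pairs copied from the GFX12 defines in this hunk.
GFX12_TILING_FIELDS = {
    "SWIZZLE_MODE": (0, 0x7),
    "DCC_MAX_COMPRESSED_BLOCK": (3, 0x3),
    "DCC_NUMBER_TYPE": (5, 0x7),
    "DCC_DATA_FORMAT": (8, 0x3F),
    "DCC_WRITE_COMPRESS_DISABLE": (14, 0x1),
    "SCANOUT": (63, 0x1),
}

U64_MAX = (1 << 64) - 1


def tiling_set(field: str, value: int) -> int:
    """Mask the value to the field width, then move it to the field position."""
    shift, mask = GFX12_TILING_FIELDS[field]
    return ((value & mask) << shift) & U64_MAX


def tiling_get(flags: int, field: str) -> int:
    """Extract one field from a packed 64-bit tiling word."""
    shift, mask = GFX12_TILING_FIELDS[field]
    return (flags >> shift) & mask


if __name__ == "__main__":
    flags = (
        tiling_set("SWIZZLE_MODE", 5)
        | tiling_set("DCC_WRITE_COMPRESS_DISABLE", 1)
        | tiling_set("SCANOUT", 1)
    )
    print(hex(flags), tiling_get(flags, "DCC_WRITE_COMPRESS_DISABLE"))
```

 * Because each value is masked before shifting, an out-of-range value cannot
 * spill into a neighbouring field, such as the new
 * DCC_WRITE_COMPRESS_DISABLE bit at position 14. */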

View File

@@ -277,7 +277,11 @@ CHIPSET(0xe202, bmg, "BMG G21", "Intel(R) Graphics")
CHIPSET(0xe20b, bmg, "BMG G21", "Intel(R) Graphics")
CHIPSET(0xe20c, bmg, "BMG G21", "Intel(R) Graphics")
CHIPSET(0xe20d, bmg, "BMG G21", "Intel(R) Graphics")
CHIPSET(0xe210, bmg, "BMG G21", "Intel(R) Graphics")
CHIPSET(0xe211, bmg, "BMG G21", "Intel(R) Graphics")
CHIPSET(0xe212, bmg, "BMG G21", "Intel(R) Graphics")
CHIPSET(0xe215, bmg, "BMG", "Intel(R) Graphics")
CHIPSET(0xe216, bmg, "BMG", "Intel(R) Graphics")
CHIPSET(0xb080, ptl, "PTL", "Intel(R) Graphics", FORCE_PROBE)
CHIPSET(0xb081, ptl, "PTL", "Intel(R) Graphics", FORCE_PROBE)

View File

@@ -525,6 +525,8 @@ if not have_mtls_dialect
# cross-compiling, but because this is just an optimization we can skip it
if meson.is_cross_build() and not meson.can_run_host_binaries()
warning('cannot auto-detect -mtls-dialect when cross-compiling, using compiler default')
elif host_machine.system() == 'freebsd'
warning('cannot use -mtls-dialect for FreeBSD, using compiler default')
else
# The way to specify the TLSDESC dialect is architecture-specific.
# We probe both because there is not a fallback guaranteed to work for all
@@ -766,6 +768,8 @@ endif
_opencl = get_option('gallium-opencl')
_rtti = get_option('cpp_rtti')
if _opencl != 'disabled'
warning('Clover will be removed in Mesa 25.2')
if not with_gallium
error('OpenCL Clover implementation requires at least one gallium driver.')
endif

View File

@@ -151,6 +151,7 @@ option(
choices : ['icd', 'standalone', 'disabled'],
value : 'disabled',
description : 'build gallium "clover" OpenCL frontend.',
deprecated: true,
)
option(

View File

@@ -191,6 +191,9 @@
HWCI_KERNEL_MODULES: amdgpu
KERNEL_IMAGE_TYPE: ""
RUNNER_TAG: mesa-ci-x86-64-lava-asus-CM1400CXA-dalboz
# Force fixed 6.6 kernel, amdgpu doesn't recover from GPU resets on 6.13
# https://gitlab.freedesktop.org/drm/amd/-/issues/3861
EXTERNAL_KERNEL_TAG: "v6.6.21-mesa-f8ea"
# Status: https://lava.collabora.dev/scheduler/device_type/lenovo-TPad-C13-Yoga-zork
.lava-lenovo-TPad-C13-Yoga-zork:x86_64:
@@ -204,6 +207,9 @@
HWCI_KERNEL_MODULES: amdgpu
KERNEL_IMAGE_TYPE: ""
RUNNER_TAG: mesa-ci-x86-64-lava-lenovo-TPad-C13-Yoga-zork
# Force fixed 6.6 kernel, amdgpu doesn't recover from GPU resets on 6.13
# https://gitlab.freedesktop.org/drm/amd/-/issues/3861
EXTERNAL_KERNEL_TAG: "v6.6.21-mesa-f8ea"
# Status: https://lava.collabora.dev/scheduler/device_type/hp-x360-14a-cb0001xx-zork
.lava-hp-x360-14a-cb0001xx-zork:x86_64:
@@ -217,6 +223,9 @@
HWCI_KERNEL_MODULES: amdgpu
KERNEL_IMAGE_TYPE: ""
RUNNER_TAG: mesa-ci-x86-64-lava-hp-x360-14a-cb0001xx-zork
# Force fixed 6.6 kernel, amdgpu doesn't recover from GPU resets on 6.13
# https://gitlab.freedesktop.org/drm/amd/-/issues/3861
EXTERNAL_KERNEL_TAG: "v6.6.21-mesa-f8ea"
############### LAVA
@@ -397,7 +406,7 @@
tags:
- farm:$RUNNER_FARM_LOCATION
- amdgpu:codename:VANGOGH
- $VALVE_INFRA_VANGOGH_JOB_PRIORITY
- $CI_TRON_JOB_PRIORITY_TAG
.navi31-test-valve:
variables:

View File

@@ -1,10 +1,7 @@
glx@glx-make-current,Fail
glx@glx-multi-window-single-context,Fail
glx@glx-visuals-depth -pixmap,Fail
glx@glx-visuals-stencil -pixmap,Fail
glx@glx-swap-event_async,Fail
glx@glx-swap-pixmap-bad,Fail
glx@glx_arb_create_context_no_error@no error,Fail
spec@!opengl 1.0@rasterpos,Fail
spec@!opengl 1.0@rasterpos@glsl_vs_gs_linked,Fail
spec@!opengl 1.0@rasterpos@glsl_vs_tes_linked,Fail
@@ -13,8 +10,6 @@ spec@!opengl 3.2@gl-3.2-adj-prims cull-front pv-last,Fail
spec@!opengl 3.2@gl-3.2-adj-prims line cull-back pv-last,Fail
spec@!opengl 3.2@gl-3.2-adj-prims line cull-front pv-last,Fail
spec@!opengl 3.2@gl-3.2-adj-prims pv-last,Fail
spec@arb_program_interface_query@arb_program_interface_query-getprogramresourceindex,Fail
spec@arb_program_interface_query@arb_program_interface_query-getprogramresourceindex@'vs_input2[1][0]' on GL_PROGRAM_INPUT,Fail
spec@arb_shading_language_packing@execution@built-in-functions@fs-packhalf2x16,Fail
spec@arb_shading_language_packing@execution@built-in-functions@vs-packhalf2x16,Fail
spec@egl 1.4@eglterminate then unbind context,Fail

View File

@@ -1,6 +1,5 @@
glx@glx-multi-window-single-context,Fail
glx@glx-swap-pixmap-bad,Fail
glx@glx-visuals-depth -pixmap,Fail
glx@glx-visuals-stencil -pixmap,Fail
glx@glx_arb_create_context_no_error@no error,Fail
spec@!opengl 1.0@gl-1.0-user-clip-all-planes,Fail
@@ -151,3 +150,7 @@ spec@arb_fragment_layer_viewport@layer-gs-writes-out-of-range,Fail
# glcts update
KHR-GLES3.clip_distance.coverage,Fail
KHR-GLES3.cull_distance.functional,Fail
# since hetzner migration
spec@ext_external_objects@vk-ping-pong-multi-sem,Fail
spec@ext_external_objects@vk-ping-pong-single-sem,Crash
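# Illustration only, not part of the expectation file: the entries above
# follow a simple shape, one test per line, an optional trailing ",Status",
# and "#" for comments. A minimal parsing sketch, assuming exactly that shape
# (the function name is hypothetical, and a status-less test name containing
# a comma would be mis-split):

```python
def parse_expectations(text: str) -> dict:
    """Map each test name to its expected status (None when no status is given)."""
    expectations = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, separator, status = line.rpartition(",")
        if not separator:
            # Entry without a status, as in skip or flake lists.
            name, status = line, None
        expectations[name] = status
    return expectations


if __name__ == "__main__":
    sample = (
        "# since hetzner migration\n"
        "spec@ext_external_objects@vk-ping-pong-multi-sem,Fail\n"
        "spec@ext_external_objects@vk-ping-pong-single-sem,Crash\n"
    )
    for test_name, expected in parse_expectations(sample).items():
        print(test_name, expected)
```

# Splitting on the last comma keeps test names that contain spaces or
# brackets intact.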

View File

@@ -27,3 +27,6 @@ dEQP-GLES3.functional.occlusion_query.conservative_scissor_stencil_clear
dEQP-GLES3.functional.occlusion_query.conservative_depth_clear
dEQP-GLES3.functional.occlusion_query.scissor_depth_clear_stencil_write_stencil_clear
dEQP-GLES3.functional.occlusion_query.conservative_depth_write_stencil_clear
# since hetzner migration
spec@!opengl 1.0@gl-1.0-ortho-pos

View File

@@ -16,8 +16,6 @@ spec@!opengl 1.0@rasterpos@glsl_vs_tes_linked,Fail
spec@!opengl 1.1@line-smooth-stipple,Fail
spec@arb_fragment_layer_viewport@layer-gs-writes-out-of-range,Fail
spec@arb_pipeline_statistics_query@arb_pipeline_statistics_query-frag,Fail
spec@arb_program_interface_query@arb_program_interface_query-getprogramresourceindex,Fail
spec@arb_program_interface_query@arb_program_interface_query-getprogramresourceindex@'vs_input2[1][0]' on GL_PROGRAM_INPUT,Fail
spec@arb_shader_texture_lod@execution@arb_shader_texture_lod-texgradcube,Fail
spec@arb_shading_language_packing@execution@built-in-functions@fs-packhalf2x16,Fail
spec@arb_shading_language_packing@execution@built-in-functions@vs-packhalf2x16,Fail

View File

@@ -8,11 +8,3 @@ dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving_transfe
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.2_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.4_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.8_bit,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_optimal,Fail

View File

@@ -8,11 +8,3 @@ dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving_transfe
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.2_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.4_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.8_bit,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_optimal,Fail

View File

@@ -8,27 +8,3 @@ dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving_transfe
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.2_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.4_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.8_bit,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.general_optimal,Fail

View File

@@ -8,27 +8,3 @@ dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving_transfe
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.2_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.4_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.8_bit,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.optimal_optimal,Fail

@@ -1,9 +0,0 @@
# RADV_PERFTEST=transfer_queue hangs
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_optimal
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_optimal
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_optimal
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_optimal

@@ -8,27 +8,3 @@ dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving_transfe
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.2_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.4_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.8_bit,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_separate_layouts.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_general,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_optimal,Fail
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_separate_layouts.optimal_optimal,Fail

@@ -1,9 +0,0 @@
# RADV_PERFTEST=transfer_queue hangs
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.general_optimal
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d16_unorm_s8_uint_d16_unorm_s8_uint_depth_stencil_aspects.optimal_optimal
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.general_optimal
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_general
dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.2d_to_2d.d32_sfloat_s8_uint_d32_sfloat_s8_uint_depth_stencil_aspects.optimal_optimal

@@ -249,7 +249,6 @@ gfx6_init_graphics_preamble_state(const struct ac_preamble_state *state,
/* CLEAR_STATE doesn't clear these correctly on certain generations.
* I don't know why. Deduced by trial and error.
*/
ac_pm4_set_reg(pm4, R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET, 0);
ac_pm4_set_reg(pm4, R_028204_PA_SC_WINDOW_SCISSOR_TL, S_028204_WINDOW_OFFSET_DISABLE(1));
ac_pm4_set_reg(pm4, R_028030_PA_SC_SCREEN_SCISSOR_TL, 0);
}
@@ -678,7 +677,6 @@ gfx12_init_graphics_preamble_state(const struct ac_preamble_state *state,
ac_pm4_set_reg(pm4, R_028AA0_VGT_DRAW_PAYLOAD_CNTL, 0);
ac_pm4_set_reg(pm4, R_028ABC_DB_HTILE_SURFACE, 0);
ac_pm4_set_reg(pm4, R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET, 0);
ac_pm4_set_reg(pm4, R_028B50_VGT_TESS_DISTRIBUTION,
S_028B50_ACCUM_ISOLINE(128) |
S_028B50_ACCUM_TRI(128) |

@@ -542,8 +542,9 @@ static void handle_env_var_force_family(struct radeon_info *info)
exit(1);
}
bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
bool require_pci_bus_info)
enum ac_query_gpu_info_result
ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
bool require_pci_bus_info)
{
struct amdgpu_gpu_info amdinfo;
struct drm_amdgpu_info_device device_info = {0};
@@ -567,7 +568,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
if (!ac_query_pci_bus_info(fd, info)) {
if (require_pci_bus_info)
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
assert(info->drm_major == 3);
@@ -577,27 +578,27 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
fprintf(stderr, "amdgpu: DRM version is %u.%u.%u, but this driver is "
"only compatible with 3.27.0 (kernel 4.20+) or later.\n",
info->drm_major, info->drm_minor, info->drm_patchlevel);
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
uint64_t cap;
r = drmGetCap(fd, DRM_CAP_SYNCOBJ, &cap);
if (r != 0 || cap == 0) {
fprintf(stderr, "amdgpu: syncobj support is missing but is required.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
/* Query hardware and driver information. */
r = ac_drm_query_gpu_info(dev, &amdinfo);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_gpu_info failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
r = ac_drm_query_info(dev, AMDGPU_INFO_DEV_INFO, sizeof(device_info), &device_info);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_info(dev_info) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
for (unsigned ip_type = 0; ip_type < AMD_NUM_IP_TYPES; ip_type++) {
@@ -660,35 +661,35 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
/* Only require gfx or compute. */
if (!info->ip[AMD_IP_GFX].num_queues && !info->ip[AMD_IP_COMPUTE].num_queues) {
fprintf(stderr, "amdgpu: failed to find gfx or compute.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
r = ac_drm_query_firmware_version(dev, AMDGPU_INFO_FW_GFX_ME, 0, 0, &info->me_fw_version,
&info->me_fw_feature);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_firmware_version(me) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
r = ac_drm_query_firmware_version(dev, AMDGPU_INFO_FW_GFX_MEC, 0, 0, &info->mec_fw_version,
&info->mec_fw_feature);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_firmware_version(mec) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
r = ac_drm_query_firmware_version(dev, AMDGPU_INFO_FW_GFX_PFP, 0, 0, &info->pfp_fw_version,
&info->pfp_fw_feature);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_firmware_version(pfp) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
if (info->ip[AMD_IP_VCN_DEC].num_queues || info->ip[AMD_IP_VCN_UNIFIED].num_queues) {
r = ac_drm_query_firmware_version(dev, AMDGPU_INFO_FW_VCN, 0, 0, &vidip_fw_version, &vidip_fw_feature);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_firmware_version(vcn) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
} else {
info->vcn_dec_version = (vidip_fw_version & 0x0F000000) >> 24;
info->vcn_enc_major_version = (vidip_fw_version & 0x00F00000) >> 20;
@@ -699,7 +700,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
r = ac_drm_query_firmware_version(dev, AMDGPU_INFO_FW_VCE, 0, 0, &vidip_fw_version, &vidip_fw_feature);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_firmware_version(vce) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
} else
info->vce_fw_version = vidip_fw_version;
}
@@ -708,7 +709,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
r = ac_drm_query_firmware_version(dev, AMDGPU_INFO_FW_UVD, 0, 0, &vidip_fw_version, &vidip_fw_feature);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_firmware_version(uvd) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
} else
info->uvd_fw_version = vidip_fw_version;
}
@@ -717,7 +718,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
r = ac_drm_query_sw_info(dev, amdgpu_sw_info_address32_hi, &info->address32_hi);
if (r) {
fprintf(stderr, "amdgpu: amdgpu_query_sw_info(address32_hi) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
struct drm_amdgpu_memory_info meminfo = {0};
@@ -725,7 +726,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
r = ac_drm_query_info(dev, AMDGPU_INFO_MEMORY, sizeof(meminfo), &meminfo);
if (r) {
fprintf(stderr, "amdgpu: ac_drm_query_info(memory) failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
/* Note: usable_heap_size values can be random and can't be relied on. */
@@ -865,7 +866,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
else {
fprintf(stderr, "amdgpu: Unknown gfx version: %u.%u\n",
info->ip[AMD_IP_GFX].ver_major, info->ip[AMD_IP_GFX].ver_minor);
return false;
return AC_QUERY_GPU_INFO_UNIMPLEMENTED_HW;
}
info->family_id = device_info.family;
@@ -880,7 +881,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
if (!info->name) {
fprintf(stderr, "amdgpu: unknown (family_id, chip_external_rev): (%u, %u)\n",
device_info.family, device_info.external_rev);
return false;
return AC_QUERY_GPU_INFO_UNIMPLEMENTED_HW;
}
memset(info->lowercase_name, 0, sizeof(info->lowercase_name));
@@ -1255,6 +1256,15 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
info->has_vgt_flush_ngg_legacy_bug = info->gfx_level == GFX10 ||
info->family == CHIP_NAVI21;
/* GFX10-GFX10.3 (tested on NAVI10, NAVI21 and NAVI24 but likely all) are
* affected by a hw bug when primitive restart is updated and no context
* registers are written between draws. One workaround is to emit
* SQ_NON_EVENT(0) which is a NOP packet that adds a small delay and seems
* to fix it reliably.
*/
info->has_prim_restart_sync_bug = info->gfx_level == GFX10 ||
info->gfx_level == GFX10_3;
/* First Navi2x chips have a hw bug that doesn't allow to write
* depth/stencil from a FS for multi-pixel fragments.
*/
@@ -1450,6 +1460,11 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
*/
info->gfx12_supports_display_dcc = info->gfx_level >= GFX12 && info->drm_minor >= 58;
/* Until DRM minor version 60, AMDGPU always enables DCC compressed writes
* when a BO is moved back to VRAM.
*/
info->gfx12_supports_dcc_write_compress_disable = info->gfx_level >= GFX12 && info->drm_minor >= 60;
info->has_stable_pstate = info->drm_minor >= 45;
if (info->gfx_level >= GFX12) {
@@ -1691,7 +1706,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
r = ac_drm_query_uq_fw_area_info(dev, AMDGPU_HW_IP_GFX, 0, &fw_info);
if (r) {
fprintf(stderr, "amdgpu: amdgpu_query_uq_fw_area_info() failed.\n");
return false;
return AC_QUERY_GPU_INFO_FAIL;
}
info->has_fw_based_shadowing = true;
@@ -1754,7 +1769,7 @@ bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
exit(0);
}
}
return true;
return AC_QUERY_GPU_INFO_SUCCESS;
}
void ac_compute_driver_uuid(char *uuid, size_t size)

@@ -104,6 +104,7 @@ struct radeon_info {
bool has_image_load_dcc_bug;
bool has_two_planes_iterate256_bug;
bool has_vgt_flush_ngg_legacy_bug;
bool has_prim_restart_sync_bug;
bool has_cs_regalloc_hang_bug;
bool has_async_compute_threadgroup_bug;
bool has_async_compute_align32_bug;
@@ -161,6 +162,7 @@ struct radeon_info {
/* Allocate both aligned and unaligned DCC and use the retile blit. */
bool use_display_dcc_with_retile_blit;
bool gfx12_supports_display_dcc;
bool gfx12_supports_dcc_write_compress_disable;
/* Memory info. */
uint32_t pte_fragment_size;
@@ -327,8 +329,14 @@ struct radeon_info {
bool has_image_bvh_intersect_ray;
};
bool ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
bool require_pci_bus_info);
enum ac_query_gpu_info_result {
AC_QUERY_GPU_INFO_SUCCESS,
AC_QUERY_GPU_INFO_FAIL,
AC_QUERY_GPU_INFO_UNIMPLEMENTED_HW,
};
enum ac_query_gpu_info_result ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
bool require_pci_bus_info);
void ac_compute_driver_uuid(char *uuid, size_t size);
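
The hunks above change the return type of ac_query_gpu_info from bool to a three-valued enum, so a caller can tell unimplemented hardware apart from a generic failure. The sketch below is only an illustration of how a caller might branch on the result. The enum is redeclared so the snippet compiles on its own, and describe_query_result is a hypothetical helper that does not appear in the diff.

```c
#include <assert.h>
#include <string.h>

/* Same enumerators as the enum added in the hunk above, redeclared here so
 * that this sketch is self-contained. */
enum ac_query_gpu_info_result {
   AC_QUERY_GPU_INFO_SUCCESS,
   AC_QUERY_GPU_INFO_FAIL,
   AC_QUERY_GPU_INFO_UNIMPLEMENTED_HW,
};

/* Hypothetical helper (not part of the diff): map each result to the action
 * a caller might take. */
static const char *describe_query_result(enum ac_query_gpu_info_result r)
{
   switch (r) {
   case AC_QUERY_GPU_INFO_SUCCESS:
      return "continue";
   case AC_QUERY_GPU_INFO_UNIMPLEMENTED_HW:
      return "skip-device";
   case AC_QUERY_GPU_INFO_FAIL:
   default:
      return "error";
   }
}
```

Because AC_QUERY_GPU_INFO_SUCCESS is the first enumerator and therefore equals 0, a leftover boolean-style check such as `if (!result)` would now be true on success, so comparing against the enumerators explicitly is the safer pattern.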

@@ -65,6 +65,10 @@
#define AMDGPU_TILING_GFX12_DCC_NUMBER_TYPE_MASK 0x7
#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_SHIFT 8
#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_MASK 0x3f
/* When clearing the buffer or moving it from VRAM to GTT, don't compress and set DCC metadata
* to uncompressed. Set when parts of an allocation bypass DCC and read raw data. */
#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_SHIFT 14
#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_MASK 0x1
#define AMDGPU_TILING_SET(field, value) \
(((__u64)(value) & AMDGPU_TILING_##field##_MASK) << AMDGPU_TILING_##field##_SHIFT)
#define AMDGPU_TILING_GET(value, field) \
@@ -1193,6 +1197,59 @@ static void ac_compute_cmask(const struct radeon_info *info, const struct ac_sur
surf->cmask_size = surf->cmask_slice_size * num_layers;
}
static uint64_t ac_estimate_size(const struct ac_surf_config *config,
unsigned blk_w, unsigned blk_h, unsigned bpp,
unsigned in_width, unsigned in_height,
unsigned align_width, unsigned align_height,
unsigned align_depth)
{
assert(bpp);
unsigned num_samples = MAX2(1, config->info.samples);
unsigned bpe = bpp / 8;
unsigned width = align(in_width, align_width * blk_w);
unsigned height = align(in_height , align_height * blk_h);
unsigned depth = align(config->is_3d ? config->info.depth :
config->is_cube ? 6 : config->info.array_size, align_depth);
unsigned tile_size_bytes = align_width * align_height * align_depth * num_samples * bpe;
if (config->info.levels > 1 && align_height > 1) {
width = util_next_power_of_two(width);
height = util_next_power_of_two(height);
}
uint64_t size = 0;
/* Note: This mipmap size computation is inaccurate. */
for (unsigned i = 0; i < config->info.levels; i++) {
uint64_t level_size =
(uint64_t)DIV_ROUND_UP(width, blk_w) * DIV_ROUND_UP(height, blk_h) * depth *
num_samples * bpe;
size += level_size;
if (tile_size_bytes >= 4096 && level_size <= tile_size_bytes / 2) {
/* We are likely in the mip tail, return. */
assert(size);
return size;
}
/* Minify the level. */
width = u_minify(width, 1);
height = u_minify(height, 1);
if (config->is_3d)
depth = u_minify(depth, 1);
}
/* TODO: check that this is not too different from the correct value */
assert(size);
return size;
}
#define SI__GB_TILE_MODE__BANK_WIDTH(x) (((x) >> 14) & 0x3)
#define SI__GB_TILE_MODE__BANK_HEIGHT(x) (((x) >> 16) & 0x3)
#define SI__GB_TILE_MODE__MACRO_TILE_ASPECT(x) (((x) >> 18) & 0x3)
#define SI__GB_TILE_MODE__NUM_BANKS(x) (((x) >> 20) & 0x3)
/**
* Fill in the tiling information in \p surf based on the given surface config.
*
@@ -1255,11 +1312,100 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib, const struct radeon_info *i
}
} else {
if (config->is_3d) {
/* GFX6 doesn't have 3D_TILED_XTHICK. */
if (info->gfx_level >= GFX7)
AddrSurfInfoIn.tileMode = ADDR_TM_3D_TILED_XTHICK;
else
AddrSurfInfoIn.tileMode = ADDR_TM_2D_TILED_XTHICK;
/* Select the best tile mode that doesn't overallocate memory too much.
* The tile modes below are sorted from best to worst performance.
*/
struct {
unsigned tile_mode;
unsigned gfx6_tile_mode_index;
unsigned gfx7_tile_mode_index;
unsigned microtile_width;
unsigned microtile_height;
unsigned microtile_depth;
bool supported; /* this comes from the tile mode arrays in the kernel */
/* Derived fields. */
unsigned bank_width;
unsigned bank_height;
unsigned num_banks;
unsigned macro_tile_aspect;
unsigned align_width;
unsigned align_height;
unsigned align_depth;
} modes[] = {
{ADDR_TM_3D_TILED_XTHICK, 0, 26, 8, 8, 8, info->gfx_level >= GFX7},
{ADDR_TM_2D_TILED_XTHICK, 19, 25, 8, 8, 8, true},
{ADDR_TM_3D_TILED_THICK, 0, 21, 8, 8, 4, info->gfx_level >= GFX7},
{ADDR_TM_2D_TILED_THICK, 20, 20, 8, 8, 4, true},
{ADDR_TM_3D_TILED_THIN1, 0, 15, 8, 8, 1, info->gfx_level >= GFX7},
{ADDR_TM_2D_TILED_THIN1, 14, 14, 8, 8, 1, true},
{ADDR_TM_1D_TILED_THICK, 18, 19, 8, 8, 4, true},
{ADDR_TM_1D_TILED_THIN1, 13, 13, 8, 8, 1, true},
/* Don't use LINEAR_ALIGNED. It doesn't work with BC formats. */
};
for (unsigned i = 0; i < ARRAY_SIZE(modes); i++) {
if (!modes[i].supported)
continue;
if (modes[i].tile_mode <= ADDR_TM_1D_TILED_THICK) {
modes[i].align_width = modes[i].microtile_width;
modes[i].align_height = modes[i].microtile_height;
modes[i].align_depth = modes[i].microtile_depth;
continue;
}
if (info->gfx_level >= GFX7) {
ADDR_GET_MACROMODEINDEX_INPUT in = {sizeof(in)};
ADDR_GET_MACROMODEINDEX_OUTPUT out = {sizeof(out)};
in.tileIndex = modes[i].gfx7_tile_mode_index;
in.bpp = surf->bpe * 8;
in.numFrags = 1;
if (AddrGetMacroModeIndex(addrlib, &in, &out) != ADDR_OK) {
fprintf(stderr, "amdgpu: AddrGetMacroModeIndex failed.\n");
return -1;
}
uint32_t macro_mode_reg = info->cik_macrotile_mode_array[out.macroModeIndex];
modes[i].bank_width = 1 << G_009990_BANK_WIDTH(macro_mode_reg);
modes[i].bank_height = 1 << G_009990_BANK_HEIGHT(macro_mode_reg);
modes[i].num_banks = 2 << G_009990_NUM_BANKS(macro_mode_reg);
modes[i].macro_tile_aspect = 1 << G_009990_MACRO_TILE_ASPECT(macro_mode_reg);
} else {
/* GFX6. */
uint32_t tile_mode_reg = info->si_tile_mode_array[modes[i].gfx6_tile_mode_index];
modes[i].bank_width = 1 << SI__GB_TILE_MODE__BANK_WIDTH(tile_mode_reg);
modes[i].bank_height = 1 << SI__GB_TILE_MODE__BANK_HEIGHT(tile_mode_reg);
modes[i].num_banks = 2 << SI__GB_TILE_MODE__NUM_BANKS(tile_mode_reg);
modes[i].macro_tile_aspect = 1 << SI__GB_TILE_MODE__MACRO_TILE_ASPECT(tile_mode_reg);
}
modes[i].align_width = modes[i].microtile_width * modes[i].bank_width *
info->num_tile_pipes * modes[i].macro_tile_aspect;
modes[i].align_height = modes[i].microtile_height * modes[i].bank_height *
modes[i].num_banks / modes[i].macro_tile_aspect;
modes[i].align_depth = modes[i].microtile_depth;
}
uint64_t ideal_size = ac_estimate_size(config, surf->blk_w, surf->blk_h, surf->bpe * 8,
config->info.width, config->info.height, 1, 1, 1);
AddrSurfInfoIn.tileMode = ADDR_TM_1D_TILED_THIN1; /* used if everything else fails */
for (unsigned i = 0; i < ARRAY_SIZE(modes); i++) {
if (!modes[i].supported)
continue;
uint64_t size = ac_estimate_size(config, surf->blk_w, surf->blk_h, surf->bpe * 8,
config->info.width, config->info.height,
modes[i].align_width, modes[i].align_height,
modes[i].align_depth);
if (size <= ideal_size * 3) {
AddrSurfInfoIn.tileMode = modes[i].tile_mode;
break;
}
}
} else {
AddrSurfInfoIn.tileMode = ADDR_TM_2D_TILED_THIN1;
}
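
The hunk above picks a 3D tile mode by walking a list sorted from best to worst performance and taking the first supported entry whose estimated size is at most three times the ideal size, falling back to ADDR_TM_1D_TILED_THIN1 otherwise. A minimal sketch of that selection loop follows. The struct mode_candidate type, the select_mode function and the integer ids are hypothetical stand-ins, and est_size stands in for the value that ac_estimate_size would return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified candidate record (not the struct from the diff). */
struct mode_candidate {
   int id;            /* stand-in for an ADDR_TM_* value */
   int supported;     /* non-zero if the mode is usable on this chip */
   uint64_t est_size; /* stand-in for the ac_estimate_size() result */
};

/* Return the id of the first supported candidate whose estimated size does
 * not exceed ideal_size * max_ratio; return fallback_id if none qualifies.
 * The candidates are assumed to be sorted from best to worst performance. */
static int select_mode(const struct mode_candidate *modes, size_t count,
                       uint64_t ideal_size, unsigned max_ratio, int fallback_id)
{
   for (size_t i = 0; i < count; i++) {
      if (!modes[i].supported)
         continue;
      if (modes[i].est_size <= ideal_size * max_ratio)
         return modes[i].id;
   }
   return fallback_id;
}
```

With max_ratio set to 3 this mirrors the `size <= ideal_size * 3` test in the hunk; the threshold is a parameter here only to make the sketch easier to exercise.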
@@ -2709,57 +2855,12 @@ static int gfx9_compute_surface(struct ac_addrlib *addrlib, const struct radeon_
return 0;
}
static uint64_t gfx12_estimate_size(const ADDR3_COMPUTE_SURFACE_INFO_INPUT *in,
const struct radeon_surf *surf,
unsigned align_width, unsigned align_height,
unsigned align_depth)
{
unsigned blk_w = surf ? surf->blk_w : 1;
unsigned blk_h = surf ? surf->blk_h : 1;
unsigned bpe = in->bpp ? in->bpp / 8 : surf->bpe;
unsigned width = align(in->width, align_width * blk_w);
unsigned height = align(in->height, align_height * blk_h);
unsigned depth = align(in->numSlices, align_depth);
unsigned tile_size = align_width * align_height * align_depth *
in->numSamples * bpe;
if (in->numMipLevels > 1 && align_height > 1) {
width = util_next_power_of_two(width);
height = util_next_power_of_two(height);
}
uint64_t size = 0;
/* Note: This mipmap size computation is inaccurate. */
for (unsigned i = 0; i < in->numMipLevels; i++) {
uint64_t level_size =
(uint64_t)DIV_ROUND_UP(width, blk_w) * DIV_ROUND_UP(height, blk_h) * depth *
in->numSamples * bpe;
size += level_size;
if (tile_size >= 4096 && level_size <= tile_size / 2) {
/* We are likely in the mip tail, return. */
assert(size);
return size;
}
/* Minify the level. */
width = u_minify(width, 1);
height = u_minify(height, 1);
if (in->resourceType == ADDR_RSRC_TEX_3D)
depth = u_minify(depth, 1);
}
/* TODO: check that this is not too different from the correct value */
assert(size);
return size;
}
static unsigned gfx12_select_swizzle_mode(struct ac_addrlib *addrlib,
const struct radeon_info *info,
const struct ac_surf_config *config,
const struct radeon_surf *surf,
const ADDR3_COMPUTE_SURFACE_INFO_INPUT *in)
const ADDR3_COMPUTE_SURFACE_INFO_INPUT *in,
uint64_t flags)
{
ADDR3_GET_POSSIBLE_SWIZZLE_MODE_INPUT get_in = {0};
ADDR3_GET_POSSIBLE_SWIZZLE_MODE_OUTPUT get_out = {0};
@@ -2776,9 +2877,9 @@ static unsigned gfx12_select_swizzle_mode(struct ac_addrlib *addrlib,
get_in.numMipLevels = in->numMipLevels;
get_in.numSamples = in->numSamples;
if (surf && surf->flags & RADEON_SURF_PREFER_4K_ALIGNMENT) {
if (flags & RADEON_SURF_PREFER_4K_ALIGNMENT) {
get_in.maxAlign = 4 * 1024;
} else if (surf && surf->flags & RADEON_SURF_PREFER_64K_ALIGNMENT) {
} else if (flags & RADEON_SURF_PREFER_64K_ALIGNMENT) {
get_in.maxAlign = 64 * 1024;
} else {
get_in.maxAlign = info->has_dedicated_vram ? (256 * 1024) : (64 * 1024);
@@ -2795,10 +2896,11 @@ static unsigned gfx12_select_swizzle_mode(struct ac_addrlib *addrlib,
assert(get_out.validModes.value);
unsigned bpe = in->bpp ? in->bpp / 8 : surf->bpe;
unsigned log_bpp = util_logbase2(bpe);
unsigned log_bpp = util_logbase2(get_in.bpp / 8);
unsigned log_samples = util_logbase2(in->numSamples);
uint64_t ideal_size = gfx12_estimate_size(in, surf, 1, 1, 1);
unsigned blk_w = surf ? surf->blk_w : 1;
unsigned blk_h = surf ? surf->blk_h : 1;
uint64_t ideal_size = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height, 1, 1, 1);
if (in->resourceType == ADDR_RSRC_TEX_3D) {
static unsigned block3d_size_4K[5][3] = {
@@ -2823,17 +2925,20 @@ static unsigned gfx12_select_swizzle_mode(struct ac_addrlib *addrlib,
{16, 32, 32},
};
uint64_t size_4K = gfx12_estimate_size(in, surf, block3d_size_4K[log_bpp][0],
block3d_size_4K[log_bpp][1],
block3d_size_4K[log_bpp][2]);
uint64_t size_4K = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block3d_size_4K[log_bpp][0],
block3d_size_4K[log_bpp][1],
block3d_size_4K[log_bpp][2]);
uint64_t size_64K = gfx12_estimate_size(in, surf, block3d_size_64K[log_bpp][0],
block3d_size_64K[log_bpp][1],
block3d_size_64K[log_bpp][2]);
uint64_t size_64K = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block3d_size_64K[log_bpp][0],
block3d_size_64K[log_bpp][1],
block3d_size_64K[log_bpp][2]);
uint64_t size_256K = gfx12_estimate_size(in, surf, block3d_size_256K[log_bpp][0],
block3d_size_256K[log_bpp][1],
block3d_size_256K[log_bpp][2]);;
uint64_t size_256K = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block3d_size_256K[log_bpp][0],
block3d_size_256K[log_bpp][1],
block3d_size_256K[log_bpp][2]);
float max_3d_overalloc_256K = 1.1;
float max_3d_overalloc_64K = 1.2;
@@ -2989,19 +3094,24 @@ static unsigned gfx12_select_swizzle_mode(struct ac_addrlib *addrlib,
},
};
uint64_t size_LINEAR = gfx12_estimate_size(in, surf, block_size_LINEAR[log_bpp], 1, 1);
uint64_t size_LINEAR = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block_size_LINEAR[log_bpp], 1, 1);
uint64_t size_256B = gfx12_estimate_size(in, surf, block_size_256B[log_samples][log_bpp][0],
block_size_256B[log_samples][log_bpp][1], 1);
uint64_t size_256B = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block_size_256B[log_samples][log_bpp][0],
block_size_256B[log_samples][log_bpp][1], 1);
uint64_t size_4K = gfx12_estimate_size(in, surf, block_size_4K[log_samples][log_bpp][0],
block_size_4K[log_samples][log_bpp][1], 1);;
uint64_t size_4K = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block_size_4K[log_samples][log_bpp][0],
block_size_4K[log_samples][log_bpp][1], 1);
uint64_t size_64K = gfx12_estimate_size(in, surf, block_size_64K[log_samples][log_bpp][0],
block_size_64K[log_samples][log_bpp][1], 1);
uint64_t size_64K = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block_size_64K[log_samples][log_bpp][0],
block_size_64K[log_samples][log_bpp][1], 1);
uint64_t size_256K = gfx12_estimate_size(in, surf, block_size_256K[log_samples][log_bpp][0],
block_size_256K[log_samples][log_bpp][1], 1);
uint64_t size_256K = ac_estimate_size(config, blk_w, blk_h, get_in.bpp, in->width, in->height,
block_size_256K[log_samples][log_bpp][0],
block_size_256K[log_samples][log_bpp][1], 1);
float max_2d_overalloc_256K = 1.1; /* relative to ideal */
float max_2d_overalloc_64K = 1.3; /* relative to ideal */
@@ -3032,6 +3142,7 @@ static unsigned gfx12_select_swizzle_mode(struct ac_addrlib *addrlib,
}
static bool gfx12_compute_hiz_his_info(struct ac_addrlib *addrlib, const struct radeon_info *info,
const struct ac_surf_config *config,
struct radeon_surf *surf, struct gfx12_hiz_his_layout *hizs,
const ADDR3_COMPUTE_SURFACE_INFO_INPUT *surf_in)
{
@@ -3059,7 +3170,7 @@ static bool gfx12_compute_hiz_his_info(struct ac_addrlib *addrlib, const struct
/* Compute the HiZ/HiS size. */
in.width = align(DIV_ROUND_UP(surf_in->width, 8), 2);
in.height = align(DIV_ROUND_UP(surf_in->height, 8), 2);
in.swizzleMode = gfx12_select_swizzle_mode(addrlib, info, NULL, &in);
in.swizzleMode = gfx12_select_swizzle_mode(addrlib, info, config, NULL, &in, surf->flags);
int ret = Addr3ComputeSurfaceInfo(addrlib->handle, &in, &out);
if (ret != ADDR_OK)
@@ -3112,7 +3223,7 @@ static bool gfx12_compute_miptree(struct ac_addrlib *addrlib, const struct radeo
surf->surf_size = surf->u.gfx9.zs.stencil_offset + out.surfSize;
if (info->chip_rev >= 2 &&
!gfx12_compute_hiz_his_info(addrlib, info, surf, &surf->u.gfx9.zs.his, in))
!gfx12_compute_hiz_his_info(addrlib, info, config, surf, &surf->u.gfx9.zs.his, in))
return false;
return true;
@@ -3175,7 +3286,7 @@ static bool gfx12_compute_miptree(struct ac_addrlib *addrlib, const struct radeo
if (in->flags.depth) {
assert(in->swizzleMode != ADDR3_LINEAR);
return gfx12_compute_hiz_his_info(addrlib, info, surf, &surf->u.gfx9.zs.hiz, in);
return gfx12_compute_hiz_his_info(addrlib, info, config, surf, &surf->u.gfx9.zs.hiz, in);
}
/* Compute tile swizzle for the color surface. All swizzle modes >= 4K support it. */
@@ -3261,7 +3372,8 @@ static bool gfx12_compute_surface(struct ac_addrlib *addrlib, const struct radeo
} else if (surf->flags & RADEON_SURF_VIDEO_REFERENCE) {
AddrSurfInfoIn.swizzleMode = ADDR3_256B_2D;
} else {
AddrSurfInfoIn.swizzleMode = gfx12_select_swizzle_mode(addrlib, info, surf, &AddrSurfInfoIn);
AddrSurfInfoIn.swizzleMode = gfx12_select_swizzle_mode(addrlib, info, config, surf,
&AddrSurfInfoIn, surf->flags);
}
/* Force the linear pitch from 128B (default) to 256B for multi-GPU interop. This only applies
@@ -3309,6 +3421,8 @@ static bool gfx12_compute_surface(struct ac_addrlib *addrlib, const struct radeo
/* Don't change the DCC settings for imported buffers - they might differ. */
!(surf->flags & RADEON_SURF_IMPORTED)) {
surf->u.gfx9.color.dcc.max_compressed_block_size = V_028C78_MAX_BLOCK_SIZE_256B;
if ((info->drm_minor < 63) && (surf->flags & RADEON_SURF_SCANOUT))
surf->u.gfx9.color.dcc.max_compressed_block_size = V_028C78_MAX_BLOCK_SIZE_128B;
}
}
@@ -3517,6 +3631,8 @@ void ac_surface_apply_bo_metadata(enum amd_gfx_level gfx_level, struct radeon_su
AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_DATA_FORMAT);
surf->u.gfx9.color.dcc_number_type =
AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_NUMBER_TYPE);
surf->u.gfx9.color.dcc_write_compress_disable =
AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_WRITE_COMPRESS_DISABLE);
scanout = AMDGPU_TILING_GET(tiling_flags, GFX12_SCANOUT);
} else if (gfx_level >= GFX9) {
surf->u.gfx9.swizzle_mode = AMDGPU_TILING_GET(tiling_flags, SWIZZLE_MODE);
@@ -3564,6 +3680,7 @@ void ac_surface_compute_bo_metadata(const struct radeon_info *info, struct radeo
surf->u.gfx9.color.dcc.max_compressed_block_size);
*tiling_flags |= AMDGPU_TILING_SET(GFX12_DCC_NUMBER_TYPE, surf->u.gfx9.color.dcc_number_type);
*tiling_flags |= AMDGPU_TILING_SET(GFX12_DCC_DATA_FORMAT, surf->u.gfx9.color.dcc_data_format);
*tiling_flags |= AMDGPU_TILING_SET(GFX12_DCC_WRITE_COMPRESS_DISABLE, surf->u.gfx9.color.dcc_write_compress_disable);
*tiling_flags |= AMDGPU_TILING_SET(GFX12_SCANOUT, (surf->flags & RADEON_SURF_SCANOUT) != 0);
} else if (info->gfx_level >= GFX9) {
uint64_t dcc_offset = 0;


@@ -275,6 +275,7 @@ struct gfx9_surf_layout {
*/
uint8_t dcc_number_type; /* CB_COLOR0_INFO.NUMBER_TYPE */
uint8_t dcc_data_format; /* [0:4]:CB_COLOR0_INFO.FORMAT, [5]:MM */
bool dcc_write_compress_disable;
/* Displayable DCC. This is always rb_aligned=0 and pipe_aligned=0.
* The 3D engine doesn't support that layout except for chips with 1 RB.


@@ -0,0 +1,33 @@
/**************************************************************************
*
* Copyright 2025 Advanced Micro Devices, Inc.
*
* SPDX-License-Identifier: MIT
*
**************************************************************************/
#include <stdint.h>
#include "ac_uvd_dec.h"
#include "util/os_time.h"
#include "util/detect_os.h"
#include "util/bitpack_helpers.h"
#if DETECT_OS_POSIX
#include <unistd.h>
#endif
void ac_uvd_init_stream_handle(struct ac_uvd_stream_handle *handle)
{
#if DETECT_OS_POSIX
handle->base = util_bitreverse(getpid() ^ os_time_get());
#else
handle->base = util_bitreverse(os_time_get());
#endif
handle->counter = 0;
}
unsigned ac_uvd_alloc_stream_handle(struct ac_uvd_stream_handle *handle)
{
return handle->base ^ ++handle->counter;
}
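
The new `ac_uvd_dec.c` above derives each stream handle by XOR-ing a per-process base with a pre-incremented counter. The following is a minimal standalone sketch of why that yields distinct handles. It assumes a hand-written `bitreverse32` in place of Mesa's `util_bitreverse`, and a caller-supplied seed in place of `getpid() ^ os_time_get()`. The names `sketch_stream_handle`, `sketch_init` and `sketch_alloc` are hypothetical and not part of the change.

```c
#include <stdint.h>

/* Hypothetical stand-in for util_bitreverse(): reverse the bits of a 32-bit word. */
uint32_t bitreverse32(uint32_t n)
{
   n = ((n >> 1) & 0x55555555u) | ((n & 0x55555555u) << 1);
   n = ((n >> 2) & 0x33333333u) | ((n & 0x33333333u) << 2);
   n = ((n >> 4) & 0x0f0f0f0fu) | ((n & 0x0f0f0f0fu) << 4);
   n = ((n >> 8) & 0x00ff00ffu) | ((n & 0x00ff00ffu) << 8);
   return (n >> 16) | (n << 16);
}

/* Mirrors the layout of struct ac_uvd_stream_handle from the diff. */
struct sketch_stream_handle {
   uint32_t base;
   uint32_t counter;
};

/* The seed replaces getpid() ^ os_time_get(). Reversing it moves the
 * fast-changing low bits of the seed into the high bits of the base. */
void sketch_init(struct sketch_stream_handle *handle, uint32_t seed)
{
   handle->base = bitreverse32(seed);
   handle->counter = 0;
}

/* XOR with a fixed base is a bijection, so distinct counter values
 * always produce distinct handles (until the counter wraps). */
uint32_t sketch_alloc(struct sketch_stream_handle *handle)
{
   return handle->base ^ ++handle->counter;
}
```

Because the counter occupies the low bits and the reversed seed varies mostly in the high bits, handles from processes started at slightly different times are unlikely to collide. This is an interpretation of the code, not a guarantee stated in the change.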


@@ -406,4 +406,12 @@ struct ruvd_msg {
} body;
};
struct ac_uvd_stream_handle {
uint32_t base;
uint32_t counter;
};
void ac_uvd_init_stream_handle(struct ac_uvd_stream_handle *handle);
unsigned ac_uvd_alloc_stream_handle(struct ac_uvd_stream_handle *handle);
#endif


@@ -412,7 +412,7 @@ radv_vcn_av1_film_grain_init_scaling(uint8_t scaling_points[][2], uint8_t num, s
}
void
ac_vcn_av1_init_film_grain_buffer(rvcn_dec_film_grain_params_t *fg_params, rvcn_dec_av1_fg_init_buf_t *fg_buf)
ac_vcn_av1_init_film_grain_buffer(unsigned av1_version, rvcn_dec_film_grain_params_t *fg_params, rvcn_dec_av1_fg_init_buf_t *fg_buf)
{
const int32_t luma_block_size_y = LUMA_BLOCK_SIZE_Y;
const int32_t luma_block_size_x = LUMA_BLOCK_SIZE_X;
@@ -542,24 +542,38 @@ ac_vcn_av1_init_film_grain_buffer(rvcn_dec_film_grain_params_t *fg_params, rvcn_
}
align_ptr = &fg_buf->luma_grain_block[0][0];
for (i = 0; i < 64; i++) {
for (j = 0; j < 80; j++)
*align_ptr++ = luma_grain_block_tmp[i][j];
if (((i + 1) % 4) == 0)
align_ptr += 64;
}
align_ptr0 = &fg_buf->cb_grain_block[0][0];
align_ptr1 = &fg_buf->cr_grain_block[0][0];
for (i = 0; i < 32; i++) {
for (j = 0; j < 40; j++) {
*align_ptr0++ = cb_grain_block_tmp[i][j];
*align_ptr1++ = cr_grain_block_tmp[i][j];
if (av1_version == RDECODE_AV1_VER_2) {
for (i = 0; i < 64; i++)
for (j = 0; j < 64; j++)
*align_ptr++ = luma_grain_block_tmp[i][j];
for (i = 0; i < 32; i++) {
for (j = 0; j < 32; j++) {
*align_ptr0++ = cb_grain_block_tmp[i][j];
*align_ptr1++ = cr_grain_block_tmp[i][j];
}
}
if (((i + 1) % 8) == 0) {
align_ptr0 += 64;
align_ptr1 += 64;
} else {
for (i = 0; i < 64; i++) {
for (j = 0; j < 80; j++)
*align_ptr++ = luma_grain_block_tmp[i][j];
if (((i + 1) % 4) == 0)
align_ptr += 64;
}
for (i = 0; i < 32; i++) {
for (j = 0; j < 40; j++) {
*align_ptr0++ = cb_grain_block_tmp[i][j];
*align_ptr1++ = cr_grain_block_tmp[i][j];
}
if (((i + 1) % 8) == 0) {
align_ptr0 += 64;
align_ptr1 += 64;
}
}
}


@@ -433,6 +433,7 @@
#define RDECODE_AV1_VER_0 0
#define RDECODE_AV1_VER_1 1
#define RDECODE_AV1_VER_2 2
typedef struct rvcn_decode_buffer_s {
unsigned int valid_buf_flag;
@@ -1216,6 +1217,6 @@ struct jpeg_params {
unsigned ac_vcn_dec_calc_ctx_size_av1(unsigned av1_version);
void ac_vcn_av1_init_probs(unsigned av1_version, uint8_t *prob);
void ac_vcn_av1_init_film_grain_buffer(rvcn_dec_film_grain_params_t *fg_params, rvcn_dec_av1_fg_init_buf_t *fg_buf);
void ac_vcn_av1_init_film_grain_buffer(unsigned av1_version, rvcn_dec_film_grain_params_t *fg_params, rvcn_dec_av1_fg_init_buf_t *fg_buf);
#endif


@@ -91,6 +91,7 @@ amd_common_files = files(
'ac_vcn_av1_default.h',
'ac_vcn_dec.c',
'ac_vcn_enc.c',
'ac_uvd_dec.c',
'nir/ac_nir.c',
'nir/ac_nir.h',
'nir/ac_nir_helpers.h',


@@ -513,6 +513,6 @@ static bool lower_image_opcodes(nir_builder *b, nir_instr *instr, void *data)
bool ac_nir_lower_image_opcodes(nir_shader *nir)
{
return nir_shader_instructions_pass(nir, lower_image_opcodes,
nir_metadata_control_flow,
nir_metadata_none,
NULL);
}


@@ -70,9 +70,6 @@ ac_nir_lower_legacy_vs(nir_shader *nir,
/* This should be after streamout and before exports. */
ac_nir_clamp_vertex_color_outputs(&b, &out);
/* This should be after streamout and before exports. */
ac_nir_clamp_vertex_color_outputs(&b, &out);
uint64_t export_outputs = nir->info.outputs_written | VARYING_BIT_POS;
if (kill_pointsize)
export_outputs &= ~VARYING_BIT_PSIZ;


@@ -831,9 +831,10 @@ hs_msg_group_vote_use_memory(nir_builder *b, lower_tess_io_state *st,
nir_pop_if(&top_b, thread0);
/* Insert a barrier to wait for initialization above if there hasn't been any other barrier
* in the shader.
* in the shader. If tcs_out_patch_fits_subgroup=true, then TCS barriers don't have a scope
* larger than a subgroup.
*/
if (!st->tcs_info.always_executes_barrier) {
if (!st->tcs_info.always_executes_barrier || st->tcs_out_patch_fits_subgroup) {
nir_barrier(b, .execution_scope = SCOPE_WORKGROUP, .memory_scope = SCOPE_WORKGROUP,
.memory_semantics = NIR_MEMORY_ACQ_REL, .memory_modes = nir_var_mem_shared);
}


@@ -376,11 +376,13 @@ A va_vdst=0 wait: `s_waitcnt_deptr 0x0fff`
### VALUMaskWriteHazard
Triggered by:
SALU writing then SALU or VALU reading a SGPR that was previously used as a lane mask for a VALU.
SALU or VALU writing then SALU or VALU reading a SGPR that was previously used as a lane mask for a
VALU when using wave64.
Mitigated by:
A VALU instruction reading a non-exec SGPR before the SALU write, or a sa_sdst=0 wait after the
SALU write: `s_waitcnt_depctr 0xfffe`
A VALU instruction reading a non-exec SGPR before the SGPR write, or a wait after the
write: `s_waitcnt_depctr 0xfffe` for SALU, `s_waitcnt_depctr 0xf1ff` for non-VCC VALU and
`s_waitcnt_depctr 0xfffd` for VCC VALU.
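
When several of these cases apply to one instruction, the individual immediates can be merged with a bitwise AND, which is how the hazard handler in this change builds a single wait. A minimal sketch, assuming the three immediates listed above. The enum and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of the instruction that wrote the lane-mask SGPR. */
enum mask_writer {
   WRITER_SALU,         /* wait on sa_sdst */
   WRITER_VALU_NON_VCC, /* wait on va_sdst */
   WRITER_VALU_VCC,     /* wait on va_vcc */
};

/* Immediate for a single case, taken from the mitigation text above. */
uint16_t depctr_mask(enum mask_writer writer)
{
   switch (writer) {
   case WRITER_SALU: return 0xfffe;
   case WRITER_VALU_NON_VCC: return 0xf1ff;
   case WRITER_VALU_VCC: return 0xfffd;
   }
   return 0xffff;
}

/* Start from 0xffff (no wait) and clear the bits of every case that applies. */
uint16_t combine_depctr(const enum mask_writer *writers, unsigned count)
{
   uint16_t imm = 0xffff;
   for (unsigned i = 0; i < count; i++)
      imm &= depctr_mask(writers[i]);
   return imm;
}

/* Usage: an SGPR written by SALU plus VCC written by VALU needs one combined wait. */
void example_usage(void)
{
   const enum mask_writer writers[] = {WRITER_SALU, WRITER_VALU_VCC};
   uint16_t imm = combine_depctr(writers, 2);
   assert(imm == 0xfffc);
}
```

A result of `0xffff` means no case applied, so no `s_waitcnt_depctr` needs to be emitted.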
## RDNA4 / GFX12 hazards


@@ -832,8 +832,8 @@ emit_mimg_instruction_gfx12(asm_context& ctx, std::vector<uint32_t>& out, const
uint8_t vaddr[5] = {0, 0, 0, 0, 0};
for (unsigned i = 3; i < instr->operands.size(); i++)
vaddr[i - 3] = reg(ctx, instr->operands[i], 8);
unsigned num_vaddr = instr->operands.size() - 3;
for (unsigned i = 0; i < MIN2(instr->operands.back().size() - 1, 5 - num_vaddr); i++)
int num_vaddr = instr->operands.size() - 3;
for (int i = 0; i < (int)MIN2(instr->operands.back().size() - 1, ARRAY_SIZE(vaddr) - num_vaddr); i++)
vaddr[num_vaddr + i] = reg(ctx, instr->operands.back(), 8) + i + 1;
encoding = 0;
@@ -1538,6 +1538,8 @@ chain_branches(asm_context& ctx, std::vector<uint32_t>& out, branch_info& branch
unsigned target = branch.target;
branch.target = new_block->index;
unsigned skip_branch_target = 0; /* Target of potentially inserted short jump. */
/* Find suitable insertion point:
* We define two offset ranges within which our new branch instruction should be placed.
* Then we try to maximize the distance from either the previous branch or the target.
@@ -1604,6 +1606,7 @@ chain_branches(asm_context& ctx, std::vector<uint32_t>& out, branch_info& branch
bld.reset(&ctx.program->blocks[insertion_block_idx].instructions, it);
} else {
bld.reset(&ctx.program->blocks[insertion_block_idx - 1].instructions);
skip_branch_target = insertion_block_idx;
}
/* Since we insert a branch into existing code, mitigate LdsBranchVmemWARHazard on GFX10. */
@@ -1623,6 +1626,11 @@ chain_branches(asm_context& ctx, std::vector<uint32_t>& out, branch_info& branch
insert_code(ctx, out, insert_at, code.size(), code.data());
new_block->offset = block_offset;
if (skip_branch_target) {
/* If we insert a short jump over the new branch at the end of a block,
* ensure that it gets updated accordingly after additional changes. */
ctx.branches.push_back({block_offset - 1, skip_branch_target});
}
ctx.branches.push_back({block_offset, target});
assert(out[ctx.branches.back().pos] == code.back());
}


@@ -258,6 +258,7 @@ struct NOP_ctx_gfx11 {
/* VALUMaskWriteHazard */
std::bitset<128> sgpr_read_by_valu_as_lanemask;
std::bitset<128> sgpr_read_by_valu_as_lanemask_then_wr_by_salu;
std::bitset<128> sgpr_read_by_valu_as_lanemask_then_wr_by_valu;
/* WMMAHazards */
std::bitset<256> vgpr_written_by_wmma;
@@ -280,6 +281,8 @@ struct NOP_ctx_gfx11 {
sgpr_read_by_valu_as_lanemask |= other.sgpr_read_by_valu_as_lanemask;
sgpr_read_by_valu_as_lanemask_then_wr_by_salu |=
other.sgpr_read_by_valu_as_lanemask_then_wr_by_salu;
sgpr_read_by_valu_as_lanemask_then_wr_by_valu |=
other.sgpr_read_by_valu_as_lanemask_then_wr_by_valu;
vgpr_written_by_wmma |= other.vgpr_written_by_wmma;
sgpr_read_by_valu |= other.sgpr_read_by_valu;
sgpr_read_by_valu_then_wr_by_valu |= other.sgpr_read_by_valu_then_wr_by_valu;
@@ -299,6 +302,8 @@ struct NOP_ctx_gfx11 {
sgpr_read_by_valu_as_lanemask == other.sgpr_read_by_valu_as_lanemask &&
sgpr_read_by_valu_as_lanemask_then_wr_by_salu ==
other.sgpr_read_by_valu_as_lanemask_then_wr_by_salu &&
sgpr_read_by_valu_as_lanemask_then_wr_by_valu ==
other.sgpr_read_by_valu_as_lanemask_then_wr_by_valu &&
vgpr_written_by_wmma == other.vgpr_written_by_wmma &&
sgpr_read_by_valu == other.sgpr_read_by_valu &&
sgpr_read_by_valu_then_wr_by_salu == other.sgpr_read_by_valu_then_wr_by_salu;
@@ -798,24 +803,6 @@ check_written_regs(const aco_ptr<Instruction>& instr, const std::bitset<N>& chec
});
}
template <std::size_t N>
bool
check_read_regs(const aco_ptr<Instruction>& instr, const std::bitset<N>& check_regs)
{
return std::any_of(instr->operands.begin(), instr->operands.end(),
[&check_regs](const Operand& op) -> bool
{
if (op.isConstant())
return false;
bool writes_any = false;
for (unsigned i = 0; i < op.size(); i++) {
unsigned op_reg = op.physReg() + i;
writes_any |= op_reg < check_regs.size() && check_regs[op_reg];
}
return writes_any;
});
}
template <std::size_t N>
void
mark_read_regs(const aco_ptr<Instruction>& instr, std::bitset<N>& reg_reads)
@@ -1464,23 +1451,62 @@ handle_instruction_gfx11(State& state, NOP_ctx_gfx11& ctx, aco_ptr<Instruction>&
if (state.program->gfx_level < GFX12) {
/* VALUMaskWriteHazard
* VALU reads SGPR as a lane mask and later written by SALU cannot safely be read by SALU or
* VALU.
* VALU reads SGPR as a lane mask and later written by SALU or VALU cannot safely be read by
* SALU or VALU.
*/
if (state.program->wave_size == 64 && (instr->isSALU() || instr->isVALU()) &&
check_read_regs(instr, ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_salu)) {
bld.sopp(aco_opcode::s_waitcnt_depctr, 0xfffe);
sa_sdst = 0;
if (state.program->wave_size == 64 && (instr->isSALU() || instr->isVALU())) {
uint16_t imm = 0xffff;
for (Operand op : instr->operands) {
if (op.physReg() >= state.program->dev.sgpr_limit)
continue;
for (unsigned i = 0; i < op.size(); i++) {
unsigned reg = op.physReg() + i;
/* s_waitcnt_depctr on sa_sdst */
if (ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_salu[reg]) {
imm &= 0xfffe;
sa_sdst = 0;
}
/* s_waitcnt_depctr on va_sdst (if non-VCC SGPR) or va_vcc (if VCC SGPR) */
if (ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[reg]) {
bool is_vcc = reg == vcc || reg == vcc_hi;
imm &= is_vcc ? 0xfffd : 0xf1ff;
if (is_vcc)
wait.va_vcc = 0;
else
wait.va_sdst = 0;
}
}
}
if (imm != 0xffff)
bld.sopp(aco_opcode::s_waitcnt_depctr, imm);
}
if (va_vdst == 0) {
ctx.valu_since_wr_by_trans.reset();
ctx.trans_since_wr_by_trans.reset();
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu.reset();
}
if (sa_sdst == 0)
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_salu.reset();
if (wait.va_sdst == 0) {
std::bitset<128> old = ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu;
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu.reset();
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc] = old[vcc];
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc_hi] = old[vcc_hi];
}
if (wait.va_vcc == 0) {
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc] = false;
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc_hi] = false;
}
if (state.program->wave_size == 64 && instr->isSALU() &&
check_written_regs(instr, ctx.sgpr_read_by_valu_as_lanemask)) {
unsigned reg = instr->definitions[0].physReg().reg();
@@ -1511,6 +1537,15 @@ handle_instruction_gfx11(State& state, NOP_ctx_gfx11& ctx, aco_ptr<Instruction>&
if (!op.isConstant() && op.physReg().reg() < 126)
ctx.sgpr_read_by_valu_as_lanemask.reset();
}
if (!instr->definitions.empty() &&
instr->definitions.back().getTemp().type() == RegType::sgpr &&
check_written_regs(instr, ctx.sgpr_read_by_valu_as_lanemask)) {
unsigned reg = instr->definitions.back().physReg().reg();
for (unsigned i = 0; i < instr->definitions.back().size(); i++)
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[reg + i] = 1;
}
switch (instr->opcode) {
case aco_opcode::v_addc_co_u32:
case aco_opcode::v_subb_co_u32:
@@ -1745,6 +1780,16 @@ resolve_all_gfx11(State& state, NOP_ctx_gfx11& ctx,
waitcnt_depctr &= 0xfffe;
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_salu.reset();
}
if (ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc] ||
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc_hi]) {
waitcnt_depctr &= 0xfffd;
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc] = false;
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[vcc_hi] = false;
}
if (ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu.any()) {
waitcnt_depctr &= 0xf1ff;
ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu.reset();
}
if (ctx.sgpr_read_by_valu_as_lanemask.any()) {
valu_read_sgpr = true;
ctx.sgpr_read_by_valu_as_lanemask.reset();


@@ -300,6 +300,21 @@ add_coupling_code(exec_ctx& ctx, Block* block, std::vector<aco_ptr<Instruction>>
} else if (preds.size() == 1) {
ctx.info[idx].exec = ctx.info[preds[0]].exec;
/* After continue and break blocks, we implicitly set exec to zero.
* This is so that parallelcopies can be inserted before the branch
* without being affected by the changed exec mask.
*/
if (ctx.info[idx].exec.back().op.constantEquals(0)) {
assert(block->logical_succs.empty());
/* Check whether the successor block already restores exec. */
uint16_t block_kind = ctx.program->blocks[block->linear_succs[0]].kind;
if (!(block_kind & (block_kind_loop_header | block_kind_loop_exit | block_kind_invert |
block_kind_merge))) {
/* The successor does not restore exec. */
restore_exec = true;
}
}
} else {
assert(preds.size() == 2);
assert(ctx.info[preds[0]].exec.size() == ctx.info[preds[1]].exec.size());
@@ -627,15 +642,14 @@ add_branch_code(exec_ctx& ctx, Block* block)
assert(block->instructions.back()->opcode == aco_opcode::p_branch);
block->instructions.pop_back();
bool need_parallelcopy = false;
while (!(ctx.info[idx].exec.back().type & mask_type_loop)) {
while (!(ctx.info[idx].exec.back().type & mask_type_loop))
ctx.info[idx].exec.pop_back();
need_parallelcopy = true;
}
if (need_parallelcopy)
bld.copy(Definition(exec, bld.lm), ctx.info[idx].exec.back().op);
bld.branch(aco_opcode::p_cbranch_nz, Operand(exec, bld.lm), block->linear_succs[1],
Temp cond = bld.sop2(Builder::s_or, bld.def(bld.lm), bld.def(s1, scc),
ctx.info[idx].exec.back().op, Operand::zero(bld.lm.bytes()))
.def(1)
.getTemp();
bld.branch(aco_opcode::p_cbranch_nz, Operand(cond, scc), block->linear_succs[1],
block->linear_succs[0]);
} else if (block->kind & block_kind_uniform) {
Pseudo_branch_instruction& branch = block->instructions.back()->branch();
@@ -703,14 +717,8 @@ add_branch_code(exec_ctx& ctx, Block* block)
break;
}
/* check if the successor is the merge block, otherwise set exec to 0 */
// TODO: this could be done better by directly branching to the merge block
unsigned succ_idx = ctx.program->blocks[block->linear_succs[1]].linear_succs[0];
Block& succ = ctx.program->blocks[succ_idx];
if (!(succ.kind & block_kind_invert || succ.kind & block_kind_merge)) {
bld.copy(Definition(exec, bld.lm), Operand::zero(bld.lm.bytes()));
}
/* Implicitly set exec to zero and branch. */
ctx.info[idx].exec.back().op = Operand::zero(bld.lm.bytes());
bld.branch(aco_opcode::p_cbranch_nz, bld.scc(cond), block->linear_succs[1],
block->linear_succs[0]);
} else if (block->kind & block_kind_continue) {
@@ -729,14 +737,8 @@ add_branch_code(exec_ctx& ctx, Block* block)
}
assert(cond != Temp());
/* check if the successor is the merge block, otherwise set exec to 0 */
// TODO: this could be done better by directly branching to the merge block
unsigned succ_idx = ctx.program->blocks[block->linear_succs[1]].linear_succs[0];
Block& succ = ctx.program->blocks[succ_idx];
if (!(succ.kind & block_kind_invert || succ.kind & block_kind_merge)) {
bld.copy(Definition(exec, bld.lm), Operand::zero(bld.lm.bytes()));
}
/* Implicitly set exec to zero and branch. */
ctx.info[idx].exec.back().op = Operand::zero(bld.lm.bytes());
bld.branch(aco_opcode::p_cbranch_nz, bld.scc(cond), block->linear_succs[1],
block->linear_succs[0]);
} else {


@@ -287,8 +287,13 @@ check_instr(wait_ctx& ctx, wait_imm& wait, Instruction* instr)
if (vmem_type && ctx.gfx_level < GFX12) {
wait_event event = get_vmem_event(ctx, instr, vmem_type);
wait_type type = (wait_type)(ffs(ctx.info->get_counters_for_event(event)) - 1);
if ((it->second.events & ctx.info->events[type]) == event &&
(type != wait_type_vm || it->second.vmem_types == vmem_type))
bool event_matches = (it->second.events & ctx.info->events[type]) == event;
/* wait_type_vm/counter_vm can have several different vmem_types */
bool type_matches = type != wait_type_vm || (it->second.vmem_types == vmem_type &&
util_bitcount(vmem_type) == 1);
if (event_matches && type_matches)
reg_imm[type] = wait_imm::unset_counter;
}
@@ -319,9 +324,9 @@ perform_barrier(wait_ctx& ctx, wait_imm& imm, memory_sync_info sync, unsigned se
if (bar_scope_lds <= subgroup_scope)
events &= ~event_lds;
/* Until GFX12, in non-WGP, the L1 (L0 on GFX10+) cache keeps all memory operations
/* Until GFX11, in non-WGP, the L1 (L0 on GFX10+) cache keeps all memory operations
* in-order for the same workgroup */
if (ctx.gfx_level < GFX12 && !ctx.program->wgp_mode && sync.scope <= scope_workgroup)
if (ctx.gfx_level < GFX11 && !ctx.program->wgp_mode && sync.scope <= scope_workgroup)
events &= ~(event_vmem | event_vmem_store | event_smem);
if (events)


@@ -7391,7 +7391,9 @@ Temp
get_scratch_resource(isel_context* ctx)
{
Builder bld(ctx->program, ctx->block);
Temp scratch_addr = ctx->program->private_segment_buffer;
Temp scratch_addr;
if (!ctx->program->private_segment_buffers.empty())
scratch_addr = ctx->program->private_segment_buffers.back();
if (!scratch_addr.bytes()) {
Temp addr_lo =
bld.sop1(aco_opcode::p_load_symbol, bld.def(s1), Operand::c32(aco_symbol_scratch_addr_lo));
@@ -7449,7 +7451,7 @@ visit_load_scratch(isel_context* ctx, nir_intrinsic_instr* instr)
} else {
info.resource = get_scratch_resource(ctx);
info.offset = Operand(as_vgpr(ctx, get_ssa_temp(ctx, instr->src[0].ssa)));
info.soffset = ctx->program->scratch_offset;
info.soffset = ctx->program->scratch_offsets.back();
emit_load(ctx, bld, info, scratch_mubuf_load_params);
}
}
@@ -7505,7 +7507,7 @@ visit_store_scratch(isel_context* ctx, nir_intrinsic_instr* instr)
offset = as_vgpr(ctx, offset);
for (unsigned i = 0; i < write_count; i++) {
aco_opcode op = get_buffer_store_op(write_datas[i].bytes());
Instruction* mubuf = bld.mubuf(op, rsrc, offset, ctx->program->scratch_offset,
Instruction* mubuf = bld.mubuf(op, rsrc, offset, ctx->program->scratch_offsets.back(),
write_datas[i], offsets[i], true);
mubuf->mubuf().sync = memory_sync_info(storage_scratch, semantic_private);
unsigned access = ACCESS_TYPE_STORE | ACCESS_IS_SWIZZLED_AMD |
@@ -7932,7 +7934,7 @@ visit_cmat_muladd(isel_context* ctx, nir_intrinsic_instr* instr)
Operand B(as_vgpr(ctx, get_ssa_temp(ctx, instr->src[1].ssa)));
Operand C(as_vgpr(ctx, get_ssa_temp(ctx, instr->src[2].ssa)));
VALU_instruction& vop3p = bld.vop3p(opcode, Definition(dst), A, B, C, 0, 0)->valu();
VALU_instruction& vop3p = bld.vop3p(opcode, Definition(dst), A, B, C, 0, 0x7)->valu();
vop3p.neg_lo[0] = (signed_mask & 0x1) != 0;
vop3p.neg_lo[1] = (signed_mask & 0x2) != 0;
vop3p.clamp = clamp;
@@ -8082,24 +8084,8 @@ visit_intrinsic(isel_context* ctx, nir_intrinsic_instr* instr)
aco_opcode subrev =
instr->def.bit_size == 16 ? aco_opcode::v_subrev_f16 : aco_opcode::v_subrev_f32;
/* v_interp with constant sources only works on GFX11/11.5,
* and it's only faster on GFX11.5.
*/
bool use_interp = dpp_ctrl1 == dpp_quad_perm(0, 0, 0, 0) && instr->def.bit_size == 32 &&
ctx->program->gfx_level == GFX11_5;
if (!nir_src_is_divergent(&instr->src[0])) {
bld.vop2(subrev, Definition(dst), src, src);
} else if (use_interp && dpp_ctrl2 == dpp_quad_perm(1, 1, 1, 1)) {
bld.vinterp_inreg(aco_opcode::v_interp_p10_f32_inreg, Definition(dst), src,
Operand::c32(0x3f800000), src)
->valu()
.neg[2] = true;
} else if (use_interp && dpp_ctrl2 == dpp_quad_perm(2, 2, 2, 2)) {
Builder::Result tmp = bld.vinterp_inreg(aco_opcode::v_interp_p10_f32_inreg, bld.def(v1),
Operand::c32(0), Operand::c32(0), src);
tmp->valu().neg = 0x6;
bld.vinterp_inreg(aco_opcode::v_interp_p2_f32_inreg, Definition(dst), src,
Operand::c32(0x3f800000), tmp);
} else if (ctx->program->gfx_level >= GFX8 && dpp_ctrl2 == dpp_quad_perm(0, 1, 2, 3)) {
bld.vop2_dpp(subrev, Definition(dst), src, src, dpp_ctrl1);
} else if (ctx->program->gfx_level >= GFX8) {
@@ -8662,6 +8648,11 @@ visit_intrinsic(isel_context* ctx, nir_intrinsic_instr* instr)
if (ctx->shader->info.maximally_reconverges)
ctx->program->needs_wqm = true;
if (ctx->block->loop_nest_depth || ctx->cf_info.parent_if.is_divergent) {
ctx->cf_info.exec.potentially_empty_discard = true;
begin_empty_exec_skip(ctx, &instr->instr, instr->instr.block);
}
break;
}
case nir_intrinsic_terminate:
@@ -10940,9 +10931,9 @@ add_startpgm(struct isel_context* ctx)
* handling spilling.
*/
if (ctx->args->ring_offsets.used)
ctx->program->private_segment_buffer = get_arg(ctx, ctx->args->ring_offsets);
ctx->program->private_segment_buffers.push_back(get_arg(ctx, ctx->args->ring_offsets));
ctx->program->scratch_offset = get_arg(ctx, ctx->args->scratch_offset);
ctx->program->scratch_offsets.push_back(get_arg(ctx, ctx->args->scratch_offset));
} else if (ctx->program->gfx_level <= GFX10_3 && ctx->program->stage != raytracing_cs) {
/* Manually initialize scratch. For RT stages scratch initialization is done in the prolog.
*/


@@ -75,6 +75,7 @@ init_program(Program* program, Stage stage, const struct aco_shader_info* info,
case GFX10: program->family = CHIP_NAVI10; break;
case GFX10_3: program->family = CHIP_NAVI21; break;
case GFX11: program->family = CHIP_NAVI31; break;
case GFX11_5: program->family = CHIP_GFX1150; break;
case GFX12: program->family = CHIP_GFX1200; break;
default: program->family = CHIP_UNKNOWN; break;
}
@@ -151,7 +152,9 @@ init_program(Program* program, Stage stage, const struct aco_shader_info* info,
default: break;
}
program->dev.sram_ecc_enabled = program->family == CHIP_MI100;
program->dev.sram_ecc_enabled = program->family == CHIP_VEGA20 ||
program->family == CHIP_MI100 || program->family == CHIP_MI200 ||
program->family == CHIP_GFX940;
/* apparently gfx702 also has fast v_fma_f32 but I can't find a family for that */
program->dev.has_fast_fma32 = program->gfx_level >= GFX9;
if (program->family == CHIP_TAHITI || program->family == CHIP_CARRIZO ||
@@ -1430,15 +1433,20 @@ get_op_fixed_to_def(Instruction* instr)
uint8_t
get_vmem_type(enum amd_gfx_level gfx_level, Instruction* instr)
{
if (instr->opcode == aco_opcode::image_bvh64_intersect_ray)
if (instr->opcode == aco_opcode::image_bvh64_intersect_ray) {
return vmem_bvh;
else if (gfx_level >= GFX12 && instr->opcode == aco_opcode::image_msaa_load)
} else if (gfx_level >= GFX12 && instr->opcode == aco_opcode::image_msaa_load) {
return vmem_sampler;
else if (instr->isMIMG() && !instr->operands[1].isUndefined() &&
instr->operands[1].regClass() == s4)
return vmem_sampler;
else if (instr->isVMEM() || instr->isScratch() || instr->isGlobal())
} else if (instr->isMIMG() && !instr->operands[1].isUndefined() &&
instr->operands[1].regClass() == s4) {
bool point_sample_accel =
gfx_level == GFX11_5 && (instr->opcode == aco_opcode::image_sample ||
instr->opcode == aco_opcode::image_sample_l ||
instr->opcode == aco_opcode::image_sample_lz);
return vmem_sampler | (point_sample_accel ? vmem_nosampler : 0);
} else if (instr->isVMEM() || instr->isScratch() || instr->isGlobal()) {
return vmem_nosampler;
}
return 0;
}


@@ -2130,8 +2130,9 @@ public:
std::vector<ac_shader_debug_info> debug_info;
std::vector<uint8_t> constant_data;
Temp private_segment_buffer;
Temp scratch_offset;
/* Private segment buffers and scratch offsets. One entry per start/resume block */
aco::small_vec<Temp, 2> private_segment_buffers;
aco::small_vec<Temp, 2> scratch_offsets;
uint16_t num_waves = 0;
uint16_t min_waves = 0;


@@ -3096,6 +3096,9 @@ apply_omod_clamp(opt_ctx& ctx, aco_ptr<Instruction>& instr)
if (needs_vop3 && !can_vop3)
return false;
if (instr_info.classes[(int)instr->opcode] == instr_class::valu_pseudo_scalar_trans)
return false;
/* SDWA omod is GFX9+. */
bool can_use_omod = (can_vop3 || ctx.program->gfx_level >= GFX9) && !instr->isVOP3P() &&
(!instr->isVINTERP_INREG() || interp_can_become_fma(ctx, instr));


@@ -253,6 +253,12 @@ struct DefInfo {
if (imageGather4D16Bug)
bounds.size -= MAX2(rc.bytes() / 4 - ctx.num_linear_vgprs, 0);
} else if (instr_info.classes[(int)instr->opcode] == instr_class::valu_pseudo_scalar_trans) {
/* RDNA4 ISA doc, 7.10. Pseudo-scalar Transcendental ALU ops:
* - VCC may not be used as a destination
*/
if (bounds.contains(vcc))
bounds.size = vcc - bounds.lo();
}
if (!data_stride)
@@ -1274,7 +1280,7 @@ get_reg_impl(ra_ctx& ctx, const RegisterFile& reg_file,
RegClass rc = info.rc;
/* check how many free regs we have */
unsigned regs_free = reg_file.count_zero(bounds);
unsigned regs_free = reg_file.count_zero(get_reg_bounds(ctx, rc));
/* mark and count killed operands */
unsigned killed_ops = 0;
@@ -1427,6 +1433,14 @@ get_reg_specified(ra_ctx& ctx, const RegisterFile& reg_file, RegClass rc,
if (!info.bounds.contains(reg_win) && !is_vcc && !is_m0)
return false;
if (instr_info.classes[(int)instr->opcode] == instr_class::valu_pseudo_scalar_trans) {
/* RDNA4 ISA doc, 7.10. Pseudo-scalar Transcendental ALU ops:
* - VCC may not be used as a destination
*/
if (vcc_win.contains(reg_win))
return false;
}
if (reg_file.test(reg, info.rc.bytes()))
return false;
@@ -1835,7 +1849,7 @@ get_reg(ra_ctx& ctx, const RegisterFile& reg_file, Temp temp,
/* We should only fail here because keeping under the limit would require
* too many moves. */
assert(reg_file.count_zero(info.bounds) >= info.size);
assert(reg_file.count_zero(get_reg_bounds(ctx, info.rc)) >= info.size);
/* try using more registers */
if (!increase_register_file(ctx, info.rc)) {


@@ -69,10 +69,14 @@ reindex_program(idx_ctx& ctx, Program* program)
}
/* update program members */
program->private_segment_buffer = Temp(ctx.renames[program->private_segment_buffer.id()],
program->private_segment_buffer.regClass());
program->scratch_offset =
Temp(ctx.renames[program->scratch_offset.id()], program->scratch_offset.regClass());
for (auto& private_segment_buffer : program->private_segment_buffers) {
private_segment_buffer =
Temp(ctx.renames[private_segment_buffer.id()], private_segment_buffer.regClass());
}
for (auto& scratch_offset : program->scratch_offsets) {
scratch_offset =
Temp(ctx.renames[scratch_offset.id()], scratch_offset.regClass());
}
program->temp_rc = ctx.temp_rc;
}


@@ -1264,8 +1264,12 @@ schedule_program(Program* program)
ctx.num_waves = std::max<uint16_t>(ctx.num_waves / wave_fac, 1);
assert(ctx.num_waves > 0);
ctx.mv.max_registers = {int16_t(get_addr_vgpr_from_waves(program, ctx.num_waves * wave_fac) - 2),
int16_t(get_addr_sgpr_from_waves(program, ctx.num_waves * wave_fac))};
ctx.mv.max_registers = {
int16_t(get_addr_vgpr_from_waves(
program, std::max<uint16_t>(ctx.num_waves * wave_fac, program->min_waves)) -
2),
int16_t(get_addr_sgpr_from_waves(
program, std::max<uint16_t>(ctx.num_waves * wave_fac, program->min_waves)))};
/* NGG culling shaders are very sensitive to position export scheduling.
* Schedule less aggressively when early primitive export is used, and


@@ -213,7 +213,7 @@ get_vopd_info(const SchedILPContext& ctx, const Instruction* instr)
}
bool
is_vopd_compatible(const VOPDInfo& a, const VOPDInfo& b)
is_vopd_compatible(const VOPDInfo& a, const VOPDInfo& b, bool* swap)
{
if ((a.is_opy_only && b.is_opy_only) || (a.is_dst_odd == b.is_dst_odd))
return false;
@@ -222,6 +222,8 @@ is_vopd_compatible(const VOPDInfo& a, const VOPDInfo& b)
if (a.has_literal && b.has_literal && a.literal != b.literal)
return false;
*swap = false;
/* The rest is checking src VGPR bank compatibility. */
if ((a.src_banks & b.src_banks) == 0)
return true;
@@ -244,11 +246,13 @@ is_vopd_compatible(const VOPDInfo& a, const VOPDInfo& b)
if (b.op == aco_opcode::v_dual_mov_b32 && !a.is_commutative && a.is_opy_only)
return false;
*swap = true;
return true;
}
bool
can_use_vopd(const SchedILPContext& ctx, unsigned idx)
can_use_vopd(const SchedILPContext& ctx, unsigned idx, bool* prev_can_be_opx)
{
VOPDInfo cur_vopd = ctx.vopd[idx];
Instruction* first = ctx.nodes[idx].instr;
@@ -260,9 +264,14 @@ can_use_vopd(const SchedILPContext& ctx, unsigned idx)
if (ctx.prev_vopd_info.op == aco_opcode::num_opcodes || cur_vopd.op == aco_opcode::num_opcodes)
return false;
if (!is_vopd_compatible(ctx.prev_vopd_info, cur_vopd))
bool swap = false;
if (!is_vopd_compatible(ctx.prev_vopd_info, cur_vopd, &swap))
return false;
/* If we have to swap a v_mov_b32, it will become an OPY-only opcode. */
if (swap && !ctx.prev_vopd_info.is_commutative && cur_vopd.op == aco_opcode::v_dual_mov_b32)
cur_vopd.is_opy_only = true;
assert(first->definitions.size() == 1);
assert(first->definitions[0].size() == 1);
assert(second->definitions.size() == 1);
@@ -279,8 +288,23 @@ can_use_vopd(const SchedILPContext& ctx, unsigned idx)
return false;
}
/* WaR dependencies are not a concern. */
return true;
/* WaR dependencies are not a concern before GFX12. */
*prev_can_be_opx = true;
if (ctx.program->gfx_level >= GFX12) {
/* From RDNA4 ISA doc:
* "The OPX instruction must not overwrite sources of the OPY instruction".
*/
bool war = false;
for (Operand op : first->operands) {
assert(op.size() == 1);
if (second->definitions[0].physReg() == op.physReg())
war = true;
}
if (war)
*prev_can_be_opx = false;
}
return *prev_can_be_opx || !cur_vopd.is_opy_only;
}
Instruction_cycle_info
@@ -619,9 +643,9 @@ select_instruction_ilp(const SchedILPContext& ctx)
bool
compare_nodes_vopd(const SchedILPContext& ctx, int num_vopd_odd_minus_even, bool* use_vopd,
unsigned current, unsigned candidate)
bool* prev_can_be_opx, unsigned current, unsigned candidate)
{
if (can_use_vopd(ctx, candidate)) {
if (can_use_vopd(ctx, candidate, prev_can_be_opx)) {
/* If we can form a VOPD instruction, always prefer to do so. */
if (!*use_vopd) {
*use_vopd = true;
@@ -657,7 +681,7 @@ compare_nodes_vopd(const SchedILPContext& ctx, int num_vopd_odd_minus_even, bool
}
unsigned
select_instruction_vopd(const SchedILPContext& ctx, bool* use_vopd)
select_instruction_vopd(const SchedILPContext& ctx, bool* use_vopd, bool* prev_can_be_opx)
{
*use_vopd = false;
@@ -679,11 +703,14 @@ select_instruction_vopd(const SchedILPContext& ctx, bool* use_vopd)
if (candidate.dependency_mask)
continue;
bool prev_can_be_opx_for_i;
if (cur == -1u) {
cur = i;
*use_vopd = can_use_vopd(ctx, i);
} else if (compare_nodes_vopd(ctx, num_vopd_odd_minus_even, use_vopd, cur, i)) {
*use_vopd = can_use_vopd(ctx, i, prev_can_be_opx);
} else if (compare_nodes_vopd(ctx, num_vopd_odd_minus_even, use_vopd, &prev_can_be_opx_for_i,
cur, i)) {
cur = i;
*prev_can_be_opx = prev_can_be_opx_for_i;
}
}
@@ -719,24 +746,29 @@ get_vopd_opcode_operands(const SchedILPContext& ctx, Instruction* instr, const V
}
Instruction*
create_vopd_instruction(const SchedILPContext& ctx, unsigned idx)
create_vopd_instruction(const SchedILPContext& ctx, unsigned idx, bool prev_can_be_opx)
{
Instruction* x = ctx.prev_info.instr;
Instruction* y = ctx.nodes[idx].instr;
VOPDInfo x_info = ctx.prev_vopd_info;
VOPDInfo y_info = ctx.vopd[idx];
x_info.is_opy_only |= !prev_can_be_opx;
bool swap_x = false, swap_y = false;
if (x_info.src_banks & y_info.src_banks) {
assert(x_info.is_commutative || y_info.is_commutative);
/* Avoid swapping v_mov_b32 because it will become an OPY-only opcode. */
if (x_info.op == aco_opcode::v_dual_mov_b32 && !y_info.is_commutative) {
if (x_info.op == aco_opcode::v_dual_mov_b32 && y_info.op == aco_opcode::v_dual_mov_b32) {
swap_x = x_info.is_opy_only;
swap_y = !swap_x;
} else if (x_info.op == aco_opcode::v_dual_mov_b32 && !y_info.is_commutative) {
swap_x = true;
x_info.is_opy_only = true;
} else {
swap_x = x_info.is_commutative && x_info.op != aco_opcode::v_dual_mov_b32;
swap_y = y_info.is_commutative && !swap_x;
}
y_info.is_opy_only |= swap_y && y_info.op == aco_opcode::v_dual_mov_b32;
}
if (x_info.is_opy_only) {
@@ -744,6 +776,7 @@ create_vopd_instruction(const SchedILPContext& ctx, unsigned idx)
std::swap(x_info, y_info);
std::swap(swap_x, swap_y);
}
assert(!x_info.is_opy_only);
aco_opcode x_op, y_op;
unsigned num_operands = 0;
@@ -774,14 +807,15 @@ do_schedule(SchedILPContext& ctx, It& insert_it, It& remove_it, It instructions_
ctx.prev_info.instr = NULL;
bool use_vopd = false;
bool prev_can_be_opx;
while (ctx.active_mask) {
unsigned next_idx =
ctx.is_vopd ? select_instruction_vopd(ctx, &use_vopd) : select_instruction_ilp(ctx);
unsigned next_idx = ctx.is_vopd ? select_instruction_vopd(ctx, &use_vopd, &prev_can_be_opx)
: select_instruction_ilp(ctx);
Instruction* next_instr = ctx.nodes[next_idx].instr;
if (use_vopd) {
std::prev(insert_it)->reset(create_vopd_instruction(ctx, next_idx));
std::prev(insert_it)->reset(create_vopd_instruction(ctx, next_idx, prev_can_be_opx));
ctx.prev_info.instr = NULL;
} else {
(insert_it++)->reset(next_instr);
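
The GFX12 write-after-read rule added to can_use_vopd above can be illustrated in isolation. The following is a minimal sketch in plain C, not ACO code: `toy_instr`, `has_war_hazard` and `toy_can_pair` are hypothetical names, registers are plain integers, and the bank and opcode compatibility checks are omitted. It only mirrors the final `prev_can_be_opx || !is_opy_only` decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an instruction: one destination
 * register and up to three source registers, as plain register indices. */
struct toy_instr {
   unsigned def_reg;
   unsigned src_regs[3];
   size_t num_srcs;
};

/* Returns true if `second` overwrites a register that `first` reads,
 * i.e. there is a write-after-read (WaR) hazard between the pair. */
static bool
has_war_hazard(const struct toy_instr *first, const struct toy_instr *second)
{
   assert(first->num_srcs <= 3);
   for (size_t i = 0; i < first->num_srcs; i++) {
      if (second->def_reg == first->src_regs[i])
         return true;
   }
   return false;
}

/* `first` is the candidate instruction, `second` the previously selected one.
 * On GFX12 a WaR hazard means `second` may not take the OPX slot, so the
 * pair is only possible if `first` is allowed to be OPX itself. */
static bool
toy_can_pair(bool is_gfx12, const struct toy_instr *first,
             const struct toy_instr *second, bool first_is_opy_only,
             bool *prev_can_be_opx)
{
   *prev_can_be_opx = !(is_gfx12 && has_war_hazard(first, second));
   return *prev_can_be_opx || !first_is_opy_only;
}
```

With the registers from the vopd_sched.war test (v0 = 256, v1 = 257, v3 = 259), a multiply reading v1 followed by an add writing v1 still pairs on GFX12, but only with the multiply in OPX; if the first instruction is OPY-only, pairing is rejected.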


@@ -88,13 +88,16 @@ struct spill_ctx {
unsigned vgpr_spill_slots;
Temp scratch_rsrc;
unsigned resume_idx;
spill_ctx(const RegisterDemand target_pressure_, Program* program_)
: target_pressure(target_pressure_), program(program_), memory(),
renames(program->blocks.size(), aco::map<Temp, Temp>(memory)),
spills_entry(program->blocks.size(), aco::unordered_map<Temp, uint32_t>(memory)),
spills_exit(program->blocks.size(), aco::unordered_map<Temp, uint32_t>(memory)),
processed(program->blocks.size(), false), ssa_infos(program->peekAllocationId()),
remat(memory), wave_size(program->wave_size), sgpr_spill_slots(0), vgpr_spill_slots(0)
remat(memory), wave_size(program->wave_size), sgpr_spill_slots(0), vgpr_spill_slots(0),
resume_idx(0)
{}
void add_affinity(uint32_t first, uint32_t second)
@@ -1088,7 +1091,10 @@ spill_block(spill_ctx& ctx, unsigned block_idx)
Temp
load_scratch_resource(spill_ctx& ctx, Builder& bld, bool apply_scratch_offset)
{
Temp private_segment_buffer = ctx.program->private_segment_buffer;
Temp private_segment_buffer;
if (!ctx.program->private_segment_buffers.empty())
private_segment_buffer = ctx.program->private_segment_buffers[ctx.resume_idx];
if (!private_segment_buffer.bytes()) {
Temp addr_lo =
bld.sop1(aco_opcode::p_load_symbol, bld.def(s1), Operand::c32(aco_symbol_scratch_addr_lo));
@@ -1109,7 +1115,7 @@ load_scratch_resource(spill_ctx& ctx, Builder& bld, bool apply_scratch_offset)
Temp carry = bld.tmp(s1);
addr_lo = bld.sop2(aco_opcode::s_add_u32, bld.def(s1), bld.scc(Definition(carry)), addr_lo,
ctx.program->scratch_offset);
ctx.program->scratch_offsets[ctx.resume_idx]);
addr_hi = bld.sop2(aco_opcode::s_addc_u32, bld.def(s1), bld.def(s1, scc), addr_hi,
Operand::c32(0), bld.scc(carry));
@@ -1218,7 +1224,9 @@ spill_vgpr(spill_ctx& ctx, Block& block, std::vector<aco_ptr<Instruction>>& inst
uint32_t spill_id = spill->operands[1].constantValue();
uint32_t spill_slot = slots[spill_id];
Temp scratch_offset = ctx.program->scratch_offset;
Temp scratch_offset;
if (!ctx.program->scratch_offsets.empty())
scratch_offset = ctx.program->scratch_offsets[ctx.resume_idx];
unsigned offset;
setup_vgpr_spill_reload(ctx, block, instructions, spill_slot, scratch_offset, &offset);
@@ -1264,7 +1272,9 @@ reload_vgpr(spill_ctx& ctx, Block& block, std::vector<aco_ptr<Instruction>>& ins
uint32_t spill_id = reload->operands[0].constantValue();
uint32_t spill_slot = slots[spill_id];
Temp scratch_offset = ctx.program->scratch_offset;
Temp scratch_offset;
if (!ctx.program->scratch_offsets.empty())
scratch_offset = ctx.program->scratch_offsets[ctx.resume_idx];
unsigned offset;
setup_vgpr_spill_reload(ctx, block, instructions, spill_slot, scratch_offset, &offset);
@@ -1488,6 +1498,8 @@ assign_spill_slots(spill_ctx& ctx, unsigned spills_to_vgpr)
* we cannot reuse the current scratch_rsrc temp because its definition is unreachable */
if (block.linear_preds.empty())
ctx.scratch_rsrc = Temp();
if (block.kind & block_kind_resume)
++ctx.resume_idx;
}
std::vector<aco_ptr<Instruction>>::iterator it;


@@ -59,20 +59,13 @@ collect_phi_info(ssa_elimination_ctx& ctx)
void
insert_parallelcopies(ssa_elimination_ctx& ctx)
{
/* insert the parallelcopies from logical phis before p_logical_end */
/* insert the parallelcopies from logical phis before branch */
for (unsigned block_idx = 0; block_idx < ctx.program->blocks.size(); ++block_idx) {
auto& logical_phi_info = ctx.logical_phi_info[block_idx];
if (logical_phi_info.empty())
continue;
Block& block = ctx.program->blocks[block_idx];
unsigned idx = block.instructions.size() - 1;
while (block.instructions[idx]->opcode != aco_opcode::p_logical_end) {
assert(idx > 0);
idx--;
}
std::vector<aco_ptr<Instruction>>::iterator it = std::next(block.instructions.begin(), idx);
aco_ptr<Instruction> pc{create_instruction(aco_opcode::p_parallelcopy, Format::PSEUDO,
logical_phi_info.size(), logical_phi_info.size())};
unsigned i = 0;
@@ -82,6 +75,7 @@ insert_parallelcopies(ssa_elimination_ctx& ctx)
i++;
}
pc->pseudo().needs_scratch_reg = false;
auto it = std::prev(block.instructions.end());
block.instructions.insert(it, std::move(pc));
}


@@ -340,6 +340,81 @@ BEGIN_TEST(insert_waitcnt.waw.vmem_types)
}
END_TEST
BEGIN_TEST(insert_waitcnt.waw.point_sample_accel)
if (!setup_cs(NULL, GFX11_5))
return;
Definition def_v4(PhysReg(260), v1);
Operand op_v0(PhysReg(256), v1);
Operand desc_s4(PhysReg(0), s4);
Operand desc_s8(PhysReg(8), s8);
/* image_sample has point sample acceleration, but image_sample_b does not. Both are VMEM sample
* instructions. */
//>> p_unit_test 0
//! v1: %0:v[4] = image_sample %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
//! s_waitcnt vmcnt(0)
//! v1: %0:v[4] = image_sample %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(0));
bld.mimg(aco_opcode::image_sample, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
bld.mimg(aco_opcode::image_sample, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
//>> p_unit_test 1
//! v1: %0:v[4] = image_sample_b %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
//! s_waitcnt vmcnt(0)
//! v1: %0:v[4] = image_sample %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
bld.reset(program->create_and_insert_block());
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(1));
bld.mimg(aco_opcode::image_sample_b, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
bld.mimg(aco_opcode::image_sample, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
//>> p_unit_test 2
//! v1: %0:v[4] = image_load %0:s[8-15], s4: undef, v1: undef, %0:v[0] 1d
//! s_waitcnt vmcnt(0)
//! v1: %0:v[4] = image_sample %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
bld.reset(program->create_and_insert_block());
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(2));
bld.mimg(aco_opcode::image_load, def_v4, desc_s8, Operand(s4), Operand(v1), op_v0);
bld.mimg(aco_opcode::image_sample, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
//>> p_unit_test 3
//! v1: %0:v[4] = image_sample %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
//! s_waitcnt vmcnt(0)
//! v1: %0:v[4] = image_sample_b %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
bld.reset(program->create_and_insert_block());
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(3));
bld.mimg(aco_opcode::image_sample, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
bld.mimg(aco_opcode::image_sample_b, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
//>> p_unit_test 4
//! v1: %0:v[4] = image_sample %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
//! s_waitcnt vmcnt(0)
//! v1: %0:v[4] = image_load %0:s[8-15], s4: undef, v1: undef, %0:v[0] 1d
bld.reset(program->create_and_insert_block());
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(4));
bld.mimg(aco_opcode::image_sample, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
bld.mimg(aco_opcode::image_load, def_v4, desc_s8, Operand(s4), Operand(v1), op_v0);
//>> p_unit_test 5
//! v1: %0:v[4] = image_sample_b %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
//! v1: %0:v[4] = image_sample_b %0:s[8-15], %0:s[0-3], v1: undef, %0:v[0] 1d
bld.reset(program->create_and_insert_block());
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(5));
bld.mimg(aco_opcode::image_sample_b, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
bld.mimg(aco_opcode::image_sample_b, def_v4, desc_s8, desc_s4, Operand(v1), op_v0);
//>> p_unit_test 6
//! v1: %0:v[4] = image_load %0:s[8-15], s4: undef, v1: undef, %0:v[0] 1d
//! v1: %0:v[4] = image_load %0:s[8-15], s4: undef, v1: undef, %0:v[0] 1d
bld.reset(program->create_and_insert_block());
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(6));
bld.mimg(aco_opcode::image_load, def_v4, desc_s8, Operand(s4), Operand(v1), op_v0);
bld.mimg(aco_opcode::image_load, def_v4, desc_s8, Operand(s4), Operand(v1), op_v0);
finish_waitcnt_test();
END_TEST
BEGIN_TEST(insert_waitcnt.vmem)
if (!setup_cs(NULL, GFX12))
return;


@@ -2083,3 +2083,18 @@ BEGIN_TEST(optimizer.trans_inline_constant)
finish_opt_test();
END_TEST
BEGIN_TEST(optimizer.trans_no_omod)
//>> s1: %a = p_startpgm
if (!setup_cs("s1", GFX12))
return;
//! s1: %tmp0 = v_s_log_f32 %a
//! v1: %res = v_mul_legacy_f32 %tmp0, 0.5
//! p_unit_test 0, %res
Temp dst = bld.vop3(aco_opcode::v_s_log_f32, bld.def(s1), inputs[0]);
writeout(0, bld.vop2(aco_opcode::v_mul_legacy_f32, bld.def(v1), dst,
bld.copy(bld.def(v1), Operand::c32(0x3f000000))));
finish_opt_test();
END_TEST


@@ -153,3 +153,46 @@ BEGIN_TEST(vopd_sched.mov_to_add_bfrev)
finish_schedule_vopd_test();
END_TEST
BEGIN_TEST(vopd_sched.war)
for (amd_gfx_level gfx : {GFX11, GFX12}) {
if (!setup_cs(NULL, gfx, CHIP_UNKNOWN, "", 32))
continue;
PhysReg reg_v0{256};
PhysReg reg_v1{257};
PhysReg reg_v3{259};
PhysReg reg_v5{261};
//>> p_unit_test 0
//~gfx11! v1: %0:v[1] = v_dual_add_f32 %0:v[3], %0:v[1] :: v1: %0:v[0] = v_dual_mul_f32 %0:v[1], %0:v[3]
//~gfx12! v1: %0:v[0] = v_dual_mul_f32 %0:v[1], %0:v[3] :: v1: %0:v[1] = v_dual_add_f32 %0:v[3], %0:v[1]
bld.pseudo(aco_opcode::p_unit_test, Operand::zero());
bld.vop2(aco_opcode::v_mul_f32, Definition(reg_v0, v1), Operand(reg_v1, v1),
Operand(reg_v3, v1));
bld.vop2(aco_opcode::v_add_f32, Definition(reg_v1, v1), Operand(reg_v3, v1),
Operand(reg_v1, v1));
/* We can't use OPX for the v_mul_f32 because of the WaR, but we also can't use OPX for the
* v_add_u32 because that opcode is OPY-only. */
//>> p_unit_test 1
//~gfx11! v1: %0:v[1] = v_dual_mul_f32 %0:v[3], %0:v[1] :: v1: %0:v[0] = v_dual_add_nc_u32 %0:v[1], %0:v[3]
//~gfx12! v1: %0:v[0] = v_add_u32 %0:v[1], %0:v[3]
//~gfx12! v1: %0:v[1] = v_mul_f32 %0:v[3], %0:v[1]
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(1));
bld.vop2(aco_opcode::v_add_u32, Definition(reg_v0, v1), Operand(reg_v1, v1),
Operand(reg_v3, v1));
bld.vop2(aco_opcode::v_mul_f32, Definition(reg_v1, v1), Operand(reg_v3, v1),
Operand(reg_v1, v1));
/* Test that we swap the right v_mov_b32. */
//>> p_unit_test 2
//~gfx11! v1: %0:v[1] = v_dual_mov_b32 %0:v[5] :: v1: %0:v[0] = v_dual_add_nc_u32 0, %0:v[1]
//~gfx12! v1: %0:v[0] = v_dual_mov_b32 %0:v[1] :: v1: %0:v[1] = v_dual_add_nc_u32 0, %0:v[5]
bld.pseudo(aco_opcode::p_unit_test, Operand::c32(2));
bld.vop1(aco_opcode::v_mov_b32, Definition(reg_v0, v1), Operand(reg_v1, v1));
bld.vop1(aco_opcode::v_mov_b32, Definition(reg_v1, v1), Operand(reg_v5, v1));
finish_schedule_vopd_test();
}
END_TEST


@@ -1578,11 +1578,14 @@ static void visit_store_ssbo(struct ac_nir_context *ctx, nir_intrinsic_instr *in
num_bytes = 16;
}
/* check alignment of 16 Bit stores */
if (elem_size_bytes == 2 && num_bytes > 2 && (start % 2) == 1) {
writemask |= ((1u << (count - 1)) - 1u) << (start + 1);
/* check alignment of 8/16 Bit stores */
uint32_t align_mul = nir_intrinsic_align_mul(instr);
uint32_t align_offset = nir_intrinsic_align_offset(instr) + start * elem_size_bytes;
uint32_t align = nir_combined_align(align_mul, align_offset & (align_mul - 1));
if (align < MIN2(num_bytes, 4) || (ctx->ac.gfx_level == GFX6 && elem_size_bytes < 4)) {
writemask |= BITFIELD_RANGE(start + 1, count - 1);
count = 1;
num_bytes = 2;
num_bytes = elem_size_bytes;
}
/* Due to alignment issues, split stores of 8-bit/16-bit
@@ -1882,10 +1885,17 @@ static LLVMValueRef visit_load_global(struct ac_nir_context *ctx,
val = LLVMBuildLoad2(ctx->ac.builder, result_type, addr, "");
if (nir_intrinsic_access(instr) & (ACCESS_COHERENT | ACCESS_VOLATILE)) {
/* From the LLVM 21.0.0 language reference:
* > An alignment value higher than the size of the loaded type implies memory up to the
* > alignment value bytes can be safely loaded without trapping in the default address space.
* So limit the alignment to the access size, since this isn't true in NIR.
*/
uint32_t align = nir_intrinsic_align(instr);
uint32_t size = ac_get_type_size(result_type);
LLVMSetAlignment(val, MIN2(align, 1 << (ffs(size) - 1)));
if (nir_intrinsic_access(instr) & (ACCESS_COHERENT | ACCESS_VOLATILE))
LLVMSetOrdering(val, LLVMAtomicOrderingMonotonic);
LLVMSetAlignment(val, ac_get_type_size(result_type));
}
return val;
}
@@ -1904,10 +1914,12 @@ static void visit_store_global(struct ac_nir_context *ctx,
val = LLVMBuildStore(ctx->ac.builder, data, addr);
if (nir_intrinsic_access(instr) & (ACCESS_COHERENT | ACCESS_VOLATILE)) {
uint32_t align = nir_intrinsic_align(instr);
uint32_t size = ac_get_type_size(type);
LLVMSetAlignment(val, MIN2(align, 1 << (ffs(size) - 1)));
if (nir_intrinsic_access(instr) & (ACCESS_COHERENT | ACCESS_VOLATILE))
LLVMSetOrdering(val, LLVMAtomicOrderingMonotonic);
LLVMSetAlignment(val, ac_get_type_size(type));
}
}
static LLVMValueRef visit_global_atomic(struct ac_nir_context *ctx,
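
The alignment clamp used in the global load and store hunks above, `MIN2(align, 1 << (ffs(size) - 1))`, limits the declared alignment to the largest power of two that divides the access size. Below is a minimal standalone sketch of that arithmetic. `clamp_align_to_access` is a hypothetical helper name, and the lowest set bit is computed with `size & -size` instead of `ffs()` to avoid a POSIX dependency; the two forms are equivalent for non-zero sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a declared alignment to the largest power of two dividing the
 * access size, so the alignment never promises more readable bytes than
 * the access actually covers. Hypothetical helper for illustration only. */
static uint32_t
clamp_align_to_access(uint32_t align, uint32_t size_bytes)
{
   assert(size_bytes > 0);
   /* Lowest set bit of size_bytes, same value as 1 << (ffs(size) - 1). */
   uint32_t size_pot = size_bytes & (~size_bytes + 1u);
   return align < size_pot ? align : size_pot;
}
```

For example, a 12-byte access (three dwords) declared with 16-byte alignment is clamped to 4, while a 16-byte access with 4-byte alignment keeps its alignment of 4.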


@@ -42,45 +42,47 @@ main()
uint32_t ir_leaf_node_size;
uint32_t output_leaf_node_size;
switch (args.geometry_type) {
case VK_GEOMETRY_TYPE_TRIANGLES_KHR: {
ir_leaf_node_size = SIZEOF(vk_ir_triangle_node);
output_leaf_node_size = SIZEOF(radv_bvh_triangle_node);
if (gl_GlobalInvocationID.x < args.leaf_node_count) {
switch (args.geometry_type) {
case VK_GEOMETRY_TYPE_TRIANGLES_KHR: {
ir_leaf_node_size = SIZEOF(vk_ir_triangle_node);
output_leaf_node_size = SIZEOF(radv_bvh_triangle_node);
vk_ir_triangle_node src_node =
DEREF(REF(vk_ir_triangle_node)(OFFSET(args.intermediate_bvh, gl_GlobalInvocationID.x * ir_leaf_node_size)));
REF(radv_bvh_triangle_node) dst_node =
REF(radv_bvh_triangle_node)(OFFSET(args.output_bvh, dst_leaf_offset + gl_GlobalInvocationID.x * output_leaf_node_size));
vk_ir_triangle_node src_node =
DEREF(REF(vk_ir_triangle_node)(OFFSET(args.intermediate_bvh, gl_GlobalInvocationID.x * ir_leaf_node_size)));
REF(radv_bvh_triangle_node) dst_node =
REF(radv_bvh_triangle_node)(OFFSET(args.output_bvh, dst_leaf_offset + gl_GlobalInvocationID.x * output_leaf_node_size));
DEREF(dst_node).coords = src_node.coords;
DEREF(dst_node).triangle_id = src_node.triangle_id;
DEREF(dst_node).geometry_id_and_flags = src_node.geometry_id_and_flags;
DEREF(dst_node).id = 9;
DEREF(dst_node).coords = src_node.coords;
DEREF(dst_node).triangle_id = src_node.triangle_id;
DEREF(dst_node).geometry_id_and_flags = src_node.geometry_id_and_flags;
DEREF(dst_node).id = 9;
break;
}
case VK_GEOMETRY_TYPE_AABBS_KHR: {
ir_leaf_node_size = SIZEOF(vk_ir_aabb_node);
output_leaf_node_size = SIZEOF(radv_bvh_aabb_node);
break;
}
case VK_GEOMETRY_TYPE_AABBS_KHR: {
ir_leaf_node_size = SIZEOF(vk_ir_aabb_node);
output_leaf_node_size = SIZEOF(radv_bvh_aabb_node);
vk_ir_aabb_node src_node =
DEREF(REF(vk_ir_aabb_node)(OFFSET(args.intermediate_bvh, gl_GlobalInvocationID.x * ir_leaf_node_size)));
REF(radv_bvh_aabb_node) dst_node =
REF(radv_bvh_aabb_node)(OFFSET(args.output_bvh, dst_leaf_offset + gl_GlobalInvocationID.x * output_leaf_node_size));
vk_ir_aabb_node src_node =
DEREF(REF(vk_ir_aabb_node)(OFFSET(args.intermediate_bvh, gl_GlobalInvocationID.x * ir_leaf_node_size)));
REF(radv_bvh_aabb_node) dst_node =
REF(radv_bvh_aabb_node)(OFFSET(args.output_bvh, dst_leaf_offset + gl_GlobalInvocationID.x * output_leaf_node_size));
DEREF(dst_node).primitive_id = src_node.primitive_id;
DEREF(dst_node).geometry_id_and_flags = src_node.geometry_id_and_flags;
DEREF(dst_node).primitive_id = src_node.primitive_id;
DEREF(dst_node).geometry_id_and_flags = src_node.geometry_id_and_flags;
break;
}
default:
/* instances */
ir_leaf_node_size = SIZEOF(vk_ir_instance_node);
output_leaf_node_size = SIZEOF(radv_bvh_instance_node);
/* Instance nodes have to be emitted inside the loop since encoding them
* loads an address from the IR node which is uninitialized for inactive nodes.
*/
break;
break;
}
default:
/* instances */
ir_leaf_node_size = SIZEOF(vk_ir_instance_node);
output_leaf_node_size = SIZEOF(radv_bvh_instance_node);
/* Instance nodes have to be emitted inside the loop since encoding them
* loads an address from the IR node which is uninitialized for inactive nodes.
*/
break;
}
}
if (gl_GlobalInvocationID.x >= DEREF(args.header).ir_internal_node_count)


@@ -13,13 +13,6 @@
#include "vk_pipeline_cache.h"
#include "vk_util.h"
#include <fcntl.h>
#include <limits.h>
#ifndef _WIN32
#include <pwd.h>
#endif
#include <sys/stat.h>
static void
radv_suspend_queries(struct radv_meta_saved_state *state, struct radv_cmd_buffer *cmd_buffer)
{
@@ -292,54 +285,11 @@ meta_free(void *_device, void *data)
device->vk.alloc.pfnFree(device->vk.alloc.pUserData, data);
}
#ifndef _WIN32
static bool
radv_builtin_cache_path(char *path)
{
char *xdg_cache_home = secure_getenv("XDG_CACHE_HOME");
const char *suffix = "/radv_builtin_shaders";
const char *suffix2 = "/.cache/radv_builtin_shaders";
struct passwd pwd, *result;
char path2[PATH_MAX + 1]; /* PATH_MAX is not a real max,but suffices here. */
int ret;
if (xdg_cache_home) {
ret = snprintf(path, PATH_MAX + 1, "%s%s%zd", xdg_cache_home, suffix, sizeof(void *) * 8);
return ret > 0 && ret < PATH_MAX + 1;
}
getpwuid_r(getuid(), &pwd, path2, PATH_MAX - strlen(suffix2), &result);
if (!result)
return false;
strcpy(path, pwd.pw_dir);
strcat(path, "/.cache");
if (mkdir(path, 0755) && errno != EEXIST)
return false;
ret = snprintf(path, PATH_MAX + 1, "%s%s%zd", pwd.pw_dir, suffix2, sizeof(void *) * 8);
return ret > 0 && ret < PATH_MAX + 1;
}
#endif
static uint32_t
num_cache_entries(VkPipelineCache cache)
{
struct set *s = vk_pipeline_cache_from_handle(cache)->object_cache;
if (!s)
return 0;
return s->entries;
}
static void
radv_load_meta_pipeline(struct radv_device *device)
radv_init_meta_cache(struct radv_device *device)
{
#ifndef _WIN32
char path[PATH_MAX + 1];
struct stat st;
void *data = NULL;
int fd = -1;
struct vk_pipeline_cache *cache = NULL;
const struct radv_physical_device *pdev = radv_device_physical(device);
struct vk_pipeline_cache *cache;
VkPipelineCacheCreateInfo create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO,
@@ -347,81 +297,12 @@ radv_load_meta_pipeline(struct radv_device *device)
struct vk_pipeline_cache_create_info info = {
.pCreateInfo = &create_info,
.skip_disk_cache = true,
.disk_cache = pdev->disk_cache_meta,
};
if (!radv_builtin_cache_path(path))
goto fail;
fd = open(path, O_RDONLY);
if (fd < 0)
goto fail;
if (fstat(fd, &st))
goto fail;
data = malloc(st.st_size);
if (!data)
goto fail;
if (read(fd, data, st.st_size) == -1)
goto fail;
create_info.initialDataSize = st.st_size;
create_info.pInitialData = data;
fail:
cache = vk_pipeline_cache_create(&device->vk, &info, NULL);
if (cache) {
if (cache)
device->meta_state.cache = vk_pipeline_cache_to_handle(cache);
device->meta_state.initial_cache_entries = num_cache_entries(device->meta_state.cache);
}
free(data);
if (fd >= 0)
close(fd);
#endif
}
static void
radv_store_meta_pipeline(struct radv_device *device)
{
#ifndef _WIN32
char path[PATH_MAX + 1], path2[PATH_MAX + 7];
size_t size;
void *data = NULL;
if (device->meta_state.cache == VK_NULL_HANDLE)
return;
/* Skip serialization if no entries were added. */
if (num_cache_entries(device->meta_state.cache) <= device->meta_state.initial_cache_entries)
return;
if (vk_common_GetPipelineCacheData(radv_device_to_handle(device), device->meta_state.cache, &size, NULL))
return;
if (!radv_builtin_cache_path(path))
return;
strcpy(path2, path);
strcat(path2, "XXXXXX");
int fd = mkstemp(path2); // open(path, O_WRONLY | O_CREAT, 0600);
if (fd < 0)
return;
data = malloc(size);
if (!data)
goto fail;
if (vk_common_GetPipelineCacheData(radv_device_to_handle(device), device->meta_state.cache, &size, data))
goto fail;
if (write(fd, data, size) == -1)
goto fail;
rename(path2, path);
fail:
free(data);
close(fd);
unlink(path2);
#endif
}
VkResult
@@ -439,7 +320,7 @@ radv_device_init_meta(struct radv_device *device)
.pfnFree = meta_free,
};
radv_load_meta_pipeline(device);
radv_init_meta_cache(device);
result = vk_meta_device_init(&device->vk, &device->meta_state.device);
if (result != VK_SUCCESS)
@@ -488,7 +369,6 @@ radv_device_finish_meta(struct radv_device *device)
radv_device_finish_accel_struct_build_state(device);
radv_store_meta_pipeline(device);
vk_common_DestroyPipelineCache(radv_device_to_handle(device), device->meta_state.cache, NULL);
mtx_destroy(&device->meta_state.mtx);
@@ -612,8 +492,8 @@ radv_break_on_count(nir_builder *b, nir_variable *var, nir_def *count)
VkResult
radv_meta_get_noop_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-noop";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_NOOP;
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, NULL, key_data, strlen(key_data),
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, NULL, &key, sizeof(key),
layout_out);
}


@@ -102,6 +102,52 @@ radv_meta_dst_layout_to_layout(enum radv_meta_dst_layout layout)
extern const VkFormat radv_fs_key_format_exemplars[NUM_META_FS_KEYS];
enum radv_meta_object_key_type {
RADV_META_OBJECT_KEY_NOOP = VK_META_OBJECT_KEY_DRIVER_OFFSET,
RADV_META_OBJECT_KEY_BLIT,
RADV_META_OBJECT_KEY_BLIT2D,
RADV_META_OBJECT_KEY_BLIT2D_COLOR,
RADV_META_OBJECT_KEY_BLIT2D_DEPTH,
RADV_META_OBJECT_KEY_BLIT2D_STENCIL,
RADV_META_OBJECT_KEY_FILL_BUFFER,
RADV_META_OBJECT_KEY_COPY_BUFFER,
RADV_META_OBJECT_KEY_COPY_IMAGE_TO_BUFFER,
RADV_META_OBJECT_KEY_COPY_BUFFER_TO_IMAGE,
RADV_META_OBJECT_KEY_COPY_BUFFER_TO_IMAGE_R32G32B32,
RADV_META_OBJECT_KEY_COPY_IMAGE,
RADV_META_OBJECT_KEY_COPY_IMAGE_R32G32B32,
RADV_META_OBJECT_KEY_COPY_VRS_HTILE,
RADV_META_OBJECT_KEY_CLEAR_CS,
RADV_META_OBJECT_KEY_CLEAR_CS_R32G32B32,
RADV_META_OBJECT_KEY_CLEAR_COLOR,
RADV_META_OBJECT_KEY_CLEAR_DS,
RADV_META_OBJECT_KEY_CLEAR_HTILE,
RADV_META_OBJECT_KEY_CLEAR_DCC_COMP_TO_SINGLE,
RADV_META_OBJECT_KEY_FAST_CLEAR_ELIMINATE,
RADV_META_OBJECT_KEY_DCC_DECOMPRESS,
RADV_META_OBJECT_KEY_DCC_RETILE,
RADV_META_OBJECT_KEY_HTILE_EXPAND_GFX,
RADV_META_OBJECT_KEY_HTILE_EXPAND_CS,
RADV_META_OBJECT_KEY_FMASK_COPY,
RADV_META_OBJECT_KEY_FMASK_EXPAND,
RADV_META_OBJECT_KEY_FMASK_DECOMPRESS,
RADV_META_OBJECT_KEY_RESOLVE_HW,
RADV_META_OBJECT_KEY_RESOLVE_CS,
RADV_META_OBJECT_KEY_RESOLVE_COLOR_CS,
RADV_META_OBJECT_KEY_RESOLVE_DS_CS,
RADV_META_OBJECT_KEY_RESOLVE_FS,
RADV_META_OBJECT_KEY_RESOLVE_COLOR_FS,
RADV_META_OBJECT_KEY_RESOLVE_DS_FS,
RADV_META_OBJECT_KEY_DGC,
RADV_META_OBJECT_KEY_QUERY,
RADV_META_OBJECT_KEY_QUERY_OCCLUSION,
RADV_META_OBJECT_KEY_QUERY_PIPELINE_STATS,
RADV_META_OBJECT_KEY_QUERY_TFB,
RADV_META_OBJECT_KEY_QUERY_TIMESTAMP,
RADV_META_OBJECT_KEY_QUERY_PRIMS_GEN,
RADV_META_OBJECT_KEY_QUERY_MESH_PRIMS_GEN,
};
VkResult radv_device_init_meta(struct radv_device *device);
void radv_device_finish_meta(struct radv_device *device);


@@ -165,7 +165,7 @@ translate_sampler_dim(VkImageType type)
static VkResult
get_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-blit";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_BLIT;
const VkDescriptorSetLayoutBinding binding = {
.binding = 0,
@@ -183,10 +183,17 @@ get_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
const VkPushConstantRange pc_range = {VK_SHADER_STAGE_VERTEX_BIT, 0, 20};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_blit_key {
enum radv_meta_object_key_type type;
VkImageAspectFlags aspects;
VkImageType image_type;
uint32_t fs_key;
};
static VkResult
get_pipeline(struct radv_device *device, const struct radv_image_view *src_iview,
const struct radv_image_view *dst_iview, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
@@ -195,18 +202,26 @@ get_pipeline(struct radv_device *device, const struct radv_image_view *src_iview
const struct radv_image *src_image = src_iview->image;
const struct radv_image *dst_image = dst_iview->image;
const enum glsl_sampler_dim tex_dim = translate_sampler_dim(src_image->vk.image_type);
unsigned fs_key = 0;
char key_data[64];
struct radv_blit_key key;
VkResult result;
result = get_pipeline_layout(device, layout_out);
if (result != VK_SUCCESS)
return result;
if (src_image->vk.aspects == VK_IMAGE_ASPECT_COLOR_BIT)
fs_key = radv_format_meta_fs_key(device, dst_image->vk.format);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_BLIT;
key.aspects = src_image->vk.aspects;
key.image_type = src_image->vk.image_type;
snprintf(key_data, sizeof(key_data), "radv-blit-%d-%d-%d", src_image->vk.aspects, src_image->vk.image_type, fs_key);
if (src_image->vk.aspects == VK_IMAGE_ASPECT_COLOR_BIT)
key.fs_key = radv_format_meta_fs_key(device, dst_image->vk.format);
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
}
nir_shader *fs;
nir_shader *vs = build_nir_vertex_shader(device);
@@ -331,7 +346,7 @@ get_pipeline(struct radv_device *device, const struct radv_image_view *src_iview
case VK_IMAGE_ASPECT_COLOR_BIT:
pipeline_create_info.pColorBlendState = &color_blend_info;
render.color_attachment_count = 1;
render.color_attachment_formats[0] = radv_fs_key_format_exemplars[fs_key];
render.color_attachment_formats[0] = radv_fs_key_format_exemplars[key.fs_key];
break;
case VK_IMAGE_ASPECT_DEPTH_BIT:
pipeline_create_info.pDepthStencilState = &depth_info;
@@ -346,7 +361,7 @@ get_pipeline(struct radv_device *device, const struct radv_image_view *src_iview
}
result = vk_meta_create_graphics_pipeline(&device->vk, &device->meta_state.device, &pipeline_create_info, &render,
key_data, strlen(key_data), pipeline_out);
&key, sizeof(key), pipeline_out);
ralloc_free(vs);
ralloc_free(fs);
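
The struct keys introduced above are passed as a pointer plus `sizeof(key)`, so they are presumably hashed and compared as raw bytes, which would explain why each one is zeroed with `memset` before its fields are filled. A minimal sketch of that pattern follows. `toy_key`, `toy_key_init` and `toy_key_equal` are hypothetical and not part of RADV or vk_meta, and the sketch assumes, as is the case with common compilers, that stores to individual members leave padding bytes untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key with interior padding on typical ABIs: a 1-byte field
 * followed by a 4-byte field. */
struct toy_key {
   uint8_t type;
   uint32_t fs_key;
};

/* Zero the whole object first so that padding bytes have a known value,
 * then fill in the fields that make up the key. */
static void
toy_key_init(struct toy_key *key, uint8_t type, uint32_t fs_key)
{
   memset(key, 0, sizeof(*key));
   key->type = type;
   key->fs_key = fs_key;
}

/* Byte-wise comparison, as a cache keyed on (pointer, size) would do. */
static bool
toy_key_equal(const struct toy_key *a, const struct toy_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the `memset`, two keys with identical fields could differ in their padding bytes and miss each other in a byte-wise lookup. The keys in this diff contain only 4-byte members, so there the zeroing is mostly a defensive habit.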


@@ -167,11 +167,21 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer, struct radv_meta
if (vk_format_is_color(src_img->image->vk.format) && vk_format_is_depth_or_stencil(dst->image->vk.format)) {
assert(src_img->aspect_mask == VK_IMAGE_ASPECT_COLOR_BIT);
src_aspect_mask = src_img->aspect_mask;
} else if (vk_format_is_depth_or_stencil(src_img->image->vk.format) &&
vk_format_is_color(dst->image->vk.format)) {
if (src_img->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
depth_format = vk_format_stencil_only(src_img->image->vk.format);
} else {
assert(src_img->aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT);
depth_format = vk_format_depth_only(src_img->image->vk.format);
}
}
}
struct radv_image_view dst_iview;
create_iview(cmd_buffer, dst, &dst_iview, depth_format, aspect_mask);
create_iview(cmd_buffer, dst, &dst_iview,
aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT) ? depth_format : 0,
aspect_mask);
const VkRenderingAttachmentInfo att_info = {
.sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO,
@@ -446,14 +456,22 @@ build_nir_copy_fragment_shader_stencil(struct radv_device *device, texel_fetch_b
return b.shader;
}
struct radv_blit2d_key {
enum radv_meta_object_key_type type;
uint32_t index;
};
static VkResult
create_layout(struct radv_device *device, int idx, VkPipelineLayout *layout_out)
{
const VkDescriptorType desc_type =
(idx == BLIT2D_SRC_TYPE_BUFFER) ? VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER : VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE;
char key_data[64];
snprintf(key_data, sizeof(key_data), "radv-blit2d-%d", idx);
struct radv_blit2d_key key;
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_BLIT2D;
key.index = idx;
const VkDescriptorSetLayoutBinding binding = {
.binding = 0,
@@ -474,16 +492,22 @@ create_layout(struct radv_device *device, int idx, VkPipelineLayout *layout_out)
.size = 20,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_blit2d_color_key {
enum radv_meta_object_key_type type;
enum blit2d_src_type src_type;
uint32_t log2_samples;
uint32_t fs_key;
};
static VkResult
get_color_pipeline(struct radv_device *device, enum blit2d_src_type src_type, VkFormat format, uint32_t log2_samples,
VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const unsigned fs_key = radv_format_meta_fs_key(device, format);
char key_data[64];
struct radv_blit2d_color_key key;
const char *name;
VkResult result;
@@ -491,7 +515,17 @@ get_color_pipeline(struct radv_device *device, enum blit2d_src_type src_type, Vk
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-blit2d-color-%d-%d-%d", src_type, log2_samples, fs_key);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_BLIT2D_COLOR;
key.src_type = src_type;
key.log2_samples = log2_samples;
key.fs_key = radv_format_meta_fs_key(device, format);
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
}
texel_fetch_build_func src_func;
switch (src_type) {
@@ -597,18 +631,24 @@ get_color_pipeline(struct radv_device *device, enum blit2d_src_type src_type, Vk
};
result = vk_meta_create_graphics_pipeline(&device->vk, &device->meta_state.device, &pipeline_create_info, &render,
key_data, strlen(key_data), pipeline_out);
&key, sizeof(key), pipeline_out);
ralloc_free(vs_module);
ralloc_free(fs_module);
return result;
}
struct radv_blit2d_ds_key {
enum radv_meta_object_key_type type;
enum blit2d_src_type src_type;
uint32_t log2_samples;
};
static VkResult
get_depth_only_pipeline(struct radv_device *device, enum blit2d_src_type src_type, uint32_t log2_samples,
VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
char key_data[64];
struct radv_blit2d_ds_key key;
const char *name;
VkResult result;
@@ -616,7 +656,16 @@ get_depth_only_pipeline(struct radv_device *device, enum blit2d_src_type src_typ
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-blit2d-depth-%d-%d", src_type, log2_samples);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_BLIT2D_DEPTH;
key.src_type = src_type;
key.log2_samples = log2_samples;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
}
texel_fetch_build_func src_func;
switch (src_type) {
@@ -746,7 +795,7 @@ get_depth_only_pipeline(struct radv_device *device, enum blit2d_src_type src_typ
};
result = vk_meta_create_graphics_pipeline(&device->vk, &device->meta_state.device, &pipeline_create_info, &render,
key_data, strlen(key_data), pipeline_out);
&key, sizeof(key), pipeline_out);
ralloc_free(vs_module);
ralloc_free(fs_module);
@@ -757,7 +806,7 @@ static VkResult
get_stencil_only_pipeline(struct radv_device *device, enum blit2d_src_type src_type, uint32_t log2_samples,
VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
char key_data[64];
struct radv_blit2d_ds_key key;
const char *name;
VkResult result;
@@ -765,7 +814,16 @@ get_stencil_only_pipeline(struct radv_device *device, enum blit2d_src_type src_t
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-blit2d-stencil-%d-%d", src_type, log2_samples);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_BLIT2D_STENCIL;
key.src_type = src_type;
key.log2_samples = log2_samples;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
}
texel_fetch_build_func src_func;
switch (src_type) {
@@ -890,7 +948,7 @@ get_stencil_only_pipeline(struct radv_device *device, enum blit2d_src_type src_t
};
result = vk_meta_create_graphics_pipeline(&device->vk, &device->meta_state.device, &pipeline_create_info, &render,
key_data, strlen(key_data), pipeline_out);
&key, sizeof(key), pipeline_out);
ralloc_free(vs_module);
ralloc_free(fs_module);
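
The hunks above replace formatted string keys with small structs that are passed to the cache by pointer and `sizeof`. Each struct is zeroed with `memset` before its fields are assigned. A minimal sketch of why that matters, assuming the cache compares the raw bytes of the key: the names `demo_key_type`, `demo_copy_image_key`, `demo_build_key` and `demo_keys_equal` are hypothetical stand-ins, not Mesa definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an object-key enum; not the Mesa type. */
enum demo_key_type { DEMO_KEY_COPY_IMAGE = 1 };

/* Hypothetical key struct shaped like the ones in the diff. The compiler
 * may add padding after the small members. */
struct demo_copy_image_key {
   enum demo_key_type type;
   bool src_3d;
   bool dst_3d;
   uint8_t samples_log2;
};

/* Zero the whole struct first so any padding bytes have a known value,
 * then fill in the fields that describe the pipeline variant. */
static void
demo_build_key(struct demo_copy_image_key *key, bool src_3d, bool dst_3d, uint8_t samples_log2)
{
   assert(key != NULL);
   memset(key, 0, sizeof(*key));
   key->type = DEMO_KEY_COPY_IMAGE;
   key->src_3d = src_3d;
   key->dst_3d = dst_3d;
   key->samples_log2 = samples_log2;
}

/* Assumption: the cache treats keys as opaque byte strings. */
static bool
demo_keys_equal(const struct demo_copy_image_key *a, const struct demo_copy_image_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the `memset`, two keys with identical fields could still differ in their padding bytes, and a byte-wise lookup could miss an entry that is already cached.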

@@ -39,7 +39,7 @@ struct fill_constants {
static VkResult
get_fill_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-fill-buffer";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_FILL_BUFFER;
VkResult result;
const VkPushConstantRange pc_range = {
@@ -47,12 +47,12 @@ get_fill_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipeli
.size = sizeof(struct fill_constants),
};
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, key_data,
strlen(key_data), layout_out);
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, &key, sizeof(key),
layout_out);
if (result != VK_SUCCESS)
return result;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -75,8 +75,8 @@ get_fill_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipeli
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -114,7 +114,7 @@ struct copy_constants {
static VkResult
get_copy_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-copy-buffer";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_COPY_BUFFER;
VkResult result;
const VkPushConstantRange pc_range = {
@@ -122,12 +122,12 @@ get_copy_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipeli
.size = sizeof(struct copy_constants),
};
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, key_data,
strlen(key_data), layout_out);
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, &key, sizeof(key),
layout_out);
if (result != VK_SUCCESS)
return result;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -150,8 +150,8 @@ get_copy_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipeli
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -301,11 +301,21 @@ radv_CmdFillBuffer(VkCommandBuffer commandBuffer, VkBuffer dstBuffer, VkDeviceSi
{
VK_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
VK_FROM_HANDLE(radv_buffer, dst_buffer, dstBuffer);
bool old_predicating;
/* VK_EXT_conditional_rendering says that copy commands should not be
* affected by conditional rendering.
*/
old_predicating = cmd_buffer->state.predicating;
cmd_buffer->state.predicating = false;
fillSize = vk_buffer_range(&dst_buffer->vk, dstOffset, fillSize) & ~3ull;
radv_fill_buffer(cmd_buffer, NULL, dst_buffer->bo,
radv_buffer_get_va(dst_buffer->bo) + dst_buffer->offset + dstOffset, fillSize, data);
/* Restore conditional rendering. */
cmd_buffer->state.predicating = old_predicating;
}
static void
@@ -369,6 +379,7 @@ radv_CmdUpdateBuffer(VkCommandBuffer commandBuffer, VkBuffer dstBuffer, VkDevice
VK_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
VK_FROM_HANDLE(radv_buffer, dst_buffer, dstBuffer);
struct radv_device *device = radv_cmd_buffer_device(cmd_buffer);
bool old_predicating;
uint64_t va = radv_buffer_get_va(dst_buffer->bo);
va += dstOffset + dst_buffer->offset;
@@ -378,6 +389,12 @@ radv_CmdUpdateBuffer(VkCommandBuffer commandBuffer, VkBuffer dstBuffer, VkDevice
if (!dataSize)
return;
/* VK_EXT_conditional_rendering says that copy commands should not be
* affected by conditional rendering.
*/
old_predicating = cmd_buffer->state.predicating;
cmd_buffer->state.predicating = false;
if (dataSize < RADV_BUFFER_UPDATE_THRESHOLD && cmd_buffer->qf != RADV_QUEUE_TRANSFER) {
radv_cs_add_buffer(device->ws, cmd_buffer->cs, dst_buffer->bo);
radv_update_buffer_cp(cmd_buffer, va, pData, dataSize);
@@ -387,4 +404,7 @@ radv_CmdUpdateBuffer(VkCommandBuffer commandBuffer, VkBuffer dstBuffer, VkDevice
radv_copy_buffer(cmd_buffer, cmd_buffer->upload.upload_bo, dst_buffer->bo, buf_offset,
dstOffset + dst_buffer->offset, dataSize);
}
/* Restore conditional rendering. */
cmd_buffer->state.predicating = old_predicating;
}
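
The `radv_CmdFillBuffer` and `radv_CmdUpdateBuffer` hunks above share one pattern: save the predication flag, clear it for the duration of the transfer, then restore it. The sketch below shows that pattern in isolation. `demo_cmd_state`, `demo_emit_copy` and `demo_update_buffer` are hypothetical names for illustration and do not mirror the driver's actual structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified command-buffer state. */
struct demo_cmd_state {
   bool predicating;     /* true while conditional rendering is active */
   int unpredicated_ops; /* counts operations emitted without predication */
};

/* Stand-in for a transfer operation that honours the predication flag. */
static void
demo_emit_copy(struct demo_cmd_state *state)
{
   if (!state->predicating)
      state->unpredicated_ops++;
}

/* Save the flag, force it off around the copy, then restore it.
 * The early return happens before the flag is touched, so the
 * caller's state stays intact on that path as well. */
static void
demo_update_buffer(struct demo_cmd_state *state, unsigned size)
{
   if (!size)
      return;

   const bool old_predicating = state->predicating;
   state->predicating = false;

   demo_emit_copy(state);

   state->predicating = old_predicating;
}
```

Placing the save after the early return, as the diff does in `radv_CmdUpdateBuffer`, means there is no exit path on which the flag is left cleared.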

@@ -58,7 +58,7 @@ build_nir_itob_compute_shader(struct radv_device *dev, bool is_3d)
static VkResult
get_itob_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-itob";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_COPY_IMAGE_TO_BUFFER;
const VkDescriptorSetLayoutBinding bindings[] = {
{
@@ -87,25 +87,32 @@ get_itob_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_ou
.size = 16,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_copy_buffer_image_key {
enum radv_meta_object_key_type type;
bool is_3d;
};
static VkResult
get_itob_pipeline(struct radv_device *device, const struct radv_image *image, VkPipeline *pipeline_out,
VkPipelineLayout *layout_out)
{
const bool is_3d = image->vk.image_type == VK_IMAGE_TYPE_3D;
char key_data[64];
struct radv_copy_buffer_image_key key;
VkResult result;
result = get_itob_pipeline_layout(device, layout_out);
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-itob-%d", is_3d);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_COPY_IMAGE_TO_BUFFER;
key.is_3d = is_3d;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -128,8 +135,8 @@ get_itob_pipeline(struct radv_device *device, const struct radv_image *image, Vk
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -178,7 +185,7 @@ build_nir_btoi_compute_shader(struct radv_device *dev, bool is_3d)
static VkResult
get_btoi_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-btoi";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_COPY_BUFFER_TO_IMAGE;
const VkDescriptorSetLayoutBinding bindings[] = {
{
@@ -207,8 +214,8 @@ get_btoi_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_ou
.size = 16,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key, sizeof(key),
layout_out);
}
static VkResult
@@ -216,16 +223,18 @@ get_btoi_pipeline(struct radv_device *device, const struct radv_image *image, Vk
VkPipelineLayout *layout_out)
{
const bool is_3d = image->vk.image_type == VK_IMAGE_TYPE_3D;
char key_data[64];
struct radv_copy_buffer_image_key key;
VkResult result;
result = get_btoi_pipeline_layout(device, layout_out);
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-btoi-%d", is_3d);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_COPY_BUFFER_TO_IMAGE;
key.is_3d = is_3d;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -248,8 +257,8 @@ get_btoi_pipeline(struct radv_device *device, const struct radv_image *image, Vk
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -306,7 +315,7 @@ build_nir_btoi_r32g32b32_compute_shader(struct radv_device *dev)
static VkResult
get_btoi_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-btoi-r32g32b32";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_COPY_BUFFER_TO_IMAGE_R32G32B32;
VkResult result;
const VkDescriptorSetLayoutBinding bindings[] = {
@@ -336,12 +345,12 @@ get_btoi_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_out
.size = 16,
};
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key,
sizeof(key), layout_out);
if (result != VK_SUCCESS)
return result;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -364,8 +373,8 @@ get_btoi_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_out
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -428,7 +437,7 @@ build_nir_itoi_compute_shader(struct radv_device *dev, bool src_3d, bool dst_3d,
static VkResult
get_itoi_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-itoi";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_COPY_IMAGE;
const VkDescriptorSetLayoutBinding bindings[] = {
{
@@ -457,10 +466,17 @@ get_itoi_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_ou
.size = 24,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_copy_image_key {
enum radv_meta_object_key_type type;
bool src_3d;
bool dst_3d;
uint8_t samples_log2;
};
static VkResult
get_itoi_pipeline(struct radv_device *device, const struct radv_image *src_image, const struct radv_image *dst_image,
int samples, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
@@ -469,15 +485,19 @@ get_itoi_pipeline(struct radv_device *device, const struct radv_image *src_image
const bool dst_3d = dst_image->vk.image_type == VK_IMAGE_TYPE_3D;
const uint32_t samples_log2 = ffs(samples) - 1;
VkResult result;
char key_data[64];
struct radv_copy_image_key key;
result = get_itoi_pipeline_layout(device, layout_out);
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-itoi-%d-%d-%d", src_3d, dst_3d, samples_log2);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_COPY_IMAGE;
key.src_3d = src_3d;
key.dst_3d = dst_3d;
key.samples_log2 = samples_log2;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -500,8 +520,8 @@ get_itoi_pipeline(struct radv_device *device, const struct radv_image *src_image
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -560,7 +580,7 @@ build_nir_itoi_r32g32b32_compute_shader(struct radv_device *dev)
static VkResult
get_itoi_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-itoi-r32g32b32";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_COPY_IMAGE_R32G32B32;
VkResult result;
const VkDescriptorSetLayoutBinding bindings[] = {
@@ -590,12 +610,12 @@ get_itoi_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_out
.size = 24,
};
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key,
sizeof(key), layout_out);
if (result != VK_SUCCESS)
return result;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -618,8 +638,8 @@ get_itoi_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_out
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -665,7 +685,7 @@ build_nir_cleari_compute_shader(struct radv_device *dev, bool is_3d, int samples
static VkResult
get_cleari_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-cleari";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_CLEAR_CS;
const VkDescriptorSetLayoutBinding binding = {
.binding = 0,
@@ -686,10 +706,16 @@ get_cleari_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_
.size = 20,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_clear_key {
enum radv_meta_object_key_type type;
bool is_3d;
uint8_t samples_log2;
};
static VkResult
get_cleari_pipeline(struct radv_device *device, const struct radv_image *image, VkPipeline *pipeline_out,
VkPipelineLayout *layout_out)
@@ -697,16 +723,19 @@ get_cleari_pipeline(struct radv_device *device, const struct radv_image *image,
const bool is_3d = image->vk.image_type == VK_IMAGE_TYPE_3D;
const uint32_t samples = image->vk.samples;
const uint32_t samples_log2 = ffs(samples) - 1;
char key_data[64];
struct radv_clear_key key;
VkResult result;
result = get_cleari_pipeline_layout(device, layout_out);
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-cleari-%d-%d", is_3d, samples_log2);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_CLEAR_CS;
key.is_3d = is_3d;
key.samples_log2 = samples_log2;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -729,8 +758,8 @@ get_cleari_pipeline(struct radv_device *device, const struct radv_image *image,
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -774,7 +803,7 @@ build_nir_cleari_r32g32b32_compute_shader(struct radv_device *dev)
static VkResult
get_cleari_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-cleari-r32g32b32";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_CLEAR_CS_R32G32B32;
VkResult result;
const VkDescriptorSetLayoutBinding binding = {
@@ -796,12 +825,12 @@ get_cleari_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_o
.size = 16,
};
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key,
sizeof(key), layout_out);
if (result != VK_SUCCESS)
return result;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -824,8 +853,8 @@ get_cleari_r32g32b32_pipeline(struct radv_device *device, VkPipeline *pipeline_o
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -1344,12 +1373,21 @@ radv_meta_image_to_image_cs(struct radv_cmd_buffer *cmd_buffer, struct radv_meta
if (vk_format_is_color(src->image->vk.format) && vk_format_is_depth_or_stencil(dst->image->vk.format)) {
assert(src->aspect_mask == VK_IMAGE_ASPECT_COLOR_BIT);
src_aspect_mask = src->aspect_mask;
} else if (vk_format_is_depth_or_stencil(src->image->vk.format) && vk_format_is_color(dst->image->vk.format)) {
if (src->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
depth_format = vk_format_stencil_only(src->image->vk.format);
} else {
assert(src->aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT);
depth_format = vk_format_depth_only(src->image->vk.format);
}
}
create_iview(cmd_buffer, src, &src_view,
(src_aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT)) ? depth_format : 0,
src_aspect_mask);
create_iview(cmd_buffer, dst, &dst_view, depth_format, dst_aspect_mask);
create_iview(cmd_buffer, dst, &dst_view,
dst_aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT) ? depth_format : 0,
dst_aspect_mask);
radv_meta_push_descriptor_set(
cmd_buffer, VK_PIPELINE_BIND_POINT_COMPUTE, layout, 0, 2,

@@ -57,32 +57,43 @@ build_color_shaders(struct radv_device *dev, struct nir_shader **out_vs, struct
static VkResult
get_color_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-clear-color";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_CLEAR_COLOR;
const VkPushConstantRange pc_range = {
.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT,
.size = 16,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_clear_color_key {
enum radv_meta_object_key_type type;
uint8_t samples;
uint8_t frag_output;
uint32_t fs_key;
};
static VkResult
get_color_pipeline(struct radv_device *device, uint32_t samples, uint32_t frag_output, VkFormat format,
VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const uint32_t fs_key = radv_format_meta_fs_key(device, format);
char key_data[64];
struct radv_clear_color_key key;
VkResult result;
result = get_color_pipeline_layout(device, layout_out);
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-clear-color-%d-%d-%d", samples, frag_output, fs_key);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_CLEAR_COLOR;
key.samples = samples;
key.frag_output = frag_output;
key.fs_key = fs_key;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -198,7 +209,7 @@ get_color_pipeline(struct radv_device *device, uint32_t samples, uint32_t frag_o
render.color_attachment_formats[i] = format;
result = vk_meta_create_graphics_pipeline(&device->vk, &device->meta_state.device, &pipeline_create_info, &render,
key_data, strlen(key_data), pipeline_out);
&key, sizeof(key), pipeline_out);
ralloc_free(vs_module);
ralloc_free(fs_module);
@@ -317,37 +328,57 @@ static bool radv_can_fast_clear_depth(struct radv_cmd_buffer *cmd_buffer, const
const VkClearRect *clear_rect, const VkClearDepthStencilValue clear_value,
uint32_t view_mask);
struct radv_clear_ds_layout_key {
enum radv_meta_object_key_type type;
bool unrestricted;
};
static VkResult
get_depth_stencil_pipeline_layout(struct radv_device *device, bool unrestricted, VkPipelineLayout *layout_out)
{
char key_data[64];
struct radv_clear_ds_layout_key key;
snprintf(key_data, sizeof(key_data), "radv-clear-ds-%d", unrestricted);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_CLEAR_DS;
key.unrestricted = unrestricted;
const VkPushConstantRange pc_range = {
.stageFlags = unrestricted ? VK_SHADER_STAGE_FRAGMENT_BIT : VK_SHADER_STAGE_VERTEX_BIT,
.size = 4,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, NULL, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_clear_ds_key {
enum radv_meta_object_key_type type;
VkImageAspectFlags aspects;
uint8_t samples;
bool fast;
bool unrestricted;
};
static VkResult
get_depth_stencil_pipeline(struct radv_device *device, int samples, VkImageAspectFlags aspects, bool fast,
VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const bool unrestricted = device->vk.enabled_extensions.EXT_depth_range_unrestricted;
char key_data[64];
struct radv_clear_ds_key key;
VkResult result;
result = get_depth_stencil_pipeline_layout(device, unrestricted, layout_out);
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-clear-ds-%d-%d-%d-%d", aspects, samples, fast, unrestricted);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_CLEAR_DS;
key.aspects = aspects;
key.samples = samples;
key.fast = fast;
key.unrestricted = unrestricted;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -475,7 +506,7 @@ get_depth_stencil_pipeline(struct radv_device *device, int samples, VkImageAspec
};
result = vk_meta_create_graphics_pipeline(&device->vk, &device->meta_state.device, &pipeline_create_info, &render,
key_data, strlen(key_data), pipeline_out);
&key, sizeof(key), pipeline_out);
ralloc_free(vs_module);
ralloc_free(fs_module);
@@ -584,7 +615,7 @@ build_clear_htile_mask_shader(struct radv_device *dev)
static VkResult
get_clear_htile_mask_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-clear-htile-mask";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_CLEAR_HTILE;
VkResult result;
const VkDescriptorSetLayoutBinding binding = {
@@ -606,12 +637,12 @@ get_clear_htile_mask_pipeline(struct radv_device *device, VkPipeline *pipeline_o
.size = 8,
};
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
result = vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key,
sizeof(key), layout_out);
if (result != VK_SUCCESS)
return result;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -634,8 +665,8 @@ get_clear_htile_mask_pipeline(struct radv_device *device, VkPipeline *pipeline_o
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -1033,7 +1064,7 @@ radv_clear_dcc(struct radv_cmd_buffer *cmd_buffer, struct radv_image *image, con
static VkResult
get_clear_dcc_comp_to_single_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout_out)
{
const char *key_data = "radv-clear-dcc-comp-to-single";
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_CLEAR_DCC_COMP_TO_SINGLE;
const VkDescriptorSetLayoutBinding binding = {
.binding = 0,
@@ -1054,24 +1085,31 @@ get_clear_dcc_comp_to_single_pipeline_layout(struct radv_device *device, VkPipel
.size = 24,
};
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, key_data,
strlen(key_data), layout_out);
return vk_meta_get_pipeline_layout(&device->vk, &device->meta_state.device, &desc_info, &pc_range, &key, sizeof(key),
layout_out);
}
struct radv_clear_dcc_comp_to_single_key {
enum radv_meta_object_key_type type;
bool is_msaa;
};
static VkResult
get_clear_dcc_comp_to_single_pipeline(struct radv_device *device, bool is_msaa, VkPipeline *pipeline_out,
VkPipelineLayout *layout_out)
{
char key_data[64];
struct radv_clear_dcc_comp_to_single_key key;
VkResult result;
result = get_clear_dcc_comp_to_single_pipeline_layout(device, layout_out);
if (result != VK_SUCCESS)
return result;
snprintf(key_data, sizeof(key_data), "radv-clear-dcc-comp-to-single-%d", is_msaa);
memset(&key, 0, sizeof(key));
key.type = RADV_META_OBJECT_KEY_CLEAR_DCC_COMP_TO_SINGLE;
key.is_msaa = is_msaa;
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, key_data, strlen(key_data));
VkPipeline pipeline_from_cache = vk_meta_lookup_pipeline(&device->meta_state.device, &key, sizeof(key));
if (pipeline_from_cache != VK_NULL_HANDLE) {
*pipeline_out = pipeline_from_cache;
return VK_SUCCESS;
@@ -1094,8 +1132,8 @@ get_clear_dcc_comp_to_single_pipeline(struct radv_device *device, bool is_msaa,
.layout = *layout_out,
};
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, key_data,
strlen(key_data), pipeline_out);
result = vk_meta_create_compute_pipeline(&device->vk, &device->meta_state.device, &pipeline_info, &key, sizeof(key),
pipeline_out);
ralloc_free(cs);
return result;
@@ -1312,7 +1350,7 @@ gfx8_get_fast_clear_parameters(struct radv_device *device, const struct radv_ima
*can_avoid_fast_clear_elim = false;
}
const struct util_format_description *desc = vk_format_description(iview->vk.format);
const struct util_format_description *desc = radv_format_description(iview->vk.format);
if (iview->vk.format == VK_FORMAT_B10G11R11_UFLOAT_PACK32 || iview->vk.format == VK_FORMAT_R5G6B5_UNORM_PACK16 ||
iview->vk.format == VK_FORMAT_B5G6R5_UNORM_PACK16)
extra_channel = -1;
@@ -1392,7 +1430,7 @@ static bool
gfx11_get_fast_clear_parameters(struct radv_device *device, const struct radv_image_view *iview,
const VkClearColorValue *clear_value, uint32_t *reset_value)
{
const struct util_format_description *desc = vk_format_description(iview->vk.format);
const struct util_format_description *desc = radv_format_description(iview->vk.format);
unsigned start_bit = UINT_MAX;
unsigned end_bit = 0;

Some files were not shown because too many files have changed in this diff.