opt_copy_propagation() can sometimes propagate FIXED_GRF sources into
SHADER_OPCODE_SENDs as the message payload. For example, GS input
reads, which simply take a URB handle and have the offset in the
descriptor. For non-VGRFs, there isn't a payload to split, so just
skip past such send messages.
Fixes: 589b03d02f ("intel/fs: Opportunistically split SEND message payloads")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28067>
(cherry picked from commit ba11127944)
External images translate to 2D images in ntv, so we will have to emit
OpImageQuerySizeLod instead of OpImageQuerySize (thanks Faith for
pointing that out). This quells
VUID-VkShaderModuleCreateInfo-pCode-08737
Image must have either 'MS'=1 or 'Sampled'=0 or 'Sampled'=2
%32 = OpImageQuerySize %v2int %31
triggered by piglit
spec@oes_egl_image_external_essl3@oes_egl_image_external_essl3
on Zink.
Fixes: 3f783a3c50 ("zink: omit Lod image operand in ntv when not using an image texture dim")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28389>
(cherry picked from commit b6c1390354)
These are both handled by inserting them directly at the top of the
nir_function_impl. However, if the cursor is already at the top, it
never gets updated so we end up inserting other stuff after the newly
inserted undef or decl_reg. It's an odd edge case to be sure but I hit
it with my new NIR CF pass for NAK.
Fixes: 1be4c61c95 ("nir/builder: Add a helper for creating undefs")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>
(cherry picked from commit a782809f81)
When shaders might read metadata (DCC) this must be flushed.
VK_ACCESS_2_MEMORY_READ_BIT includes all READ bits that are relevant.
I think this issue has been uncovered since vkd3d-proton d1425ee4
("vkd3d: Use VK_ACCESS_MEMORY_{READ,WRITE}_BIT where appropriate")
because RADV used to be missing VK_ACCESS_2_MEMORY_{READ,WRITE} in the
past and vkd3d-proton added a special workaround that has been removed.
This fixes some DCC corruption in WWE 2K24.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10774
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28332>
(cherry picked from commit 585b4c5a01)
in a sequence like:
* resize A
* clear
* resize B
* clear
* resize C
* clear
for a swapchain resource, the geometry for a given op after the resize
may desync for the op with which it was executed, but this is fine
since the underlying swapchain object will have to be re-created anyway
fixes #10827
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28214>
(cherry picked from commit ee13512a62)
For instance, this issue is triggered with
"piglit/bin/glslparsertest tests/spec/arb_bindless_texture/compiler/images/arith-bound-image.frag pass 3.30 GL_ARB_bindless_texture GL_ARB_shader_image_load_store":
Direct leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe9a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7f84ba7e0801 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4391
#2 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#3 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#4 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#5 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#6 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#7 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#8 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Direct leak of 136 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbff57 in operator new(unsigned long) (/usr/lib64/libasan.so.6+0xb2f57)
#1 0x7f84b1a5f749 in LLVMCreateBuilderInContext (/usr/local/lib64/libLLVM-17.so+0xc84749)
#2 0x7f84ba7817b0 in ac_llvm_context_init ../src/amd/llvm/ac_llvm_build.c:54
#3 0x7f84ba542b7a in si_llvm_context_init ../src/gallium/drivers/radeonsi/si_shader_llvm.c:120
#4 0x7f84ba542b7a in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:832
#5 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#6 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#7 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#8 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#9 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b9fee in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f84b81b9fee in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f84b81b05c7 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f84b81b05c7 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f84ba7e06ae in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4381
#7 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#8 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#9 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#10 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#11 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#12 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#13 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b9fee in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f84b81b9fee in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f84b81b05c7 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f84b81b05c7 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f84ba7e06e4 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4382
#7 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#8 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#9 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#10 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#11 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#12 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#13 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b046c in _mesa_hash_table_create ../src/util/hash_table.c:182
#3 0x7f84ba7e06e4 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4382
#4 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#5 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#6 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#7 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#8 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#9 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#10 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b046c in _mesa_hash_table_create ../src/util/hash_table.c:182
#3 0x7f84ba7e06ae in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4381
#4 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#5 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#6 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#7 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#8 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#9 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#10 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
SUMMARY: AddressSanitizer: 920 byte(s) leaked in 6 allocation(s).
Fixes: d92d35c9db ("ac/llvm: add a return value to ac_nir_translate")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28099>
(cherry picked from commit 0fd907fc7b)
The vma_samplers vma heap is initialized unconditionally. Don't use
device->physical->indirect_descriptors as a condition on whether to
free it or not.
From my TGL machine:
==373617== 32 bytes in 1 blocks are definitely lost in loss record 1 of 1
==373617== at 0x48459F3: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==373617== by 0x6926DC0: util_vma_heap_free (vma.c:339)
==373617== by 0x6925ED3: util_vma_heap_init (vma.c:53)
==373617== by 0x5334EDA: anv_CreateDevice (anv_device.c:3404)
==373617== by 0x685593A: vk_tramp_CreateDevice (vk_dispatch_trampolines.c:78)
==373617== by 0x48A6D56: terminator_CreateDevice (loader.c:5833)
==373617== by 0x9C2293F: vulkan_layer_chassis::CreateDevice(VkPhysicalDevice_T*, VkDeviceCreateInfo const*, VkAllocationCallbacks const*, VkDevice_T**) (chassis.cpp:497)
==373617== by 0x48B0690: loader_create_device_chain (loader.c:4937)
==373617== by 0x48B1327: loader_layer_create_device (loader.c:4317)
==373617== by 0x48B8D79: vkCreateDevice (trampoline.c:1004)
==373617== by 0x10CC7A: MyApp::MyApp(int, bool) (sparse.cpp:608)
==373617== by 0x1201E8: main (sparse.cpp:6025)
Fixes: 7c76125db2 ("anv: use 2 different buffers for surfaces/samplers in descriptor sets")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28303>
(cherry picked from commit 6ec1e322f0)
The hand-rolled etnaviv conversion functions were able to handle
negative input values when converting to fixed point. By replacing
them with U_FIXED, all negative values are clamped to zero, which
breaks usages where negative inputs are valid, like lodbias.
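A minimal sketch of the difference, assuming simple truncating conversions.
The helpers to_ufixed() and to_sfixed() are hypothetical names used only for
illustration; they are not the etnaviv functions nor the U_FIXED macro itself,
they only mimic the clamping behavior described above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unsigned conversion: negative inputs clamp to zero,
 * which is the behavior described for U_FIXED above. */
static uint32_t
to_ufixed(float value, unsigned frac_bits)
{
   assert(frac_bits < 32);
   if (value <= 0.0f)
      return 0;
   return (uint32_t)(value * (float)(1u << frac_bits));
}

/* Hypothetical signed conversion: negative inputs keep their sign,
 * which is what a value like lodbias needs. */
static int32_t
to_sfixed(float value, unsigned frac_bits)
{
   assert(frac_bits < 31);
   return (int32_t)(value * (float)(1 << frac_bits));
}
```

With 8 fractional bits, a lodbias of -0.5 survives the signed conversion but
is lost entirely in the unsigned one.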
Fixes: 8bce68edf5 ("etnaviv: switch to U_FIXED(..) macro")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28224>
(cherry picked from commit 4b8981e471)
Fixes validation error:
VUID-VkShaderModuleCreateInfo-pCode-08737
AtomicFAddEXT: expected Pointer to point to a value of type Result
Type
%51 = OpAtomicFAddEXT %float %49 %uint_1 %uint_0 %50
when running
spec@nv_shader_atomic_float@execution@ssbo-atomicadd-float
Fixes: 9f6be8effb ("zink: store and use alu types for ntv defs")
v2: Fix commit message (Mike)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28243>
(cherry picked from commit 50a6c5d5fa)
Since we split misaligned attributes, we could overwrite one of these
VGPRs in the middle of loading the attribute.
For example:
v_add_u32_e32 v4, vcc, s7, v1
s_waitcnt lgkmcnt(0)
buffer_load_dword v4, v4, s[32:35], 0 idxen
buffer_load_dword v5, v4, s[32:35], 0 idxen offset:4
can overwrite the vertex index in the load of the first component.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27920>
(cherry picked from commit ec892c4d2b)
The stride given in the shader is in number of elements of the
type pointed to by the given pointer, which may not match the matrix's own
element type.
Since we cast the pointer to match the element type, the stride needs to
be adjusted accordingly.
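As a rough sketch of that adjustment, assuming both sizes are known in bytes:
the helper rescale_stride() and its parameter names are hypothetical and not
taken from the lowering pass.

```c
#include <assert.h>

/* Convert a stride counted in elements of the pointed-to type into a
 * stride counted in elements of the matrix element type.
 * Assumption: the byte stride is a multiple of the element size. */
static unsigned
rescale_stride(unsigned stride, unsigned pointee_bytes, unsigned elem_bytes)
{
   assert(pointee_bytes != 0 && elem_bytes != 0);
   unsigned stride_bytes = stride * pointee_bytes;
   assert(stride_bytes % elem_bytes == 0);
   return stride_bytes / elem_bytes;
}
```

For example, a stride of 2 given through a pointer to a 16-byte vector becomes
a stride of 16 once the pointer is cast to a 2-byte element type.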
v2:
- Fix mismatching bit-width in matrix element type and pointer type (Caio)
- Do the stride calculation in one place
Fixes dEQP-VK.compute.pipeline.cooperative_matrix.khr_*.multicomponent.*
Fixes: 3a35f8b29b ("intel/cmat: Lower cmat_load and cmat_store")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10820
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27903>
(cherry picked from commit 446f652cde)
The `stride` and `offset` attributes are meaningful for the "virtual"
register files (VGRFs, UNIFORMs and ATTRs). Accumulator is an ARF so
validation should check `hstride` (part of the <V,W,H> triple) and `subnr`
instead.
Fixes: 12d7aaf2b8 ("intel/compiler: add more validation for acc register usage")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28059>
(cherry picked from commit e324fbbe68)
This ensures the region triple <V,W,H> is set correctly; in this case the
desired region is a sequential one like <8,8,1>. Without the helper the
region we get is <0,1,0> -- which the generator currently partially
adjusts when emitting code, but that is not sufficient when doing validation
earlier.
The generated code is slightly modified. From crucible test
func.shader.subtractSaturate.uint in the fragment shader for SIMD8, the
diff looks like
```
mov(8) acc0<1>UD g21<8,8,1>UD { align1 1Q $0.dst };
-add.sat(8) g22<1>UD -acc0<0,1,0>UD g16<8,8,1>UD { align1 1Q @1 $0.dst };
+add.sat(8) g22<1>UD -acc0<8,8,1>UD g16<8,8,1>UD { align1 1Q @1 $0.dst };
```
Note that without the patch the generator adjusted the hstride for acc0 used
as destination (see brw_set_dest), but kept the src region as is. For
the source, it is not clear to me why the <0,1,0> would work correctly
here since it is a scalar, but using <8,8,1> is correct.
Fixes: 58907568ec ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28059>
(cherry picked from commit db8022dc4d)
It was correct for the parameters that the driver was using, but incorrect
for other parameters.
1. The address computation must multiply the workgroup size (wave size)
by num_mem_ops to fix the case when num_dwords_per_thread > 4.
2. nir_load_ssbo shouldn't set the number of components to 4 when
num_dwords_per_thread < 4.
Fixes: 6584088cd5 - radeonsi: "create_dma_compute" shader in nir
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28119>
(cherry picked from commit e99765df08)
this should yield more consistent results and avoid weird cases where
various formats are queried for things they don't support and won't use
Fixes: 9a412c10b7 ("zink: set all usage flags when querying sparse features")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28115>
(cherry picked from commit 8fa413fef0)
only certain formats are required to have the storage bit, so be more
tolerant of failure in the case where drivers actually check flags
and reject storage usage when it's actually unsupported
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28115>
(cherry picked from commit 61e5b6ad9d)
the existing guesswork during format selection for teximage is
accurate most of the time, but it's not accurate all of the time.
GL/ES each have a set of sized formats that are required to be
color renderable, and so any time one of these is allocated as a
texture, it MUST have the rendertarget usage bit attached so that
it can later be bound as a framebuffer attachment
an alternative might be to relax this and then try to do migration
to a different format/buffer later if necessary, but that's hard and
probably not actually as useful
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28055>
(cherry picked from commit 0f66589c2a)
This fixes an issue hit by one of darktable's kernels, where the sampler
argument got assigned the location of a dead kernel parameter turning it
into a zombie and leading us to trash the kernel input buffer's layout.
Fixes: 25b8a34b48 ("rusticl/kernel: inline samplers")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28121>
(cherry picked from commit 2df640c4f6)
If you look at the sampler message header on Gfx9+, you'll see that we
mostly only use 2 dwords (dw2 & dw3). DW2 has a bunch of sampler
parameters, DW3 is the sampler handle.
On Gfx9 we can micro optimize by copying r0 into the header because
the HW mostly doesn't care about other DWs. We just have to clear dw2
on non VS/FS stages.
On Gfx11+, we always have to do a careful copy of the r0.3 bits to
mask out the bottom unrelated bits. So there, just clearing the entire
header makes more sense.
On Xe2+, the dw4 of the header references the sampler feedback surface
handle and bit0 is a boolean to know whether to use that surface or
not. So it *REALLY* matters to have that as 0. If we copy r0, we'll
get random bits in dw4, which can end up enabling that surface.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28082>
(cherry picked from commit 75c6ad9907)
With RADV, when VS/TES and FS are compiled separately, the PrimitiveId
is exported unconditionally because it's not possible to know if the
FS reads it or not. This happens with fast-link GPL and shader object.
Though, the PrimitiveID should be ignored when it's implicitly exported
because otherwise the stream output LDS offset is incorrect.
This fixes a bunch of failures with transform feedback and Zink/RADV
when shader object is enabled on RDNA3.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27981>
(cherry picked from commit d12984edb8)
The algorithm used to render smooth lines worked under the assumption
that line coords were in the [0, 1] range. This was correct when using
an orthogonal projection, but not when using a perspective projection.
With a perspective projection (where the value for 1/Wc set in the VPM
is not 1.0), line coords values are also affected by this projection, so
the values are not in this range.
To deal with this, we normalize the line coords using the Wc value so
the range becomes [0, 1], and the smooth line rendering works as
expected.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10496
Fixes: ee4d51f8b2 ("v3d: Add a lowering pass for line smoothing")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28072>
(cherry picked from commit 69fbd5cb90)
The previous version had an optimization where, instead of actually
waiting on the FALCON to return, it would just do a bunch of nops in
some cases. This seems broken at least on Turing+ and results in
registers not ending up with the right values. It only really shows up
when you set two registers back-to-back in which case the second
SET_PRIV_REG may mess up the first.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27927>
(cherry picked from commit 0ed7bce8e5)
The driver is written with the assumption that we support at most
ETNA_NUM_VARYINGS, and reporting a bigger number will cause trouble. I had
a quick look at galcore's hw database and there are entries that report a
higher value.
So I think what we want is the minimum of what the kernel driver reports
and what the gallium driver is able to handle.
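The clamp amounts to something like the following sketch. The constant
HYPOTHETICAL_MAX_VARYINGS is a made-up stand-in for ETNA_NUM_VARYINGS (its
value here is an assumption), and clamp_num_varyings() is not a real driver
function:

```c
#include <assert.h>

/* Made-up stand-in for ETNA_NUM_VARYINGS; the real value may differ. */
#define HYPOTHETICAL_MAX_VARYINGS 16u

/* Report the smaller of what the kernel claims and what the
 * userspace driver can actually handle. */
static unsigned
clamp_num_varyings(unsigned kernel_reported)
{
   unsigned n = kernel_reported < HYPOTHETICAL_MAX_VARYINGS
                   ? kernel_reported
                   : HYPOTHETICAL_MAX_VARYINGS;
   assert(n <= HYPOTHETICAL_MAX_VARYINGS);
   return n;
}
```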
Fixes: 84816c22e4 ("etnaviv: ask kernel for max number of supported varyings")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27923>
(cherry picked from commit 93255abe30)
As per spec, any colors, or color components, associated with a fragment
that are not written by the fragment shader are undefined.
So we might as well just write vec4(1.0) to output, since HW doesn't allow
us to have an empty FS.
Backport-to: 23.3
Backport-to: 24.0
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24855>
(cherry picked from commit 6998c48f77)
the ES spec imposes additional requirements for copy commands,
specifically that the formats have matching component sizes
the existing check used the driver's internal formats to check
for a match, which is broken since the spec requires the match be
between the passed internalFormat and the buffer's effective internal
format (i.e., this has no relation to what the driver supports)
fixes KHR-GLES3.copy_tex_image_conversions.forbidden* on a bunch of drivers
cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28030>
(cherry picked from commit 2cd192f879)
Batches must be ignored if batch count is zero, so all batch inspections
have to be gated behind batch count. For memcpy, it's UB if either src
or dst is NULL even when size is zero.
Side note:
- For the original commit, this fixes just the memcpy UB
- For the current code, this also fixes the ffb batch prepare being skipped
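The zero-count guard can be sketched as below. struct batch and
copy_batches() are hypothetical and only illustrate the pattern; they are not
the venus submission code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical batch type, for illustration only. */
struct batch {
   int id;
};

/* Copy `count` batches and return how many were copied. With a zero
 * count neither pointer is inspected, so both may be NULL. */
static size_t
copy_batches(struct batch *dst, const struct batch *src, size_t count)
{
   if (count == 0)
      return 0;

   /* Only past this point is it valid to look at the batches. */
   assert(dst != NULL && src != NULL);
   memcpy(dst, src, count * sizeof(*src));
   return count;
}
```

Checking the count first keeps memcpy() from ever seeing a NULL pointer, which
is undefined behavior even when the size is zero.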
Fixes: 493a3b5cda ("venus: refactor batch submission fixup")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28071>
(cherry picked from commit 8af267eb00)
This fixes the validation error
VUID-VkShaderModuleCreateInfo-pCode-08737
triggered by piglit:
spec@arb_gpu_shader5@execution@built-in-functions@fs-interpolateatsample-block-array:
GLSL.std.450 InterpolateAtSample: expected Sample to be 32-bit integer
%47 = OpExtInst %float %1 InterpolateAtSample %45 %float_0
Fixes: 9f6be8effb ("zink: store and use alu types for ntv defs")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28043>
(cherry picked from commit b7d6d90dab)
We had a "Don't read out-of-bounds" sanity check for creating an alpha
when ATEST was needed, but that check happened only after we already
did a bi_extract(), which meant that the bi_extract could get into
trouble and assert() when there weren't enough components. Fixed by
re-arranging the calculation.
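The re-arranged order looks roughly like this sketch. extract() is a
hypothetical stand-in for a helper that asserts on an out-of-range component,
and the 1.0 fallback is an assumption made for the example, not necessarily
what the compiler emits:

```c
#include <assert.h>

/* Hypothetical stand-in for an extract helper that asserts when the
 * requested component does not exist. */
static float
extract(const float *vec, unsigned num_components, unsigned idx)
{
   assert(idx < num_components);
   return vec[idx];
}

/* Check the component count first, and only then extract alpha. */
static float
get_alpha(const float *vec, unsigned num_components)
{
   if (num_components < 4)
      return 1.0f; /* assumed default when there is no alpha */
   return extract(vec, num_components, 3);
}
```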
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28045>
(cherry picked from commit 0e1862a2ab)
The code path for emitting tessellation commands when the TES needed
scratch space was failing to emit 3DSTATE_TE, and instead only emitting
3DSTATE_DS. This meant that you could get HS and DS enabled with
tessellation itself turned off, which is utter nonsense and would
cause a GPU hang.
Alchemist and later take a different path and don't hit this bug,
but all earlier hardware would hit it. Discovered while working on
compiler changes that caused a single piglit test to spill minorly,
and thus break entirely.
Fixes: 4256f7ed58 ("iris: Fill out scratch base address dynamically")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28032>
(cherry picked from commit 9e5fd49cbe)
If the app requests a swizzle on the shadow sampler which doesn't just
return the red channel or literal 0s/1s, we'll crash attempting to build
the result vector. Use something that's probably valid.
Cc: mesa-stable
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28001>
(cherry picked from commit cda6877cb6)
We precompile static state and count it as dynamic, so we have to
manually clear the bitset that tells which dynamic state is set, in order
to make sure that future dynamic state will be emitted. The issue is that
the framework remembers only the past REAL dynamic state and compares new
dynamic state against it, and not against our static state masquerading
as dynamic.
Example:
- Set dynamic state S with value A
- Bind pipeline with dynamic state S
- Draw
- Bind pipeline with static state S with value B
- Draw
- Set dynamic state S with value A
- Bind pipeline with dynamic state S
- Draw
Previously, at the last draw the dynamic state S was not dirty and
current dynamic state was equal to the past dynamic state, so
it was not emitted, while GPU used value B from static pipeline.
This fix, at the point of static pipeline binding, clears the
bitset which tells that dynamic state S was previously set.
This forces the next dynamic state to be re-emitted.
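The sequence above can be modelled with a small sketch. struct state_tracker,
set_dynamic() and bind_static() are hypothetical and far simpler than the
common Vulkan dynamic state code; they only show why clearing the "set" bit
forces a re-emit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker for up to 32 states. */
struct state_tracker {
   uint32_t set_mask; /* states with a remembered dynamic value */
   int recorded[32];  /* last remembered dynamic value per state */
   int hw[32];        /* value currently programmed on the GPU */
};

/* Framework-style setter: skips the emit when the value matches the
 * remembered dynamic value. Returns true if it emitted. */
static bool
set_dynamic(struct state_tracker *t, unsigned s, int value)
{
   assert(s < 32);
   if ((t->set_mask & (1u << s)) && t->recorded[s] == value)
      return false;
   t->set_mask |= 1u << s;
   t->recorded[s] = value;
   t->hw[s] = value;
   return true;
}

/* Binding a pipeline with static state programs the value directly.
 * With clear_set_bit the remembered dynamic value is forgotten. */
static void
bind_static(struct state_tracker *t, unsigned s, int value, bool clear_set_bit)
{
   assert(s < 32);
   t->hw[s] = value;
   if (clear_set_bit)
      t->set_mask &= ~(1u << s);
}
```

Without clearing the bit, setting value A again after the static bind is
treated as redundant and the GPU keeps value B.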
Fixes broken rendering in Arma 3, and probably some other
games running through DXVK.
Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27961>
(cherry picked from commit a76fcebfc0)
There are multiple problems currently:
- blorp blitter commands overwrite the protection value coming from
the driver
- anv & iris are using render target MOCS for compute commands
The driver already has the ability to pass the MOCS values so we choose
to stick to that in this change. But now the driver needs to select the
right MOCS depending on the engine the commands are going to run on.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27956>
(cherry picked from commit 194afe8416)
ZINK_BIND_RESOURCE_DESCRIPTOR and ZINK_BIND_SAMPLER_DESCRIPTOR are
always used together, so that we can replace these two values with
ZINK_BIND_DESCRIPTOR and use only one bit to represent the value.
With that we can also remove the aliasing of ZINK_BIND_DESCRIPTOR with
PIPE_BIND_CONST_BW.
Fixes: 13c6ad0038 ("zink: use a single descriptor buffer for all non-bindless types")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28016>
(cherry picked from commit 8e239dda41)
technically this needs to be MUCH higher since there's no limitation
on the number of semaphore waits that can be submitted, but this is
enough to handle zink usage
fixes KHR-GL46.sparse_buffer_tests.BufferStorageTest
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27992>
(cherry picked from commit 9a53e3b1fd)
I started seeing
ACO ERROR:
In file ../src/amd/compiler/aco_validate.cpp:98
Operand and Definition types do not match: s1: %44 = p_parallelcopy %158
test_basic: ../src/amd/compiler/aco_interface.cpp:85: void validate(aco::Program*):
Assertion `is_valid' failed.
since commit 52ee4cf229 ("nir/builder: Teach nir_pack_bits and
nir_unpack_bits about 32_4x8").
Fixes: e0d232c2fc ("aco: implement nir_op_pack_32_4x8")
Suggested-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27972>
(cherry picked from commit 3d4dfae7eb)
Indeed, main_shader_part_ngg_es was not freed.
For instance, this issue is triggered on a radeonsi/gfx10 gpu with
"piglit/bin/arb_gpu_shader5-tf-wrong-stream-value -auto -fbo":
Direct leak of 1464 byte(s) in 1 object(s) allocated from:
#0 0x7f17904b99a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7f1785d65ac2 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3132
#2 0x7f1783af67d8 in util_queue_thread_func ../src/util/u_queue.c:309
#3 0x7f1783b51dfa in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#4 0x7f178f69d38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 2024 byte(s) in 1 object(s) allocated from:
#0 0x7f17904b97ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f1785d5443a in read_chunk ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:221
#2 0x7f1785d62cf5 in si_load_shader_binary ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:293
#3 0x7f1785d65255 in si_shader_cache_load_shader ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:423
#4 0x7f1785d65ef9 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3169
#5 0x7f1783af67d8 in util_queue_thread_func ../src/util/u_queue.c:309
#6 0x7f1783b51dfa in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#7 0x7f178f69d38a (/lib64/libc.so.6+0x8438a)
Fixes: 8f72f137ad ("radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27958>
(cherry picked from commit f93f215898)
Commit 90e364edb0 contained a typo in the glsl_dvec4_type() helper,
instead returning a glsl_ivec4_type. As an ivec4 is 2x smaller than
a dvec4, this also broke piglit sanity on crocus/hsw.
This also fixes the dvec2 helper, though it has not been specifically
tested anywhere.
Fixes: 90e364edb0 ("compiler/types: Add a few more helpers to get builtin types")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27917>
(cherry picked from commit f9acfeeb59)
For instance, this issue is triggered with
"piglit/bin/object-namespace-pollution glBitmap program -auto -fbo":
Direct leak of 112 byte(s) in 7 object(s) allocated from:
#0 0x7f472540e7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f471a9ce18f in rc_remove_unused_constants ../src/gallium/drivers/r300/compiler/radeon_remove_constants.c:101
#2 0x7f471a9b0836 in rc_run_compiler_passes ../src/gallium/drivers/r300/compiler/radeon_compiler.c:476
#3 0x7f471a9b0ad5 in rc_run_compiler ../src/gallium/drivers/r300/compiler/radeon_compiler.c:498
#4 0x7f471a9ec862 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:172
#5 0x7f471a9e1ab2 in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#6 0x7f471a9e6303 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:591
#7 0x7f471a9544fe in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1073
#8 0x7f4718f2ebe5 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1070
#9 0x7f4718f374b5 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1116
#10 0x7f4718f38273 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1281
#11 0x7f4718f38273 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1345
#12 0x7f4718f389e9 in st_program_string_notify ../src/mesa/state_tracker/st_program.c:1378
#13 0x7f47199d9f99 in set_program_string ../src/mesa/main/arbprogram.c:413
Fixes: 1c2c4ddbd1 ("r300g: copy the compiler from r300c")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27957>
(cherry picked from commit 29df85788a)
Do not continue and call drmIoctl on an invalid file descriptor.
Fix defect reported by Coverity Scan.
Argument cannot be negative
The negative argument will be interpreted as a very large unsigned value.
CID: 1544377
Cc: mesa-stable
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27788>
(cherry picked from commit b6962bbfc8)
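A minimal sketch of the kind of guard the change above describes, not the actual Mesa code. The names checked_fd_op() and fake_ioctl() are hypothetical; fake_ioctl() only stands in for a call such as drmIoctl() so that the early return can be observed.

```c
#include <assert.h>
#include <errno.h>

/* Counts how often the (hypothetical) ioctl stand-in was reached. */
static int ioctl_calls;

/* Hypothetical stand-in for a call that takes a file descriptor. */
static int fake_ioctl(int fd)
{
   assert(fd >= 0); /* a negative fd must never get this far */
   ioctl_calls++;
   return 0;
}

/* Return early with -EBADF instead of passing an invalid fd down. */
static int checked_fd_op(int fd)
{
   if (fd < 0)
      return -EBADF;

   return fake_ioctl(fd);
}
```

With this shape, checked_fd_op(-1) reports an error without ever touching the ioctl path.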
When lower_simd_width() encounters an instruction that needs a larger
SIMD (for example, SHADER_OPCODE_TXS_LOGICAL in Gfx4 needs at least
SIMD16), the builder needs to be at least as large as max_width,
otherwise the group() setup will assert.
Turns out this did not assert before "by accident", since it was
relying on the default fs_visitor builder that had a dispatch width of 64,
a bogus placeholder value, expected not to be used.
However, when we changed the code to remove that builder (and the bogus
value), we created a new builder in the pass with the shader dispatch_width --
which works fine except in the case where we want to "lower" the SIMD above
the shader dispatch width. The fix is to also consider the already
calculated max_width when creating the builder.
Fixes: 5b8ec015f2 ("intel/compiler: Don't use fs_visitor::bld in remaining places")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10338
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27782>
(cherry picked from commit 337641cfcc)
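The width selection described above can be sketched as follows. This is an illustration under assumptions, not the code from the commit: lowering_builder_width() is a hypothetical helper, and only the "at least as large as max_width" rule comes from the message.

```c
#include <assert.h>

/* Hypothetical helper: width for the builder used while lowering one
 * instruction. It must cover max_width even when that exceeds the shader
 * dispatch width. */
static unsigned lowering_builder_width(unsigned dispatch_width,
                                       unsigned max_width)
{
   assert(dispatch_width > 0 && max_width > 0);
   return dispatch_width > max_width ? dispatch_width : max_width;
}
```

For a SIMD8 shader containing an instruction that needs SIMD16, this yields a builder width of 16 rather than 8.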
Xe KMD also checks whether the cpu_caching set during bo creation
matches the caching of the PAT index set in the VM unbind.
This went unnoticed until now by luck and lack of testing in MTL.
So always set the PAT index for all VM operations that have a bo
associated.
Fixes: eb18a92ef9 ("iris: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27893>
(cherry picked from commit 963c08b623)
Let's say you have an image in R32_UINT format, a view is created in
R32_SFLOAT and used as color attachment.
When resolving the attachment, our current code uses the image format
(R32_UINT in this case). But resolve mode might apply only to SFLOAT,
so we currently run into an assert in blorp.
We should instead use the view format. There is an exception for
depth/stencil view because the format we want to resolve is actually
the depth/stencil format, not just the depth or stencil aspect.
This fixes vkd3d-proton's test_multisample_resolve_formats.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27875>
(cherry picked from commit 5a7e58a430)
If monolithic shaders were inlined, there might not be a radv_shader
associated with some stages. Zero out the shader allocation info in that
case, the shader will get identified by hash instead.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27890>
(cherry picked from commit b588cb29a3)
The stw_device and its screen are set up independently. It's possible
to have a device without a screen if the DLL is loaded but never
called into, since DllMain for PROCESS_ATTACH sets up the stw_device,
but the screen is initialized later on the first call to get pixel
formats. If the DLL is loaded and then unloaded, don't crash.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27892>
(cherry picked from commit f96d31bc8a)
Found by inspection. Original code was returning the size instead of the
number of levels. This was probably an overzealous search-and-replace
when PIPE_CAP_MAX_TEXTURE_2D_LEVELS was changed to _SIZE.
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Fixes: 0c31fe9ee7 ("gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27800>
(cherry picked from commit 1b890825f6)
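To illustrate the difference between a maximum size and a number of levels, here is a small sketch. The helper name max_levels_from_size() is hypothetical and the commit may compute this differently; the relation used is levels = floor(log2(size)) + 1.

```c
#include <assert.h>

/* Hypothetical helper: number of mip levels for a 2D texture whose largest
 * dimension is max_size texels, i.e. floor(log2(max_size)) + 1. */
static unsigned max_levels_from_size(unsigned max_size)
{
   assert(max_size > 0);

   unsigned levels = 0;
   while (max_size) {
      levels++;
      max_size >>= 1;
   }
   return levels;
}
```

A maximum size of 16384 corresponds to 15 levels, so returning the size where a level count is expected is off by orders of magnitude.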
this is a bandaid fix that allows users (zink) to actually call the
functions intended to be called. the real fix would be to figure out
which extensions are enabled on the device and then only GPA the
functions associated with those extensions
that's too hard though so I'm slapping some flex tape on it
cc: mesa-stable
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27834>
(cherry picked from commit 5d91db9666)
This extension has been broken ever since the initial commit. It created
an XRGB DRIImage for the driver to render to, so whilst the presentation
was opaque, the buffer also completely lacked an alpha channel.
Fix it by making sure we only modify the FourCC we send to the Wayland
server when creating a buffer.
Closes: mesa/mesa#5886
Fixes: 9aee7855d2 ("egl: implement EGL_EXT_present_opaque on wayland")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27709>
(cherry picked from commit 9ea9a963aa)
When we bind a descriptor set with dynamic descriptors, we can't ignore
dynamic descriptors in previously-bound higher descriptor sets. For
example, assume we have descriptor sets A and B, each of which has one
dynamic storage buffer, and we do:
CmdBindDescriptorSets(firstSet=1, descriptorSetCount=1, A)
CmdBindDescriptorSets(firstSet=0, descriptorSetCount=1, B)
and in the first CmdBindDescriptorSets the pipeline layout includes a
descriptor set layout compatible with B in set 0. Then, following
"Pipeline Layout Compatibility," set 0 is disturbed:
When binding a descriptor set to set number N, a previously bound
descriptor set bound with lower index M than N is disturbed if the
pipeline layouts for set M and N are not compatible for set M.
Otherwise, the bound descriptor set in M is not disturbed
When it's disturbed, it's effectively turned into a set with 1 undefined
dynamic storage buffer:
When a descriptor set is disturbed by binding descriptor sets, the
disturbed set is considered to contain undefined descriptors bound
with the same pipeline layout as the disturbing descriptor set.
This disturbed set is compatible with B, so in the second
CmdBindDescriptorSets this clause doesn't apply:
If, additionally, the previously bound descriptor set for set N was
bound using a pipeline layout not compatible for set N, then all
bindings in sets numbered greater than N are disturbed.
and A remains valid to access. The code before 88db7364 worked only if
the pipeline layout when binding B contained a descriptor layout
compatible with A in set 1, because it used the pipeline layout's total
size when allocating the internal dynamic descriptors array, but that
isn't actually a requirement, so the previous code was already broken.
After 88db7364 we only allocate as much space as required by the current
descriptors being bound, because I misread the rules here, which made it
more broken and broke 3DMark Wildlife Extreme that does something like
this.
In order to properly fix this we need to keep track of the maximum ever
seen dynamic descriptor size, similar to what we already do for
descriptor sets, and use that. We have no idea what needs to be
preserved when binding a descriptor set with dynamic descriptors, so we
have to be conservative.
Fixes: 88db7364 ("tu: Rework dynamic offset handling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27750>
(cherry picked from commit db0291c235)
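The "maximum ever seen" tracking described above can be sketched roughly like this. It is a simplified model under assumptions: struct dyn_desc_state and dyn_desc_alloc_size() are hypothetical names, and only the size bookkeeping is modelled, not the descriptors themselves.

```c
#include <assert.h>

/* Hypothetical, simplified state: only the size of the internal dynamic
 * descriptor array is modelled here. */
struct dyn_desc_state {
   unsigned max_size_seen; /* largest size required by any bind so far */
};

/* Returns the size to allocate for this bind. It never shrinks below the
 * largest size seen earlier, so dynamic descriptors from previously bound
 * higher sets still fit and can be preserved. */
static unsigned dyn_desc_alloc_size(struct dyn_desc_state *s,
                                    unsigned required)
{
   assert(s);

   if (required > s->max_size_seen)
      s->max_size_seen = required;
   return s->max_size_seen;
}
```

Binding A in set 1 first and B in set 0 afterwards then keeps room for A's dynamic descriptor, even though the second bind on its own needs less space.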
Previously we were optimistic and tied this to a certain format, but the wa
description lists other formats and bspec clearly disallows the usage.
The issue can be seen with different 16bpp tests: the effect looks a bit like
a dithering pattern, but it is not; it is just rep16 failing.
Fixes:
GTF-GL46.gtf42.GL3Tests.texture_storage.texture_storage_texture_as_framebuffer_attachment
on DG2 and MTL, some 565 EGL tests on Android, and an internal issue with a
game that displays a dither-like pattern on the background while it's not
supposed to do that.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10646
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27794>
(cherry picked from commit 1a4f220c29)
For instance, this issue is triggered with
"piglit/bin/ext_framebuffer_multisample-accuracy all_samples color depthstencil -auto -fbo":
Direct leak of 1160 byte(s) in 1 object(s) allocated from:
#0 0x7fbe8897d7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7fbe7e7abfcc in rc_constants_copy ../src/gallium/drivers/r300/compiler/radeon_code.c:47
#2 0x7fbe7e7ec902 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:174
#3 0x7fbe7e7e1b22 in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#4 0x7fbe7e7e6373 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:591
#5 0x7fbe7e75456e in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1073
#6 0x7fbe7cd2ebe5 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1070
#7 0x7fbe7cd374b5 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1116
#8 0x7fbe7cd38273 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1281
#9 0x7fbe7cd38273 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1345
#10 0x7fbe7d798ca8 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:724
#11 0x7fbe7d798ca8 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:952
#12 0x7fbe7d6790d5 in link_program ../src/mesa/main/shaderapi.c:1336
#13 0x7fbe7d6790d5 in link_program_error ../src/mesa/main/shaderapi.c:1447
...
SUMMARY: AddressSanitizer: 2528456 byte(s) leaked in 1057 allocation(s).
Fixes: 54f6e72b27 ("r300: better register allocator for vertex shaders")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27792>
(cherry picked from commit 4d00edda00)
readback should trigger on the current backbuffer, not the most recently
presented buffer. if e.g., a clear is only triggered through glFlush,
this clear should be read back rather than the contents of the last-presented
buffer
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27553>
(cherry picked from commit d2ed77072c)
the previous code could recycle a currently-submitting state by hitting
a race condition where zink_screen_check_last_finished(batch_id) returned
true because batch_id was 0
this can no longer recycle the current batch, but the race should still be
eliminated for consistency: check 'submitted' since this guarantees batch_id
is valid
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27729>
(cherry picked from commit 3283415bbd)
The EXT_texture_format_BGRA8888 spec clearly defines GL_BGRA as a
color-renderable format, so we need to support it here as well.
This has been broken since the day support for the extension was added.
Oh well, let's fix it up!
Fixes: 1d595c7cd4 ("gles2: Add GL_EXT_texture_format_BGRA8888 support")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27720>
(cherry picked from commit 3b23e9d89d)
With TES, the primitive ID is an input variable but it's considered a
sysval by SPIRV->NIR. Though, its value is greater than
VARYING_SLOT_VAR0 which means its location was adjusted by mistake.
This fixes compiling a tessellation evaluation shader in debug build
with Enshrouded.
Fixes: dfbc03fa88 ("spirv: Fix locations for per-patch varyings")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27413>
(cherry picked from commit 78ea304a06)
We need to allocate "shared size" bytes for each workgroup but
we were incorrectly multiplying by the number of workgroups in
each supergroup instead, which would typically cause us to allocate
less memory than actually required.
The reason this issue was not visible until now is that the kernel
driver is using a large page alignment on all BO allocations and
this causes us to "waste" a lot of memory after each allocation.
Incidentally, this wasted memory ensured that out of bounds
accesses would not cause issues since they would typically land
in unused memory regions in between aligned allocations, however,
experimenting with reduced memory alignments raised the issue,
which manifested with the UE4 Shooter demo as a GPU hang caused
by corrupted state from out of bounds memory writes to CS
shared memory.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27675>
(cherry picked from commit 1880e7cfed)
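A rough sketch of the sizing rule stated above ("shared size" bytes for each workgroup). The helper cs_shared_alloc_size() and its parameter names are hypothetical, and which workgroup count the driver passes is an assumption here, not something taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total shared memory for a compute dispatch. Each
 * workgroup needs shared_size bytes, so the total scales with the number of
 * workgroups that need storage (assumed to be supplied by the caller), not
 * with the number of workgroups per supergroup. */
static uint64_t cs_shared_alloc_size(uint32_t shared_size,
                                     uint32_t num_workgroups)
{
   assert(num_workgroups > 0);
   return (uint64_t)shared_size * num_workgroups;
}
```

The multiplication is done in 64 bits so that large dispatches do not wrap around.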
Indeed, vertex_buffer was not properly freed.
For instance, this issue is triggered with:
"piglit/bin/fcc-read-after-clear blit rb -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 8a963d122d ("r300g/swtcl: don't do stuff which is only for HWTCL")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27678>
(cherry picked from commit 3b90c46bdf)
There are a couple mistakes here :
- using a bitfield as an index to generate a bitfield...
- in anv_nir_push_desc_ubo_fully_promoted(), confusing binding
table access of the descriptor buffer with actual descriptors
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ff91c5ca42 ("anv: add analysis for push descriptor uses and store it in shader cache")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27504>
(cherry picked from commit cf193af762)
according to spec, these should return NONE if the format is
not supported for a given texture target, but mesa was incorrectly
returning a hardcoded value for all cases without checking the driver
instead, check whether the driver can create a texture for a given
format to correctly handle this non-support case
cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27621>
(cherry picked from commit 893780b362)
Left-shifting by 11*8 or 14*8 is undefined. This fixes many
dEQP-VK.query_pool.statistics_query.* failures (but not pre-existing
flakes) for release builds using clang.
Fixes: 48aabaf225 ("radv: do not harcode the pipeline stats mask for query resolves")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27651>
(cherry picked from commit ec5d0ffb04)
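For context on the shift issue above: in C, shifting a 64-bit value left by 64 or more bits (such as 11*8 = 88 or 14*8 = 112) is undefined behaviour. A guarded shift can be sketched as below; shl64_guarded() is a hypothetical helper and not necessarily how the commit avoids the problem.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a 64-bit value left, returning 0 when the count is 64 or more
 * instead of invoking undefined behaviour. */
static uint64_t shl64_guarded(uint64_t value, unsigned count)
{
   return count < 64 ? value << count : 0;
}
```

Because the behaviour is undefined, an unguarded shift may appear to work in one build and fail in another, which matches the release-build-only failures mentioned above.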
this is the case where:
* a batch A is submitted
* a no-op flush occurs
* the frontend gets the fence from already-flushed batch A
* zink recycles batch A
* the frontend waits on fence A
fixes #10598
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27623>
(cherry picked from commit fb2ae7736f)
now that zink_gfx_lib_cache::stages_present exists (and is correct),
this value can be used directly to effect cache eviction instead of depending
on the prog->stages_present value, which may not even be the same prog that
owns a given zink_gfx_lib_cache instance
this fixes the case where a shader used in multiple progs with differing shader
masks would never have all its gpl pipelines freed
fixes leaks with caselist:
KHR-Single-GL46.arrays_of_arrays_gl.InteractionUniformBuffers1
KHR-Single-GL46.subgroups.quad.framebuffer.subgroupquadbroadcast_3_float_vertex
Fixes: d786f52f1f ("zink: prevent crash when freeing")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27358>
(cherry picked from commit e8ce53a33d)
some passes (e.g., opt_shrink_vector) operate on the assumption that
sparse tex ops have a certain number of components and then remove components
and unset the sparse flag if they can optimize out the sparse usage
zink's sparse ops do not have the standard number of components, which
causes such passes to make incorrect assumptions and tag them as
not being sparse, which breaks everything
fix #10540
Fixes: 0d652c0c8d ("zink: shrink vectors during optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27414>
(cherry picked from commit 2085d60438)
This works around some Unity engine behavior with ANGLE-on-Venus, when
cmd pools are created on main thread once while the render thread only
does descriptor pool creation for set allocations during recording time.
This change also explicitly forces async pipeline create for threads
creating the device instead of implicitly via feedback cmd pool create.
This ensures intended behavior when feedback is disabled.
Fixes: d17ddcc847 ("venus: dispatch background shader tasks to secondary ring")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27347>
(cherry picked from commit 1718980e85)
The runtime is turning GENERAL layouts into FEEDBACK_LOOP ones when it
detects feedback loops in a render pass. This is breaking drivers that
would like to use a different HW layout for those 2 layouts because if
the application inserts barrier in the render pass, the barriers the
driver sees are inconsistent.
This could lead to barriers of this type :
- GENERAL -> FEEDBACK_LOOP (runtime)
- GENERAL -> GENERAL (app)
- FEEDBACK_LOOP -> GENERAL (runtime)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23523>
(cherry picked from commit 76cf391255)
this logic relies on constant indexing for compact arrays, but this is
frequently not the case for compact array builtins (e.g., gl_TessLevelOuter).
the usual strategy of lowering to temps isn't viable in TCS, which means
io lowering has to be able to handle indirect access to these builtins
without crashing
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27534>
(cherry picked from commit 9e2c7314f2)
The round up in 'next_address_8kb = DIV_ROUND_UP(push_constant_kb, 8)'
was not decreasing the amount of URB available for Mesh and Task, which
could cause an over allocation of URB.
There was also no minimum entries enforcement for Mesh and Task, which
could cause r.mesh_entries to be set to 0 in a case where tue_size_dw is
90% > than mue_size_dw. Same for r.task_entries when Task is enabled.
Also adding a few more asserts to help debug.
This fixes at least dEQP-VK.mesh_shader.ext.properties.mesh_payload_size
in LNL but it has the potential to fix other Mesh tests as well.
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27555>
(cherry picked from commit d0fba810b3)
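The two points above (subtracting the rounded-up push constant space, and enforcing a minimum number of entries) can be sketched as follows. Both helpers and the interpretation that push constants occupy whole 8 KB chunks are assumptions for illustration; the real minimums and sizes live in the driver.

```c
#include <assert.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: URB space (in KB) left for Mesh and Task once push
 * constants are placed. The rounded-up amount, not push_constant_kb itself,
 * is what has to be subtracted. */
static unsigned urb_kb_left(unsigned total_kb, unsigned push_constant_kb)
{
   unsigned next_address_8kb = DIV_ROUND_UP(push_constant_kb, 8);
   unsigned reserved_kb = next_address_8kb * 8;

   assert(reserved_kb <= total_kb);
   return total_kb - reserved_kb;
}

/* Hypothetical helper: never hand out fewer than min_entries entries. */
static unsigned clamp_min_entries(unsigned entries, unsigned min_entries)
{
   return entries < min_entries ? min_entries : entries;
}
```

With 4 KB of push constants in a 64 KB URB, 56 KB remain rather than 60 KB, because the next usable address is rounded up to an 8 KB boundary.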
When doing query result copies in 3D mode, we're flushing the render
target cache, but the shader writes go through the dataport.
Fixes flakes/fails in piglit with shader query copies forced with Zink :
$ query_copy_with_shader_threshold=0 ./bin/arb_query_buffer_object-coherency -auto -fbo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b3b12c2c27 ("anv: enable CmdCopyQueryPoolResults to use shader for copies")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
(cherry picked from commit c53a4711cb)
When bitrate or fps change is detected, only update rate control
parameters instead of completely reinitializing encode session.
This fixes an issue where if application changed bitrate or fps often,
the output bitrate would significantly overshoot the target bitrate in some
cases. In other cases, the output bitrate would be extremely low instead.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27548>
(cherry picked from commit 8d44a11508)
It is possible to free memory backing images before images are
destroyed :
VkFreeMemory:
"Memory can be freed whilst still bound to resources, but those
resources must not be used afterwards."
The spec leaves us the option to keep a reference on the associated
memory and free it only when all the bound resources have been
destroyed. Here we choose to free memory immediately.
One particular test in the CTS
(dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics)
does the following :
imgA = vkCreateImage()
imgB = vkCreateImage()
memA = vkAllocateMemory()
vkBindImageMemory(imgA, memA) # Aux mapping with ref count = 1
vkFreeMemory(memA) # Aux mapping removed, ref count = 0
memB = vkAllocateMemory() # Same address as memA
vkBindImageMemory(imgB, memB)
vkDestroyImage(imgA) # Removes the mapping of imgB-memB
vkQueueSubmit() # hang with pagefault in AUX-TT
The solution implemented in this change is to not do anything AUX-TT
related in vkFreeMemory(). This solution has some consequences,
because a virtual memory address range freed and reallocated cannot be
rebound in the AUX-TT until all the associated resources have released
their AUX-TT mapping (to bring back the AUX-TT refcount of the range
to 0). This should still be better than keeping the memory allocated
through refcounting of the anv_bo.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7b87e1afbc ("anv: track & unbind image aux-tt binding")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10528
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27566>
(cherry picked from commit e0b4dfbbda)
If a program does two blits in a row, we internally do a sequence of
operations that involves binding vb0.
Previously, the vb0 state after each operation would look something like:
| operation | cmd->state.gfx.vb0 | hardware | save->vb0 |
| ---------------------------- | ------------------ | --------- | --------- |
| | user | user | |
| nvk_meta_begin() | user | user | user |
| BindVertexBuffers(internal0) | internal0 | internal0 | user |
| nvk_meta_end() | internal0 | user | |
| nvk_meta_begin() | internal0 | user | internal0 |
| BindVertexBuffers(internal1) | internal1 | internal1 | internal0 |
| nvk_meta_end() | internal1 | internal0 | |
That is, CmdBindVertexBuffers() would update cmd->state.gfx.vb0, but
nvk_meta_end() would not. This meant that the last operation would bind a
driver-internal buffer instead of the original value that the user set.
This change fixes the issue by tracking cmd->state.gfx.vb0 in
nvk_cmd_bind_vertex_buffer(), which both CmdBindVertexBuffers() and
nvk_meta_end() call into.
After this commit, the state looks like:
| operation | cmd->state.gfx.vb0 | hardware | save->vb0 |
| ---------------------------- | ------------------ | --------- | --------- |
| | user | user | |
| nvk_meta_begin() | user | user | user |
| BindVertexBuffers(internal0) | internal0 | internal0 | user |
| nvk_meta_end() | user | user | |
| nvk_meta_begin() | user | user | user |
| BindVertexBuffers(internal1) | internal1 | internal1 | user |
| nvk_meta_end() | user | user | |
To test this commit, build gtk4 commit 87b66de1, run:
GSK_RENDERER=vulkan gtk4-demo --run=image_scaling
then select trilinear filtering in the dropdown and check for rendering
artifacts.
Fixes: e1c66501 ("nvk: Use vk_meta for CmdClearAttachments")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27559>
(cherry picked from commit d98ff2cc4a)
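The state tables above can be reproduced with a small model. This is a simplified sketch, not NVK code: buffers are plain integer ids, and bind_vb0() only stands in for nvk_cmd_bind_vertex_buffer() as the single place where both the tracked state and the hardware binding change.

```c
#include <assert.h>

/* Hypothetical buffer ids for the model. */
enum { VB_NONE = 0, VB_USER = 1, VB_INTERNAL0 = 2, VB_INTERNAL1 = 3 };

struct cmd_model {
   int state_vb0; /* models cmd->state.gfx.vb0 */
   int hw_vb0;    /* what the hardware has bound */
   int saved_vb0; /* models save->vb0 */
};

/* Single choke point: updates the tracked state and the hardware together,
 * so the two can no longer drift apart. */
static void bind_vb0(struct cmd_model *cmd, int vb)
{
   cmd->state_vb0 = vb;
   cmd->hw_vb0 = vb;
}

static void meta_begin(struct cmd_model *cmd)
{
   cmd->saved_vb0 = cmd->state_vb0;
}

static void meta_end(struct cmd_model *cmd)
{
   assert(cmd->saved_vb0 != VB_NONE);
   bind_vb0(cmd, cmd->saved_vb0);
   cmd->saved_vb0 = VB_NONE;
}

/* One internal operation (e.g. a blit) that temporarily binds its own vb0. */
static void meta_op(struct cmd_model *cmd, int internal_vb)
{
   meta_begin(cmd);
   bind_vb0(cmd, internal_vb);
   meta_end(cmd);
}
```

Running meta_op() twice in a row leaves both the tracked state and the hardware binding at the user's buffer, matching the second table.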
SPECviewperf creo-03 needs GL_EXT_shader_image_load_store in order for
its shaders to compile but we don't support a few corner cases that
didn't make it into the ARB variant. It seems to run fine with an
override, so just do that for now.
Cc: mesa-stable
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27429>
(cherry picked from commit 24d3c83212)
It can be the case that a collect and one of its sources are assigned
to non-overlapping parts of the same merge set, for example:
ssa_1 = ...
ssa_2 = ...
ssa_3 = ...
ssa_4 = collect ssa_1, ssa_2 (kill), ssa_3
... = ssa_4 (kill)
ssa_5 = collect ssa_1, ssa_3
... = ssa_1 (kill)
... = ssa_3 (kill)
... = ssa_5 (kill)
If we merge the first collect first, we get a merge set:
ssa_1 (offset 0)
ssa_2 (offset 2)
ssa_3 (offset 4)
ssa_4 (offset 0)
Now, we decide to merge ssa_1 and ssa_5:
ssa_1 (offset 0)
ssa_2 (offset 2)
ssa_3 (offset 4)
ssa_4 (offset 0)
ssa_5 (offset 0)
ssa_3 cannot become a child of ssa_5 in the interval tree, just like a
source not in the same merge set, so we should not remove it and then
reinsert it assuming that RA will make it a child of ssa_5.
This fixes an RA validation error in Farming Simulator.
Fixes: 0ffcb19 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27497>
(cherry picked from commit aeed5fd98d)
If dual blending is enabled, only 1 output is supported. Multiple
outputs confuse the write combining pass in this case, leading to
incorrect output and/or an assert failure in emit_fragment_store.
The fix is straightforward, just skip the speculative emitting of
multiple outputs in the case where dual source blending is enabled.
This also adds an extra sanity check in `pan_nir_lower_zs_store` to
check for only one blend store being present.
Fixes: c65a9be421 ("panfrost: Preprocess shaders at CSO create time")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9487
Co-Authored-By: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26474>
(cherry picked from commit 49c1b404e5)
The CTS image allocation sometimes doesn't try to allocate a complete
DPB, but the amdgpu kernel module checks for this, so always make
the DPB max sized on uvd instances.
Fixes part of video decode on Fiji/Polaris
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27186>
(cherry picked from commit df9bc11589)
As per this optimisation's description:
"Takes assignments to variables that are dereferenced only
once and pastes the RHS expression into where the variable is
dereferenced."
However the optimisation is run at compile time before multiple
shaders from the same stage could have been pasted together.
So this optimisation can incorrectly assume a global is only
referenced once since it cannot see the other pieces of the
shader stage until link time.
Here we skip the optimisation if the variable is a global. We
could change it to only run at link time; however, this
optimisation is only run at link time if we are being forced
to use GLSL IR to inline a function that glsl to nir cannot
handle, and this will also be removed in a future patchset.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10482
Fixes: d75a36a9ee ("glsl: remove do_copy_propagation_elements() optimisation pass")
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27351>
(cherry picked from commit bc0178af57)
Similar to what was done for Wayland in 58f90fd03f:
the glthread unmarshal thread needs to be idle to avoid
concurrent calls to get_back_bo.
Also the existing code flushed after setting dri2_surf->back
to NULL so a new back buffer was always allocated by the
glthread flush:
|---------------> dri2_drm_swap_buffers
| get_back_bo (back=0x55eb93c6c488) > # First get_back_bo call
| get_back_bo (back=0x55eb93c6c488 age: 0)<
| # dri2_surf->back = NULL
|-----> FLUSH
| get_back_bo (back=nil) > # Another get_back_bo call
| get_back_bo (back=0x55eb93c6c4c8 age: 3)<
|-----< FLUSH
|---------------< dri2_drm_swap_buffers
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10437
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27143>
(cherry picked from commit 6f47e87a60)
GL ARB_sparse_buffer allows unbound regions in buffers.
VK sparseBinding insists all regions must be bound before first use.
This means we need to use sparseResidencyBuffer to back GL
sparse buffers to get the same semantics.
Fixes GL and piglit sparse buffer tests on zink/nvk.
Fixes: c90246b682 ("zink: implement sparse buffer creation/mapping")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27404>
(cherry picked from commit ff50e80574)
LLVM_LIB_DIR is a variable used for runtime compilations.
When cross compiling, LLVM_LIB_DIR must be set to the
libclang path on the target. So, this path should not
be retrieved during compilation but at runtime.
dladdr uses an address to search for a loaded library.
If a library is found, it returns information about it.
The path to the libclang library can therefore be
retrieved using one of its functions. This is useful
because we don't know the name of the libclang library
(libclang.so.X or libclang-cpp.so.X)
v2 (Karol): use clang::CompilerInvocation::CreateFromArgs for dladdr
v3 (Karol): follow symlinks to fix errors on debian
Fixes: e22491c832 ("clc: fetch clang resource dir at runtime")
Signed-off-by: Antoine Coutant <antoine.coutant@smile.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (v1): Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568>
(cherry picked from commit 445aacb421)
If the lane from which the hardware writes the unifa address
is disabled, then we may end up with a bogus address and invalid
memory accesses from follow-up ldunifa.
Instead of always disabling unifa loads in non-uniform control
flow we can try to see if the address is produced from a nir
register (which is the only case where we do conditional writes
under non-uniform control flow in ntq_store_def), and only
disable it in that case.
When enabling subgroups for graphics pipelines, this fixes a
GMP violation in the simulator with the following test
(which has non-uniform control flow writing unifa with lane 0
disabled, which is the lane from which the unifa takes the
address):
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcastfirst_int
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
(cherry picked from commit 5b269814fc)
c->execute is 0 (not the block index) for lanes currently active
under non-uniform control flow.
This also slightly simplifies the instructions we emit for flag
generation, both for uniform and non-uniform control flow.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
(cherry picked from commit 7bdc8898b1)
If the ELSE block is cheap then we don't emit the branch instruction
but we still want to generate the flags, since these are setting
the flags for the THEN block too.
Fixes: e401add741 ("broadcom/compiler: skip jumps in non-uniform if/then when block cost is small")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
(cherry picked from commit 29d4924e5e)
descriptor buffer uses mapped buffers. mapping/unmapping buffers
uses a ctx in the function params, but at this time there is no ctx.
since the ctx is not actually used for unmapping descriptor buffers,
this can instead use a special buffer unmap function to avoid invalid access
Fixes: b06f6e00fb ("zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27344>
(cherry picked from commit 0a97d1ebfa)
this is already implied since the buffers must be BAR-allocated,
but it ensures the context isn't accessed during unmap
Fixes: b06f6e00fb ("zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27344>
(cherry picked from commit c900cca96c)
../src/compiler/nir/nir_builder.h: In function ‘nir_build_deref_follower’:
../src/compiler/nir/nir_builder.h:1607:1: error: control reaches end of non-void function [-Werror=return-type]
1607 | }
Fixes: 4a4e175738
nir: Support deref instructions in lower_var_copies
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
(cherry picked from commit 0ab3b3c641)
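The warning quoted above usually shows up when every case of a switch over an enum returns, but the compiler cannot prove that no other value reaches the end of the function. The following is a minimal sketch of that pattern and one common way to silence it. The enum and function names are hypothetical and are not the nir_builder.h code; abort() stands in for whatever "not reached" marker the real fix uses.

```c
#include <stdlib.h>

/* Hypothetical enum, for illustration only. */
enum wrap_mode { WRAP_REPEAT, WRAP_CLAMP, WRAP_MIRROR };

/* Every enumerator returns, but without the trailing abort() GCC can
 * still report "control reaches end of non-void function", because an
 * out-of-range value would fall through the switch. */
int
wrap_mode_to_hw(enum wrap_mode mode)
{
   switch (mode) {
   case WRAP_REPEAT:
      return 0;
   case WRAP_CLAMP:
      return 1;
   case WRAP_MIRROR:
      return 2;
   }

   /* Not expected to be reached for valid input. */
   abort();
}
```

Leaving out a `default:` label keeps -Wswitch useful: adding a new enumerator without handling it still produces a warning.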
../src/compiler/nir/nir_lower_int64.c: In function ‘lower_int64_intrinsic’:
../src/compiler/nir/nir_lower_int64.c:1347:1: error: control reaches end of non-void function [-Werror=return-type]
1347 | }
Fixes: bf7a114246
nir/lower_int64: Add lowering for some 64-bit subgroup ops
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
(cherry picked from commit 80a1b91601)
../src/amd/vulkan/radv_sampler.c: In function ‘radv_tex_wrap’:
../src/amd/vulkan/radv_sampler.c:50:1: error: control reaches end of non-void function [-Werror=return-type]
50 | }
| ^
../src/amd/vulkan/radv_sampler.c: In function ‘radv_tex_compare’:
../src/amd/vulkan/radv_sampler.c:76:1: error: control reaches end of non-void function [-Werror=return-type]
76 | }
| ^
Fixes: 4de305cb8a
radv: move sampler related code to radv_sampler.c
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
(cherry picked from commit ca47138fb1)
Memory can be freed before the images it is bound to. When unmapping the
CCS range in the AUX-TT, we cannot rely on the anv_bo::offset field
because the anv_bo might have been freed.
Just save the mapping address/size and use those values at unmapping
time.
Fixes an assert on CI with:
dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e519e06f4b ("anv: add missing alignment for AUX-TT mapping")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27304>
(cherry picked from commit 9d31680e79)
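The idea described above (record the mapped range at map time, so unmapping never dereferences the buffer object) can be sketched as follows. All structure and function names here are hypothetical stand-ins; the real anv structures and AUX-TT calls are not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer object with a GPU address. */
struct fake_bo {
   uint64_t offset;
};

/* Hypothetical per-image record of the mapped range. */
struct fake_aux_mapping {
   bool     mapped;
   uint64_t addr;   /* address recorded when the mapping was created */
   uint64_t size;   /* size recorded when the mapping was created */
};

/* Record the range at map time instead of keeping a pointer to the BO. */
void
fake_aux_map(struct fake_aux_mapping *m, const struct fake_bo *bo,
             uint64_t offset_in_bo, uint64_t size)
{
   assert(size > 0);
   m->addr = bo->offset + offset_in_bo;
   m->size = size;
   m->mapped = true;
}

/* Unmap using only the saved values; the BO may already be gone.
 * Returns the address of the range that was unmapped, or 0 if none. */
uint64_t
fake_aux_unmap(struct fake_aux_mapping *m)
{
   if (!m->mapped)
      return 0;
   m->mapped = false;
   return m->addr;
}
```

Because the unmap path reads only the saved address and size, it does not matter whether the memory object was destroyed before the image.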
- no need to flush anything before as we're working on a clean
buffer (SI_OP_SKIP_CACHE_INV_BEFORE)
- L2 must be flushed after the job to avoid rendering artifacts.
Instead of setting it manually, use SI_OP_SYNC_AFTER +
SI_COHERENCY_NONE.
Fixes: 1a99f50c7f ("radeonsi: use a compute shader to convert unsupported indices format")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27095>
(cherry picked from commit cce5920025)
This fixes #9807 but I don't understand why.
Emitting cache flushes before VGT_PRIMITIVE_TYPE is what makes
the problem go away but changing the order in si_draw() is clearer.
The only cases where sctx->flags is modified in si_emit_draw_registers
are handled using si_emit_cache_flush_direct, so we can move cache
flushing up without any additional conditionals.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9807
Fixes: 1e4b539042 ("radeonsi: handle deferred cache flushes as a state (si_atom)")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27095>
(cherry picked from commit 0e16da89fe)
Buffers that are not dedicated can also be used for CCS mapped images,
so they need to be aligned to the AUX-TT requirements.
GTK+ is running into such case where it creates an image with a CCS
modifier. When requesting the alignment through
vkGetImageMemoryRequirements() the 64KB/1MB alignment is returned, but
the binding fails with an assert because the VkDeviceMemory has not
been aligned to the AUX-TT requirement and we cannot disable CCS since
the modifier requires it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4cdd3178fb ("anv: Meet CCS alignment reqs with dedicated allocs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10433
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27258>
(cherry picked from commit e519e06f4b)
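As a rough illustration of the alignment requirement described above, the sketch below rounds a value up to a power-of-two alignment and picks the larger AUX-TT alignment whenever an allocation might back a CCS image. The helper names and the `may_back_ccs_image` parameter are assumptions made for this example, not the anv implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (a power of two). */
uint64_t
align_up_u64(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: choose the alignment for an allocation. The
 * AUX-TT alignment (64KB or 1MB in the commit message) is applied to
 * any allocation that may back a CCS image, dedicated or not. */
uint64_t
pick_alloc_alignment(uint64_t base_alignment, uint64_t aux_tt_alignment,
                     int may_back_ccs_image)
{
   if (may_back_ccs_image && aux_tt_alignment > base_alignment)
      return aux_tt_alignment;
   return base_alignment;
}
```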
When updating an AFBC-packed resource, we need to make sure it is
legalized before blitting the staging resource to it. We can't rely
on the blit to properly convert the resource as it will result in
blit recursion and a crash.
If the whole texture is updated however, there is no need to unpack
as the content can be discarded. Just create a new BO with the right
format.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 33b48a5585 ("panfrost: Add debug flag to force packing of AFBC textures on upload")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27208>
(cherry picked from commit 1aa832e5f5)
There might be a more efficient path when legalizing a resource if
we don't need to worry about its content. For example, it doesn't
make sense to copy the resource content when converting the modifier
if the resource content is discarded anyway.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 33b48a5585 ("panfrost: Add debug flag to force packing of AFBC textures on upload")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27208>
(cherry picked from commit ee77168d57)
The hardware uses the lane index for per-vertex TCS output reads rather
than the vertex index. Fortunately, it's a pretty easy calculation to
go from one to the other.
Fixes: abe9c1fea2 ("nak: Add NIR lowering for attribute I/O")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27284>
(cherry picked from commit 99ef70d8aa)
VK_ACCESS_2_SHADER_STORAGE_READ_BIT specifies read access to a
storage buffer, physical storage buffer, storage texel buffer, or
storage image in any shader pipeline stage.
Any storage buffers or images written to must be invalidated and
flushed before the shader can access them.
This fixes the following tests on LNL:
- dEQP-VK.synchronization2.op.single_queue.barrier.write\*_specialized_access_flag
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27212>
(cherry picked from commit 3e93ccbc1b)
No driver supports urol/uror on all bit sizes. Intel gen11+ only for 16
and 32 bit, Nvidia GV100+ only for 32 bit. Etnaviv can support it on 8,
16 and 32 bit.
Also turn the `lower` into a `has` option as only two drivers actually
support `uror` and `urol` at this moment.
Fixes crashes with CL integer_rotate on iris and nouveau since we emit
urol for `rotate`.
v2: always lower 64 bit
Fixes: fe0965afa6 ("spirv: Don't use libclc for rotate")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (Intel and nir): Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27090>
(cherry picked from commit f2b7c4ce29)
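For drivers without native rotate support at a given bit size, a rotate can be expressed with two shifts and an OR. The sketch below shows that standard identity for 32 bits; it is not the NIR lowering itself, and the function names are made up for this example.

```c
#include <stdint.h>

/* Rotate left built only from shifts and an OR. Masking both shift
 * counts to 0..31 avoids an undefined shift by 32 when n is 0. */
uint32_t
lowered_urol32(uint32_t x, uint32_t n)
{
   n &= 31;
   return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate right, the mirror image of the above. */
uint32_t
lowered_uror32(uint32_t x, uint32_t n)
{
   n &= 31;
   return (x >> n) | (x << ((32 - n) & 31));
}
```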
When building the CFG the instructions are taken off the list in
fs_visitor and added to the lists inside each block. The single
"exec_node" in the instruction is used for those memberships.
In the case the pass rebuilt the CFG, it had no instructions, so
calculate_cfg() had nothing to work with. For now fix the bug by
pulling all the instructions back to the original list.
We can do better here, but punting until upcoming work on
CFG itself.
Issue found in an unpublished CTS test. Small reproduction in our
unit tests now enabled.
Fixes: 65237f8bbc ("intel/fs: Don't add MOV instructions to DO blocks in combine constants")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27131>
(cherry picked from commit 4dbf9181cd)
We're missing the ISA code in renderdoc. You can reproduce with the
Sascha Willems graphics pipeline demo.
The change is large here because we have to fix a confusion between
anv_shader_bin & anv_pipeline_executable. anv_pipeline_executable is
there as a representation for the user and multiple
anv_pipeline_executable can point to a single anv_shader_bin.
In this change we split the anv_shader_bin related logic that was
added in anv_pipeline_add_executable*() and move it to a new
anv_pipeline_account_shader() function.
When importing RT libraries, we add all the anv_pipeline_executable
from the libraries.
When importing Gfx libraries, we add the anv_pipeline_executable only
if not doing link time optimization.
anv_shader_bin related properties are added whenever we're importing a
shader from a library, compiling or finding in the cache.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3d49cdb71e ("anv: implement VK_EXT_graphics_pipeline_library")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26594>
(cherry picked from commit 58c9f817cb)
This fixes VUID-vkCmdDraw-None-08600 violation when running gpl cts:
dEQP-VK...graphics_library.misc.bind_null_descriptor_set.*, where the
final pipeline layout is falsely dropped, making it incompatible with
the pipeline layout of the bound descriptor set.
Fixes: a65ac274ac ("venus: Do pipeline fixes for VK_EXT_graphics_pipeline_library")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27054>
(cherry picked from commit 80a5df16fe)
The panfrost driver now makes an ioctl to retrieve some new memory
parameters, and DRM_PANFROST_PARAM_MEM_FEATURES is required (does not
default in the caller). This caused drm-shim to stop working. This
patch adds some defaults to get drm-shim working again.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 91fe8a0d28 ("panfrost: Back panfrost_device with pan_kmod_dev object")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27162>
(cherry picked from commit a50b2f8f25)
GFX10 has a hw bug and it can't handle a 0-sized index buffer. The
non-indirect draw path was fine but not the indirect path where RADV
emits the index buffer.
This fixes flakes with dEQP-VK.*maintenance6* on NAVI14, and possibly
GPU hangs if there is an indirect draw with a valid index buffer right
before because it would re-use the same index buffer.
Fixes: db9816fd66 ("radv: add support for NULL index buffer")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27142>
(cherry picked from commit 783e3c096f)
On chordal graphs, a greedy coloring can be done in a way that never uses
more colors than are required for the largest clique. However, since we
have vector values and force phi resources into the same spill slots, the
interference graphs are not chordal, and thus, this assumption doesn't hold.
Use twice as many spill slots as an upper bound.
Totals from 10 (0.01% of 79242) affected shaders: (GFX11)
MaxWaves: 52 -> 54 (+3.85%)
Instrs: 271386 -> 271779 (+0.14%)
CodeSize: 1362544 -> 1365432 (+0.21%)
VGPRs: 2536 -> 2532 (-0.16%)
SpillVGPRs: 778 -> 818 (+5.14%)
Scratch: 73472 -> 76800 (+4.53%)
Latency: 3331718 -> 3328798 (-0.09%); split: -0.14%, +0.05%
InvThroughput: 1665860 -> 1643350 (-1.35%); split: -1.40%, +0.05%
VClause: 3292 -> 3329 (+1.12%); split: -0.06%, +1.18%
Copies: 46082 -> 46257 (+0.38%)
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27011>
(cherry picked from commit e3098bb232)
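To illustrate why the largest-clique bound breaks down on non-chordal graphs, here is a small sketch, unrelated to the actual ACO spiller code, that greedily colors a 5-cycle. The largest clique in a 5-cycle has two nodes, yet three colors are needed. The adjacency-matrix representation and all names are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16

/* Greedy coloring in index order over an adjacency matrix.
 * Writes one color per node and returns the number of colors used. */
unsigned
greedy_color(unsigned n, bool adj[MAX_NODES][MAX_NODES],
             unsigned colors[MAX_NODES])
{
   assert(n <= MAX_NODES);
   unsigned num_colors = 0;

   for (unsigned v = 0; v < n; v++) {
      bool used[MAX_NODES] = { false };

      /* Mark colors taken by already-colored neighbors. */
      for (unsigned u = 0; u < v; u++) {
         if (adj[v][u])
            used[colors[u]] = true;
      }

      /* Pick the lowest free color; at most v colors can be taken. */
      unsigned c = 0;
      while (used[c])
         c++;

      colors[v] = c;
      if (c + 1 > num_colors)
         num_colors = c + 1;
   }
   return num_colors;
}

/* Build the cycle 0-1-2-3-4-0 and return how many colors greedy needs. */
unsigned
color_five_cycle(void)
{
   bool adj[MAX_NODES][MAX_NODES] = { { false } };
   unsigned colors[MAX_NODES] = { 0 };

   for (unsigned i = 0; i < 5; i++) {
      unsigned j = (i + 1) % 5;
      adj[i][j] = adj[j][i] = true;
   }
   return greedy_color(5, adj, colors);
}
```

In this example the color count exceeds the clique size of 2, which is the kind of situation the larger upper bound is meant to absorb.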
If someone were to remove the libraries that are needed for these,
`default` would simply not enable these tests, and the only thing we
could notice is that test jobs would suddenly take less time to run.
Instead, let's have a check to make sure dEQP's cmake has detected
everything and enabled these platforms.
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27041>
(cherry picked from commit 27a1b4e4f3)
DOOM Eternal builds acceleration structures with inactive primitives and
tries to make them active in later AS updates. This is disallowed by the
spec and triggers a GPU hang. Fix the hang by working around the bug.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27034>
(cherry picked from commit a9831caa14)
Commit 73eecffabd ("panvk: Use the vk_pipeline_layout base struct")
reworked the panvk logic to use vk_pipeline_layout, which contains the
number of descriptor set layouts referenced by a pipeline layout, thus
deprecating panvk_pipeline_layout::num_sets.
Make panvk_fill_non_vs_attribs() use vk_pipeline_layout::set_count
instead of panvk_pipeline_layout::num_sets and kill the latter so we
can't introduce new users.
Fixes: 73eecffabd ("panvk: Use the vk_pipeline_layout base struct")
Cc: mesa-stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Constantine Shablya <constantine.shablya@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27107>
(cherry picked from commit b18bfed2c5)
'debian/x86_64_build' job needs 'debian/x86_64_build-base' job, but 'debian/x86_64_build-base' is not in any previous stage
Fixes: f298a0e709 ("ci: make sure we evaluate the python-test rules first")
Fixes: 2c9fdaa830 ("ci: fix python-test dependency error on merge requests")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27042>
(cherry picked from commit 2ce0b5ab0a)
For instance, this issue is triggered with
vs-to-fs-overlap.shader_test -auto -fbo:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x7fe64f58e9a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7fe642ca2839 in _mesa_symbol_table_ctor ../src/mesa/program/symbol_table.c:286
#2 0x7fe642ff003d in gl_nir_cross_validate_outputs_to_inputs ../src/compiler/glsl/gl_nir_link_varyings.c:728
#3 0x7fe642d7c7d8 in gl_nir_link_glsl ../src/compiler/glsl/gl_nir_linker.c:1357
#4 0x7fe642be6931 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:562
#5 0x7fe642be6931 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:944
#6 0x7fe642acab55 in link_program ../src/mesa/main/shaderapi.c:1336
#7 0x7fe642acab55 in link_program_error ../src/mesa/main/shaderapi.c:1447
#8 0x7fe6424aa389 in _mesa_unmarshal_LinkProgram src/mapi/glapi/gen/marshal_generated2.c:1911
#9 0x7fe641fd912b in glthread_unmarshal_batch ../src/mesa/main/glthread.c:139
#10 0x7fe641f48d48 in util_queue_thread_func ../src/util/u_queue.c:309
#11 0x7fe641fa442a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
Fixes: 7d1948e9b5 ("glsl: implement cross_validate_outputs_to_inputs() in nir linker")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27071>
(cherry picked from commit bacace8634)
Vivante hardware handles 64bpp render targets and samplers in an odd way
by splitting the buffer and using a pair of texture samplers or a pair
of MRT outputs to access those resources. This isn't implemented in the
driver right now, so we should not advertise support for those formats.
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26982>
(cherry picked from commit e481c1269c)
Prior to 06b526de, the mesa format was used for these completeness checks.
That was to address the case where a *different* internal format selected
the *same* mesa format, and the texture shouldn't be considered compatible.
But this didn't address the case where the *same* internal format selected
a *different* mesa format, e.g. because the type passed to the TexImage
API was different.
An old WGL demo app called TexFilter.exe tries to redefine a mipped RGBA16
texture as RGBA8. This incorrect logic caused Mesa to try to copy the RGBA16
data from the smaller mips into the newly created RGBA8 data, because it
thought that the texture was still mip-complete, despite the format changing.
Cc: mesa-stable
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27023>
(cherry picked from commit 4cb9c77e8e)
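The two failure cases described above suggest that a completeness check has to compare both formats. The sketch below shows that idea with a hypothetical, heavily simplified image structure; it assumes the fix compares both fields and does not reproduce Mesa's actual texture completeness code.

```c
#include <stdbool.h>

/* Hypothetical, simplified image description. */
struct fake_teximage {
   unsigned internal_format; /* format requested through the API */
   unsigned mesa_format;     /* format actually chosen for storage */
};

/* A level only matches the base level if both the requested internal
 * format and the selected storage format agree. Comparing just one of
 * them misses one of the two cases from the commit message. */
bool
fake_levels_compatible(const struct fake_teximage *base,
                       const struct fake_teximage *level)
{
   return base->internal_format == level->internal_format &&
          base->mesa_format == level->mesa_format;
}
```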
ring_seqno_valid indicates a successful ring cmd submission, and can be
used to avoid invalid reply decoding due to failed submit alloc.
Otherwise, the garbled VkResult is misreported as an initialization
failure instead of OOM.
Below cts failure is fixed:
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic
Fixes: ec131c6e55 ("venus: use instance allocator for ring allocs")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27026>
(cherry picked from commit ecd50e70d4)
The max waves for RT prolog need to be recalculated after merging the
resource usage of all shaders invoked from it.
Note that there is no need to panic, as the info was only used to
calculate maximum scratch size and with the RT prolog being low
footprint, this likely only caused overestimation rather than
underestimation.
Fixes: 533ec9843e ("radv: Precompute shader max_waves.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26998>
(cherry picked from commit 63827751e1)
This was broken when I added texcoord support. The problem is that we
failed to properly count the number of used fs inputs and thus failed
to make the proper decision about when to reuse the color varying slot.
Also fix the error messages, they were incorrect after the rewrite as
well. This fixes a bunch of piglits.
Fixes: d4b8e8a481
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27003>
(cherry picked from commit 53c17d85ab)
# Bring artifacts back from the NFS dir to the build dir where gitlab-runner
# will look for them.
cp -Rp /nfs/results/. results/
section_end dut_cleanup
if [ -f "${STRUCTURED_LOG_FILE}" ]; then
  cp -p ${STRUCTURED_LOG_FILE} results/
  echo "Structured log file is available at https://${CI_PROJECT_ROOT_NAMESPACE}.pages.freedesktop.org/-/${CI_PROJECT_NAME}/-/jobs/${CI_JOB_ID}/artifacts/results/${STRUCTURED_LOG_FILE}"