The code wants the number of components used by the variable in the
current attribute slot, not the total number of components.
For e.g. a 4x3 matrix, glsl_get_components() returns 12, leading to the
following error reported by AddressSanitizer:
```
Test case 'dEQP-VK.tessellation.shader_input_output.cross_invocation_per_patch_mat4x3'..
../src/compiler/nir/nir_lower_io_to_vector.c:265:16: runtime error: index 4 out of bounds for type 'nir_variable *[4]'
```
Fixes: 5ef2b8f1f2 ("nir: Add a pass for lowering IO back to vector when possible")
(cherry picked from commit ba5c65f10b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
The game aliases two images. It binds a memory object to two different
images, the first one being an image with 4 mips and the second with
only one mip but the bind offset is incorrect. It's like it queried
the first image size with different usage flags, so that DCC was
disabled.
Force disabling DCC for mips fixes the incorrect rendering and doesn't
hurt performance.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10200
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2f13723c0a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
In two functions implementing resource discard rebind_resource is called
on resource before its track record is reset. This prevents update of
dirty_resource or dirty_shader_resource because of conditions in
needs_dirty_resource. With rsc->track reset and dirty_resource bits
missing further calls to transfer_map will not try to reallocate
resource storage when needed.
A way to reproduce the issue in both functions is by executing at least
3 draws modifying bound texture or VBO each time. This patch fixes those
cases and some related piglit tests on a5xx and should fix it on other
GPUs. Also it fixes rendering in Firefox and vsraytrace (except vertical
line at right edge).
Fixes: 0a62a874fc ("freedreno: Re-work dirty-resource tracking")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10374
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 6d14cad330)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
Many debug flags influence shader codegen but are currently not included
in the hash key. This causes surprising effects as cache lookups may
return shaders compiled with different debug flags than currently in
effect. This patch fixes this by including all debug flags in the
shader hash key.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: c323848b0b ("ir3, tu: Plumb through support for per-shader robustness")
(cherry picked from commit d8c90806e4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
This fixes a corner case of the LNL sub-dword integer restrictions
that wasn't being detected by has_subdword_integer_region_restriction(),
specifically:
> if(Src.Type==Byte && Dst.Type==Byte && Dst.Stride==1 && W!=2) {
> // ...
> if(Src.Stride == 2) && (Src.UniformStride) && (Dst.SubReg%32 == Src.SubReg/2 ) { Allowed }
> // ...
> }
All the other restrictions that require agreement between the SubReg
number of source and destination only affect sources with a stride
greater than a dword, which is why
has_subdword_integer_region_restriction() was returning false except
when "byte_stride(srcs[i]) >= 4" evaluated to true, but as implied by
the pseudocode above, in the particular case of a packed byte
destination, the restriction applies for source strides as narrow as
2B.
The form of the equation that relates the subreg numbers is consistent
with the existing calculations in brw_fs_lower_regioning (see
required_src_byte_offset()), we just need to enable lowering for this
corner case, and change lower_dst_region() to call lower_instruction()
recursively, since some of the cases where we break this restriction
are copy instructions introduced by brw_fs_lower_regioning() itself
trying to lower other instructions with byte destinations.
This fixes some Vulkan CTS test-cases that were hitting these
restrictions with byte data types.
Fixes: 217d412360 ("intel/fs/gfx20+: Implement sub-dword integer regioning restrictions.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
We were accidentally doing a signed integer comparison here for ult32,
or a sign-extending shift for ushr.
One notable bit of fallout was that load_global_uniform_block_intel
address calculations broke on platforms that don't have native 64-bit
integer support, as the iadd64 lowering for "do I need to carry?" was
using ult32...and performing the wrong comparison. We spotted this in
Borderlands 3 on Alchemist once we turned on other optimizations.
Thanks to Lionel Landwerlin for helping spot the problem!
Fixes: c7b312ad45 ("brw: factor out source extraction for rematerialization")
Fixes: 339630ab05 ("brw: enable A64 loads source rematerialization")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5848035443)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
The BROADCOM_SAND128 modifier is usually used with an extra parameter
to pass in the stride via a side channel. Quoting from drm_fourcc.h:
> The pitch between the start of each column is set to optimally
> switch between SDRAM banks. This is passed as the number of lines
> of column width in the modifier (we can't use the stride value due
> to various core checks that look at it , so you should set the
> stride to width*cpp).
So apparently this is just a workaround for limitations in some kernel
APIs. DRM modifiers, however, are arguably a bad fit for extra
parameters that aren't known in advance. In the Wayland/KMS ecosystem
many components depend on being able to treat modifiers as opaque, e.g.
for negotiations etc. In practice the current approach requires various
software components to manually use the
`DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT()` macro - using the
`DRM_FORMAT_MOD_BROADCOM_SAND128` modifier directly with formats like
`NV12` results in a rejection in the KMS driver and corrupted output
in Mesa (because we'd bail out early in `v3d_sand8_blit()`).
Fortunately the stride check limitations mentioned above don't seem to
apply to Mesa though. Thus we can just add support for the base modifier
and stride (coming from V4L2), allowing various toolkits, Wayland
compositors and V4L2 decoder implementations to support e.g.
`NV12` + `DRM_FORMAT_MOD_BROADCOM_SAND128` (`NC12` in V4L2) in a generic
way.
Notes:
1. Wayland compositors trying to offload composition to KMS will still
fail when doing a test commit.
2. There is another limitation - in the V4L2 MPLANE API - that
requires userspace to know the correct offset of the second plane. That's
a known API limitation though and only affects V4L2 decoder implementations.
Cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
(cherry picked from commit 758941ab0c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
Since the shader parameters are passed as inline data, push constants
are no longer used and so, not actually set on dispatch. But the
nr_params = 4 was still making the shader emit the code to load them,
causing page faults on simulation, and would also on HW if we didn't
always have a scratch page set.
The uses_inline_data parameter will be set from brw_compile_cs(), called
shortly after this point, so we don't need it here.
The subgroup_size is misleading, as we don't actually require that size
and the code that checks for it isn't even running for this shader.
Fixes: 97b17aa0b1 ("brw/nir: rework inline_data_intel to work with compute")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12152
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d32a26b3e6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
As per the code comment added in this commit the nir produced from
glsl to nir doesn't always keep function declarations before the
code that calls them e.g. calls from within other function
implementations. The change in this commit works around this problem by
first cloning all function declarations in a first pass, then cloning
the implementations in a second pass once we have filled the remap
table.
Fixes: cbfc225e2b ("glsl: switch to a full nir based linker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12115
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 59b2549279)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
KernelCI jobs have priority 44 and are very long-running jobs (and
there might be an issue with the KernelCI that makes it create hundreds
of jobs, @sergi is looking into that).
While bumping to 45+ would be enough to allow Mesa release staging
pipelines to run despite the KernelCI, during the CI meeting with @sergi
and @mupuf it was determined that the Mesa releases are an important
enough operation to warrant being a higher priority than user forks
pipelines, so priority 70 was picked (still under the 75 of Marge
pipelines).
Cc: mesa-stable
(cherry picked from commit 50f9bec3ce)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Because dyn_start and dyn_end are indices into
nvk_root_descriptor_table->dynamic_buffers, we would need to offset
cbuf->dynamic_idx by
nvk_root_descriptor_table->set_dynamic_buffer_start[cbuf->desc_set]
in order to do those comparisons correctly.
We could do that, but it's simpler and no less precise to sinply
re-use the same comparison that we do in the other cases here.
This fixes a rendering artifact in Baldur's Gate 3 (Vulkan), which
regressed with the commit listed below.
Fixes: 091a945b57 ("nvk: Be much more conservative about rebinding cbufs")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit dc12c78235)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
The Early-Z optimization is disabled when there is a discard
instruction in the shader used in the draw call.
But if discard is the only reason to disable Early-Z, and at
draw call time the updates in the draw call are disabled we
can enable Early-Z using a shader variant.
If there are occlussion queries active we also need to disable
Early-z optimization.
So this patch enables Early-Z in this scenario.
The performance improvement is significant when running gfxbench
benchmark showing an average improvement of 11.15%
fps_avg helped: gl_gfxbench_aztec_high.trace: 3.13 -> 3.73 (19.13%)
fps_avg helped: gl_gfxbench_aztec.trace: 4.82 -> 5.68 (17.88%)
fps_avg helped: gl_gfxbench_manhattan31.trace: 5.10 -> 6.00 (17.59%)
fps_avg helped: gl_gfxbench_manhattan.trace: 7.24 -> 8.36 (15.52%)
fps_avg helped: gl_gfxbench_trex.trace: 19.25 -> 20.17 ( 4.81%)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable
(cherry picked from commit 5b951bcdd7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
We can end up calling vk_multialloc_alloc with 0 size when
`attachment_count` is 0 and `clearValueCount` is 0.
Addressed:
```
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x7faf033ee0 in __interceptor_malloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7fada5cc10 in vk_default_alloc ../src/vulkan/util/vk_alloc.c:26
#2 0x7fac50b270 in vk_alloc ../src/vulkan/util/vk_alloc.h:48
#3 0x7fac555040 in vk_multialloc_alloc
../src/vulkan/util/vk_alloc.h:234
#4 0x7fac555040 in void
tu_CmdBeginRenderPass2<(chip)7>(VkCommandBuffer_T*,
VkRenderPassBeginInfo const*, VkSubpassBeginInfo const*)
../src/freedreno/vulkan/tu_cmd_buffer.cc:4634
#5 0x7fac900760 in vk_common_CmdBeginRenderPass
../src/vulkan/runtime/vk_render_pass.c:261
```
seen in:
dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_texel_buffer.no_fmt_qual.len_252.samples_1.1d.frag
Fixes: 4cfd021e3f ("turnip: Save the renderpass's clear values in the cmdbuf state.")
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
(cherry picked from commit c923eff742)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>