On Bifrost/Midgard, when an attribute has a non-zero divisor, the
attribute offset is tweaked to take the base_instance into account,
which implies we have to re-emit the attributes if the base instance
value changed.
Let's not bother tracking the last base instance and re-emit
unconditionally in that case, which is still better than what we had
before 3db963a135 ("panfrost: Emit attribs in
panfrost_update_state_3d() on bifrost/midgard") and fixes the regression
introduced by this commit.
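
A minimal sketch of the resulting decision, assuming a hypothetical
per-draw helper (must_emit_attribs(), attribs_dirty and divisors are
illustrative names, not the driver's own):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the actual panfrost code. */
static bool
must_emit_attribs(bool attribs_dirty, const unsigned *divisors, size_t count)
{
   assert(divisors != NULL || count == 0);

   if (attribs_dirty)
      return true;

   /* With a non-zero divisor the attribute offset depends on
    * base_instance, so re-emit without tracking the last value. */
   for (size_t i = 0; i < count; i++) {
      if (divisors[i] != 0)
         return true;
   }

   return false;
}
```

The trade-off is a possibly redundant emission for instanced draws in
exchange for not having to track the last base instance.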
Fixes: 3db963a135 ("panfrost: Emit attribs in panfrost_update_state_3d() on bifrost/midgard")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-By: Chris Healy <healych@amazon.com>
(cherry picked from commit 8891c2aeba)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
This message has been confusing users, especially now that
popular toolkits such as Gtk started using a Vulkan renderer.
Printing a message on non-conformant implementations is also
actually not required. So let's remove it.
We haven't fully finished the GFX12 implementation yet, but on
all other hardware, RADV should work just fine, and is definitely
not meant for "testing use only".
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12314
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit dd980d2b28)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
I stumbled on this limit - it turns out that large local_sizes apply an
additional limit on gprs per thread. If we violate this limit, then dmesg
just gives us a rather unhelpful message that the channel is killed:
nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:64 type:13 scope:1 part:233
nouveau 0000:01:00.0: fifo:c00000:0008:0040:[hw_tests::test_[14761]] errored - disabling channel
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
(cherry picked from commit b99772e71e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
Fix defect reported by Coverity Scan.
Side effect in assertion (ASSERT_SIDE_EFFECT)
assert_side_effect: Argument ++eot_count of assert() has a side effect.
The containing function might work differently in a non-debug build.
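
A minimal sketch of this class of defect and the usual fix, assuming a
hypothetical count_eot() helper and is_eot[] input (neither is taken from
the elk code): the increment is moved out of the assert() so that it still
happens when NDEBUG is defined.

```c
#include <assert.h>

/* Hypothetical helper: counts end-of-thread markers in is_eot[]. */
static int
count_eot(const int *is_eot, int n)
{
   int eot_count = 0;

   for (int i = 0; i < n; i++) {
      if (!is_eot[i])
         continue;

      /* Problematic form: assert(++eot_count == 1);
       * With NDEBUG defined the whole expression, including the
       * increment, is compiled out. */
      ++eot_count;
      assert(eot_count == 1);
   }

   return eot_count;
}
```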
Fixes: ebd6738260 ("intel/elk/chv: Implement WaClearArfDependenciesBeforeEot")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 83809f06a7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
Mesa3D is commonly used, as evidenced by the use of the Mesa3D.org domain.
Additionally, it is unnecessary to advise against using "MesaGL"
since we do not use it ourselves.
Cc: mesa-stable
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 6f08f921bf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
this was missed in the original v3d pass, and then the common code port
inherited the bug. (so strictly this fix "should" be backported even farther
back but it won't apply before the Fixes here, and I don't think we do LTS that
far back anyway).
in theory this should fix a corner case with robustness on the gl (but not
vulkan, at least for apple) drivers on broadcom & apple.
Fixes: f0fb8d05e3 ("nir: Add nir_lower_robust_access pass")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit d9b4867e2a)
Conflicts:
src/compiler/nir/nir_lower_robust_access.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
We need to add a variant of the texld instruction, which is used with a shadow
sampler and is passed the shadow reference value via src2.
Blob generates such texld's for deqp's GLES3.functional.texture.shadow.2d.* (GC3000).
Fixes spec@arb_depth_texture@texdepth.
Fixes: abe5bd35 ("etnaviv: Switch to isa_assemble_instruction(..)")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5daa47c1f8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
This change prevents the reuse of the bo when the counter is already
zero. At zero, the bo is in a state where the deletion is pending,
and this implementation relying on an atomic counter can't be safely
stopped. In other words, the previous fix ccd3bb4548 lowers the
probability of this race condition, but doesn't fix it.
This change prevents a race condition which has a high probability
on r600 with the test below. This change was tested with the thread
sanitizer.
For instance, this issue is triggered on r600 with
"piglit/bin/ext_image_dma_buf_import-refcount-multithread -auto":
==9876==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d000021a20 at pc 0x7f2c9f59f748 bp 0x7f2c8f3aa600 sp 0x7f2c8f3aa5f8
READ of size 4 at 0x60d000021a20 thread T6
#0 0x7f2c9f59f747 in pipe_is_referenced ../src/gallium/auxiliary/util/u_inlines.h:65
#1 0x7f2c9f59f747 in radeon_bo_destroy ../src/gallium/winsys/radeon/drm/radeon_drm_bo.c:342
#2 0x7f2c9f63b541 in radeon_bo_reference ../src/gallium/include/winsys/radeon_winsys.h:794
#3 0x7f2c9f63b541 in r600_texture_destroy ../src/gallium/drivers/r600/r600_texture.c:571
#4 0x7f2c9d65662d in pipe_resource_destroy ../src/gallium/auxiliary/util/u_inlines.h:146
#5 0x7f2c9d65662d in pipe_resource_reference ../src/gallium/auxiliary/util/u_inlines.h:163
#6 0x7f2c9d65662d in st_FreeTextureImageBuffer ../src/mesa/state_tracker/st_cb_texture.c:459
#7 0x7f2c9d5b6991 in _mesa_delete_texture_image ../src/mesa/main/teximage.c:226
#8 0x7f2c9d5f2593 in _mesa_delete_texture_object ../src/mesa/main/texobj.c:532
#9 0x7f2c9d5f2be7 in _mesa_reference_texobj_ ../src/mesa/main/texobj.c:639
#10 0x7f2c9d5f3773 in _mesa_reference_texobj ../src/mesa/main/texobj.h:92
#11 0x7f2c9d5f3773 in delete_textures ../src/mesa/main/texobj.c:1578
0x60d000021a20 is located 0 bytes inside of 144-byte region [0x60d000021a20,0x60d000021ab0)
freed by thread T5 here:
#0 0x7f2ca8b2b4f7 in free (/usr/lib64/libasan.so.6+0xb14f7)
#1 0x7f2c9f59efb3 in radeon_bo_destroy ../src/gallium/winsys/radeon/drm/radeon_drm_bo.c:401
#2 0x7f2c9f63b541 in radeon_bo_reference ../src/gallium/include/winsys/radeon_winsys.h:794
#3 0x7f2c9f63b541 in r600_texture_destroy ../src/gallium/drivers/r600/r600_texture.c:571
#4 0x7f2c9d65662d in pipe_resource_destroy ../src/gallium/auxiliary/util/u_inlines.h:146
#5 0x7f2c9d65662d in pipe_resource_reference ../src/gallium/auxiliary/util/u_inlines.h:163
#6 0x7f2c9d65662d in st_FreeTextureImageBuffer ../src/mesa/state_tracker/st_cb_texture.c:459
#7 0x7f2c9d5b6991 in _mesa_delete_texture_image ../src/mesa/main/teximage.c:226
#8 0x7f2c9d5f2593 in _mesa_delete_texture_object ../src/mesa/main/texobj.c:532
#9 0x7f2c9d5f2be7 in _mesa_reference_texobj_ ../src/mesa/main/texobj.c:639
#10 0x7f2c9d5f3773 in _mesa_reference_texobj ../src/mesa/main/texobj.h:92
#11 0x7f2c9d5f3773 in delete_textures ../src/mesa/main/texobj.c:1578
previously allocated by thread T6 here:
#0 0x7f2ca8b2b9a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7f2c9f5a36d5 in radeon_winsys_bo_from_handle ../src/gallium/winsys/radeon/drm/radeon_drm_bo.c:1198
#2 0x7f2c9f641b2a in r600_texture_from_handle ../src/gallium/drivers/r600/r600_texture.c:1105
#3 0x7f2c9d47550a in dri_create_image_from_winsys ../src/gallium/frontends/dri/dri2.c:1007
#4 0x7f2c9d47eeb9 in dri2_from_dma_bufs ../src/gallium/frontends/dri/dri2.c:1629
#5 0x7f2ca8854360 in dri2_create_image_dma_buf ../src/egl/drivers/dri2/egl_dri2.c:2564
#6 0x7f2ca8854f45 in dri2_create_image_khr ../src/egl/drivers/dri2/egl_dri2.c:2817
#7 0x7f2ca8846f2c in dri2_create_image ../src/egl/drivers/dri2/egl_dri2.c:1864
#8 0x7f2ca87f9dd8 in _eglCreateImageCommon ../src/egl/main/eglapi.c:1850
Fixes: ccd3bb4548 ("winsys/radeon: fix a race between bo import and destroy")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c6bcf88949)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
Currently the list of buffer usage bits specified is hardcoded with
transform feedback bits, which leads to a validation layer error report
with ID VUID-VkBufferCreateInfo-None-09499 when EXT_transform_feedback
is not available.
Only set these bits when EXT_transform_feedback extension is really
available to suppress this error.
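
A minimal sketch of gating the bits on the extension. The USAGE_* values
are placeholders for illustration only (real code would use the
VkBufferUsageFlagBits values from the Vulkan headers), and
buffer_usage_bits() is a hypothetical helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits, not the real Vulkan flag values. */
#define USAGE_VERTEX_BUFFER (1u << 0)
#define USAGE_INDEX_BUFFER  (1u << 1)
#define USAGE_XFB_BUFFER    (1u << 2)
#define USAGE_XFB_COUNTER   (1u << 3)
#define USAGE_XFB_ALL       (USAGE_XFB_BUFFER | USAGE_XFB_COUNTER)

/* Hypothetical helper: build the usage mask for a buffer. */
static uint32_t
buffer_usage_bits(bool have_transform_feedback)
{
   uint32_t usage = USAGE_VERTEX_BUFFER | USAGE_INDEX_BUFFER;

   /* Only request transform feedback usage when it is supported. */
   if (have_transform_feedback)
      usage |= USAGE_XFB_ALL;

   assert(have_transform_feedback || (usage & USAGE_XFB_ALL) == 0);
   return usage;
}
```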
Cc: mesa-stable
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 70fa598696)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
The Vulkan spec says:
"The application can enable a logical operation between the
fragment’s color values and the existing value in the framebuffer
attachment. This logical operation is applied prior to updating
the framebuffer attachment. Logical operations are applied only
for signed and unsigned integer and normalized integer
framebuffers. Logical operations are not applied to floating-point
or sRGB format color attachments."
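
The rule quoted above can be sketched as a small predicate. The enum and
logic_op_applies() are hypothetical and only classify formats the way the
spec text does; they are not the driver's types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a color attachment format. */
enum color_format_class {
   FORMAT_UINT,
   FORMAT_SINT,
   FORMAT_UNORM,
   FORMAT_SNORM,
   FORMAT_FLOAT,
   FORMAT_SRGB,
};

/* Returns true if an enabled logic op affects this attachment. */
static bool
logic_op_applies(bool logic_op_enable, enum color_format_class cls)
{
   if (!logic_op_enable)
      return false;

   switch (cls) {
   case FORMAT_UINT:
   case FORMAT_SINT:
   case FORMAT_UNORM:
   case FORMAT_SNORM:
      return true;
   case FORMAT_FLOAT:
   case FORMAT_SRGB:
      return false;
   }

   assert(!"unknown format class");
   return false;
}
```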
Missing VKCTS coverage has been reported.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12345
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 03b037a0e3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
Video decode target needs custom height alignment, but tex descriptor
still needs to be set to the original size the image was created with.
This makes the descriptor wrong for layer > 0, so we need to calculate
the layer offset and add it to bo address for this case.
Fixes: 5deb476095 ("radv: align video images internal width/height inside the driver.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3474978d52)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
Unigine Heaven crashes on GFX8/9 when using aco:
heaven_x64: ../../amd/mesa/src/gallium/drivers/radeonsi/si_nir_lower_abi.c:813: lower_tex: Assertion `samp_index >= 0 && comp_index >= 0' failed.
GFX8/9 will clamp texture comparison value in si_nir_lower_abi,
but it has to be done after si_nir_lower_resource.
Fixes: ae933169 ("radeonsi: lower NIR resource srcs to descriptors last")
(cherry picked from commit 8609f49d05)
Conflicts:
src/gallium/drivers/radeonsi/si_shader.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33113>
When we moved building the docs to Meson, we accidentally dropped the -W
flag that we used to have. This led to us no longer detecting certain
problems in the docs, which is unfortunate.
Let's bring this back gated by the werror meson-option, and wire that up
on the CI end.
Fixes: fdd204538b ("ci: build docs using meson")
Reviewed-by: Dylan Baker <None>
(cherry picked from commit cf07e89d06)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
Without this the game crashes during the loading screen.
The game uses vkUpdateDescriptorSetWithTemplate and, in certain cases,
passes VkDescriptorBufferInfo structures where the offset + range
exceeds the size of the buffer. This triggers an assertion when
vk_buffer_range() is called, causing the game to crash.
When the nvidia vendor id is used the range is consistently set to 65536.
Without it the range varies and is much smaller - never exceeding 1000.
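
The failing condition can be sketched as an overflow-safe bounds check.
range_fits() and checked_range() are hypothetical helpers written for
illustration; they are not the vk_buffer_range() implementation and do
not show how this change avoids the crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does [offset, offset + range) fit in the buffer?
 * Written without computing offset + range, which could overflow. */
static bool
range_fits(uint64_t size, uint64_t offset, uint64_t range)
{
   if (offset > size)
      return false;

   return range <= size - offset;
}

/* Debug-build guard in the spirit of the assertion the game trips. */
static uint64_t
checked_range(uint64_t size, uint64_t offset, uint64_t range)
{
   assert(range_fits(size, offset, range));
   return range;
}
```

With a hypothetical 1000-byte buffer, an offset of 900 and a range of
65536 fails the check, while a range of 100 passes.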
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12349
Cc: stable
Reviewed-by: Faith Ekstrand <None>
(cherry picked from commit e38150f2fa)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
There's more than one error-path out of cs_alloc_ins_block(), but only
one of them got the discard_instr_slot treatment. Instead of plugging
this in one more time, let's move this handling up to cs_alloc_ins(),
where we can easily whack two birds with one stone. This makes us
consistently return NULL on error here.
At the same time, we need to patch up cs_flush_block_instrs() here,
because we don't actually set the buffer invalid here. So let's
check for NULL here instead, which is the new contract.
Fixes: 0e6aaab00a ("pan/cs: add block to handle registers backup in exception handler")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 3006c2a7b6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
The `cleared_and_retried` variable is not required: once the cache has
been emptied, the second attempt finds it already empty, so it won't
retry the allocation again.
Fixes: 2adea940f1 ("v3dv/bo: adding a BO cache")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit f6766ccadb)
Conflicts:
src/broadcom/vulkan/v3dv_bo.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
The equivalent bit is set correctly on JM, but was missed for CSF. There
shouldn't need to be any shader changes, the alpha-to-coverage flag in
FAU_ATEST_PARAM is set automatically from the bit in DcdFlags0.
Fixes dEQP-VK.pipeline.*.multisample.alpha_to_coverage*
Fixes: 447075eeee ("panfrost: Add support for the CSF job frontend")
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
(cherry picked from commit 3f90d8dfd2)
Conflicts:
src/panfrost/ci/panfrost-g610-fails.txt
src/panfrost/vulkan/csf/panvk_vX_cmd_draw.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
We can get into a situation where the layer size for a given mip isn't
large enough to hold the pitch times the aligned height, i.e. the height
isn't aligned. This can happen even if the size is 4K aligned. The
hardware seems not to align the height for us, so we have to use the
MINLAYERSZ hammer.
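
The arithmetic behind the minimum layer size can be sketched as below.
The helper names and the example pitch and alignment are hypothetical and
are not the values used for a750.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to a power-of-two alignment. */
static uint64_t
align_up(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: smallest layer size that can hold the pitch
 * times the aligned height. */
static uint64_t
min_layer_size(uint64_t pitch, uint64_t height, uint64_t height_align)
{
   return pitch * align_up(height, height_align);
}
```

For example, with a pitch of 512 bytes, 19 rows and a 4-row alignment,
the unaligned size is 512 * 19 = 9728 bytes, but the layer needs
512 * 20 = 10240 bytes.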
This was found with a Vulkan test when enabling tiling for mutable
textures on a750, but it's also reproducible via
"bin/texelFetch fs sampler3D 76x76" using piglit.
Cc: mesa-stable
(cherry picked from commit ef4c752b6e)
Conflicts:
src/freedreno/fdl/fd6_layout_test.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
This message is printed on non-panfrost/panthor systems on every
physical device enumeration when panvk is present like in distribution
mesa builds.
This "breaks" gtk-4 tests in the default configuration since they fail
on warning log messages. gtk-4 still forwards the Vulkan debug report as
warning messages after the fixes for issue 11451, which stopped it from
handling the report as a critical message.
Signed-off-by: Janne Grunau <j@jannau.net>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: d970fe2e9d ("panfrost: Add a Vulkan driver for Midgard/Bifrost GPUs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11451
(cherry picked from commit b06b62bb13)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
If an instruction didn't already use SDWA, convert_to_SDWA in apply_extract
will add ubyte0/uword0 selections for v1b/v2b operands. This loses information
that the instruction doesn't care about the high bits and makes the next
apply_extract_twice fail.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 6cb9d39bc2 ("aco: combine extracts with sub-dword definitions")
(cherry picked from commit 3da2d96bc5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
Regression on test:
dEQP-GLES31.functional.geometry_shading.basic.output_256
voffset is missing if the buffer store base is >= 4096, so we need to
re-calculate offen after resolve_excess_vmem_const_offset().
Fixes: cdaf269924 ("aco: inline store_vmem_mubuf/emit_single_mubuf_store")
(cherry picked from commit dff14d102d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
It's not valid to call RenderPicture and EndPicture without calling
BeginPicture or when BeginPicture fails. FFmpeg will however call
EndPicture when BeginPicture fails, so we need to handle this.
Use target_id, which is assigned in BeginPicture, as an indication
whether we are inside the Begin - End picture sequence.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
(cherry picked from commit 42e765d48b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32852>
The Wayland protocol defines INVALID as a special marker indicating
that implicit modifiers are supported. If the driver doesn't support
explicit modifiers and the compositor advertises support for implicit
modifiers, fallback to these.
This effectively restores logic removed in 4c06515892, but only
for the specific case of Wayland instead of affecting all APIs.
(Wayland is one of the few APIs defining a special meaning for
INVALID.)
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 4c06515892 ("dri: revert INVALID modifier special-casing")
(cherry picked from commit da555982b3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
If we supply modifiers to dri_create_image_with_modifiers() and
the driver doesn't support them, the function will fail. We pass
__DRI_IMAGE_USE_LINEAR anyways so stripping the modifier is fine.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 4c06515892 ("dri: revert INVALID modifier special-casing")
(cherry picked from commit d795b4712c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
If we supply modifiers to dri_create_image_with_modifiers() and
the driver doesn't support them, the function will fail. The X11
server always supports implicit modifiers so we can always fall
back to that.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 4c06515892 ("dri: revert INVALID modifier special-casing")
(cherry picked from commit 655ac4fff6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
When updating an AFBC-packed resource, the size of the resulting
texture data can change while the BO and modifier stay the same. We
still need to update the texture descriptor in that situation so
that the size is properly reported. Having a smaller size than the
real one might cause artifacts as the GPU doesn't want to read past
the reported size.
A future (more foolproof) fix might involve having a hash key to
track the size of all slices independently, but this patch still
improves the situation and makes sure we don't hit a relatively
common issue when using `PAN_MESA_DEBUG=forcepack`.
Fixes: bc55d150a9 ("panfrost: Add support for AFBC packing")
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 30825140d0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
An upcoming patch will emit more instructions in video and copy queues
for Gfx 200 and newer, but the current code only creates anv_async_submit
if the device has aux_map.
Instead we can always create anv_async_submit and only submit it to
hardware if any instruction was emitted.
Fixes: 86813c60a4 ("mi-builder: add read/write memory fencing support on Gfx20+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit b8f93bfd38)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
Last VGT stages (VS, TES or GS) can always be used with a null FS when
nextStage is non-zero. For example, if a VS is created with nextStage=TCS,
it's also allowed to draw without binding a TCS (i.e. nextStage=None is
always a valid case).
Because we don't want to compile two variants for NONE and FRAGMENT,
let's compile only the FRAGMENT one when necessary.
Fixes new CTS coverage, see https://gerrit.khronos.org/c/vk-gl-cts/+/15976.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0223f0f54d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
With the introduction of the slab allocator, most of our small
allocations now hit that rather than directly hitting the bucket
cache. Those now show up as 2MB slab allocations from the cache's
perspective. So, we don't need quite as many buckets. (Note that
only allocations in IRIS_MEMZONE_OTHER are suballocated today.)
Previously, we had 55 buckets, going from 4KB to 112MB, with sizes
N, N+1/4, N+1/2, N+3/4 for a series of power-of-two N's.
This patch prunes it down to 25 buckets:
- 4K-4MB => power-of-two sizes only
- 6MB => a one-off bucket to reduce waste between 4MB and 8MB
- 8MB+ => the usual N, N+1/4, N+1/2, N+3/4 system
- 64MB => the largest bucket size
In particular, this eliminates the 1.75MB, 2.5MB, 3MB, 3.5MB, and 7MB
buckets in favor of multiples of 2MB. Allocating multiples of 2MB is
preferable because it allows the kernel to allocate 64KB pages rather
than being stuck using inefficient 4K pages. And, the amount of waste
from bumping to the next multiple of 2MB isn't huge in that range of
sizes. We also eliminate buckets larger than 64MB because they're
rarely used, and also the amount of waste from rounding up to the
80/96/112MB buckets can get pretty large.
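
The new size list can be reproduced with a short sketch.
build_bucket_sizes() and the KB()/MB() macros are hypothetical and only
enumerate the sizes described above; this is not the iris bucket setup
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KB(x) ((uint64_t)(x) << 10)
#define MB(x) ((uint64_t)(x) << 20)

/* Hypothetical helper: fill sizes[] with the pruned bucket sizes and
 * return how many were written. */
static size_t
build_bucket_sizes(uint64_t *sizes, size_t max)
{
   size_t n = 0;

   assert(sizes != NULL && max >= 25);

   /* 4K-4MB: power-of-two sizes only. */
   for (uint64_t s = KB(4); s <= MB(4); s *= 2)
      sizes[n++] = s;

   /* One-off bucket to reduce waste between 4MB and 8MB. */
   sizes[n++] = MB(6);

   /* 8MB+: N, N+1/4, N+1/2, N+3/4 for each power-of-two N below 64MB. */
   for (uint64_t base = MB(8); base < MB(64); base *= 2) {
      for (unsigned q = 0; q < 4; q++)
         sizes[n++] = base + q * (base / 4);
   }

   /* 64MB is the largest bucket. */
   sizes[n++] = MB(64);

   return n;
}
```

This yields 11 power-of-two buckets, the 6MB bucket, and 13 buckets from
8MB to 64MB, which adds up to the 25 buckets mentioned above.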
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Fixes: 0b6693a3a1 ("iris: Align fresh BO allocations to 2MB in size")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10219
(cherry picked from commit d85d6ad2a5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
It's allowed to place OpExecuteCallableKHR in a SPIR-V, even if the RT
pipeline doesn't contain any callable shaders. Unreal hits this case and
crashes. We can assume the intrinsic never gets executed, so we can
simply remove it.
Cc: mesa-stable
(cherry picked from commit 0c02a7e8e8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
The current assertion fails as soon as a MAD with src0 and src2 being
immediate is detected.
The assertion was supposed to catch, "If it's ADD3, only one of src0
and src2 can be immediate." To detect this, the opcode test should have
been !=.
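
The intended condition can be sketched as an implication written with
!=. The enum and helper names are hypothetical stand-ins for the brw
types:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of 3-source opcodes. */
enum opcode3 { OP_MAD, OP_ADD3 };

/* "If it's ADD3, only one of src0 and src2 can be immediate." */
static bool
imm_sources_valid(enum opcode3 op, bool src0_imm, bool src2_imm)
{
   return op != OP_ADD3 || !(src0_imm && src2_imm);
}

static void
check_three_src(enum opcode3 op, bool src0_imm, bool src2_imm)
{
   assert(imm_sources_valid(op, src0_imm, src2_imm));
   (void)op;
   (void)src0_imm;
   (void)src2_imm;
}
```

With == in place of !=, the left-hand side is false for MAD, so a MAD
with both sources immediate trips the check even though it is allowed.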
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: c1c09e3c4a ("brw/emit: Add correct 3-source instruction assertions for each platform")
(cherry picked from commit c52ce6157f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
Some callers of brw_constant_fold_instruction depend on the result being
a MOV of immediate when progress is made. Previously `MUL dst:D src0:D
1:D` would be converted to `MOV dst:D src0:D`. There was also no
handling for `MUL dst:D imm0:D imm1:D`.
This could cause problems if one of the immediate values was -1. The
existing code would convert this to a `MOV dst:D imm0:D` and set the
negate flag on src0. That is not correct.
v2: Fix the is_negative_one case handling of the non-negative-one
source. Add a comment explaining the assertion. Both suggested by Caio.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 2cc1575a31 ("brw/algebraic: Refactor constant folding out of brw_fs_opt_algebraic")
(cherry picked from commit 25de9dcd76)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
Some callers of brw_constant_fold_instruction depend on the result being
a MOV of immediate when progress is made. Previously `ADD dst:D src0:D
0:D` would be converted to `MOV dst:D src0:D`. There was also no
handling for `ADD dst:D imm0:D imm1:D`.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 2cc1575a31 ("brw/algebraic: Refactor constant folding out of brw_fs_opt_algebraic")
(cherry picked from commit 086e83ccd9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
In this case, num_sources is bigger than this->sources, so if we loop
up to num_sources (instead of this->sources) we'll end up reading past
the end of old_src[]. Only copy up to what we originally had.
This was found by code inspection, I'm not aware of any applications
failing due to the lack of this patch.
Fixes: d9e737212d ("intel/brw: Add a src array for the common case in fs_inst")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit d4a54d4f92)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
Rather than checking hwconfig items when using them, wait until after
devinfo has been fully initialized. This includes having workarounds
implemented.
We can then check if the hwconfig data and final Mesa initialization
agree. If the match fails, we need to investigate if Mesa or the
hwconfig data is wrong.
This code becomes a no-op when not on a release build.
Fixes: a4c5bfd34c ("intel/dev: Use hwconfig for urb min/max entry values")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12141
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 1027b071f9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
Unscissored glClearColor is using i915_fill_blit().
Clearing can be done with the 1 byte formats
GL_ALPHA, GL_LUMINANCE or GL_INTENSITY.
Routine i915_fill_blit() is called with a rgba-mask containing
1 byte, but it is handling this as a 2-byte color.
This fix adds the needed 1 byte setup to both
i915_fill_blit() and i915_copy_blit().
It fixes 1 piglit test for arb_clear_texture-base-formats
and 15 tests for fbo-clear-formats.
No regressions were seen in other piglit tests.
Cc: mesa-stable
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
(cherry picked from commit bed66430ab)
Conflicts:
src/gallium/drivers/i915/ci/i915-g33-fails.txt
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
VRS rates should only be preserved for clears, otherwise the HTILE
buffer should be cleared completely.
This fixes some failures/flakes in CI.
Fixes: 8197d744f5 ("radv: Do not overwrite VRS rates when doing fast clears")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8b755840fc)
Conflicts:
src/amd/vulkan/meta/radv_meta_clear.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
The override used for the immed encoding in #cat3-src-const-or-immed
used a pattern which isn't supported in overrides by isaspec. The
pattern in the base bitset (10) was too strict for immediates since it
didn't allow the most significant bit to be 1.
Fix this by making the base pattern 1 and adding an assert for the next
bit to be 0 in the non-immed case.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 1c6c200c0d ("ir3: add newly found shlg.b16 instruction")
(cherry picked from commit 943f666b69)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
When a timestamped present is not used (MAILBOX or the very first present),
it's possible that the very last queued present ID won't complete in finite time.
Similar to frame callback based workaround, apply a timeout to present
waits when they target the very last submitted presentID.
Only apply the workaround when we're not guaranteed forward progress.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Derek Foreman <derek.foreman@collabora.com>
(cherry picked from commit c3becade15)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
When commit-timing was not supported, but FIFO was, we would end
up in a situation with throttling on FIFO barrier and legacy fence.
At that point, the entire point of FIFO falls flat.
There are some caveats with this approach, but it's not expected
that compositors will only support FIFO, and not commit-timing long
term.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Derek Foreman <derek.foreman@collabora.com>
(cherry picked from commit 458842c3b5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
The last argument seems to be used as brw_shader_reloc::delta (from
brw_add_reloc), and we're unconditionally setting it to 0 here, while
the other place where we handle nir_intrinsic_load_reloc_const_intel
seems to be setting the base appropriately.
I found this by inspection while debugging a bug related to this code,
so I'm not aware of any workloads that get improved by this patch.
Related patches:
- ecbec25e84 ("intel/nir: add reloc delta to load_reloc_const_intel intrinsic")
- 99047451c9 ("intel/fs: add plumbing for embedded samplers")
Fixes: ecbec25e84 ("intel/nir: add reloc delta to load_reloc_const_intel intrinsic")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 0dc2a5808e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
DRM_XE_TOPO_EU_PER_DSS and DRM_XE_TOPO_SIMD16_EU_PER_DSS can be any
number of bytes long, but the code assumed they were always 4 bytes long.
That was not an issue because Xe KMD returns 4 bytes even if it only needs
1 or 2 bytes, but it is a problem with our HW simulator, which was
returning 2 bytes.
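
A minimal sketch of handling a mask of arbitrary byte length, under the
assumption that the EU count is derived from the number of set bits in
the mask. count_mask_bits() is a hypothetical helper, not the code in
this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count set bits in a mask of num_bytes bytes,
 * instead of assuming the mask is always 4 bytes long. */
static unsigned
count_mask_bits(const uint8_t *mask, size_t num_bytes)
{
   unsigned bits = 0;

   assert(mask != NULL || num_bytes == 0);

   for (size_t i = 0; i < num_bytes; i++) {
      /* Clear the lowest set bit until the byte is empty. */
      for (unsigned b = mask[i]; b != 0; b &= b - 1)
         bits++;
   }

   return bits;
}
```

Reading a fixed 4 bytes from a 2-byte reply would pick up whatever
follows the mask in memory, which is why the length has to come from the
query result.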
Fixes: a24d93aa89 ("intel/dev: Query and compute hardware topology for Xe")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 04bdbeec31)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
We create NIR shaders here, and we need to free them when we're done with
them as well.
These shaders are created using nir_builder_init_simple_shader(), which
allocates using a NULL ralloc-parent, so ralloc_free should be the right
function to free them with.
Fixes: 514c10344e ("vulkan/meta: Add a concept of rect pipelines")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 43738a9a94)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32730>
This was just broken because individual shaders were still stored
on-disk in many situations:
- for shader object, all compute/graphics shaders were stored
- for fast-GPL, graphics shaders were stored
- for pipeline binaries, when the create flag was used
- for rt capture/replay and ray history
This should stop storing unused binaries on-disk and save space.
Found this by inspection.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32476>
We never split live ranges, so we don't need to store the location of
each live value when recording live outs, but the physreg assigned to a
register will still be clobbered when we reload it so we have to record
the original physreg and then make sure to use it when reloading the
live out.
We probably never encountered a case where we needed to reload live outs
in a loop before, but after enabling clustered subgroup reductions
dEQP-VK.subgroups.clustered.compute.subgroupclusteredmin_{i,u}64vec4_requiredsubgroupsize
hits this case and fails in RA validation without this fix.
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32468>
Commit 361f362258 ("dri: Unify createImage and createImageWithModifiers")
has introduced new behavior for drivers which don't support explicit
format modifiers. Before this commit, INVALID was not special-cased
and any call to dri_create_image() with one or more modifiers returned
NULL. After this commit, INVALID gained a special meaning: it indicates
that the implicit modifier is accepted by the caller. This is surprising
and is an API break.
This causes further API breaks: for instance, before this commit a BO
created via gbm_bo_create_with_modifiers() was guaranteed to always
return a non-INVALID modifier in gbm_bo_get_modifier().
This is inconsistent with gbm_dri_surface_create(): that function
treats INVALID as a bad entry in the modifier list, and fails if
it's the only acceptable modifier.
Additionally, drivers don't special-case INVALID and just ignore it
if they see it in a modifier list. This causes more inconsistencies.
For instance, let's say that a library user passes the modifier list
{ INVALID, FOO } to GBM. If a driver supports explicit modifiers and
doesn't support FOO for scanout, it'll return NULL. If a driver
doesn't support explicit modifiers, the current logic would return
a non-NULL BO with an INVALID modifier. This discrepancy makes it
harder to reason about the system: half of the API ignores INVALID,
while the other half assumes INVALID indicates an implicit modifier.
To fix these issues, revert to the behavior before the commit, and
require use of the dedicated API without supplying any modifier for
implicit modifiers.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 361f362258 ("dri: Unify createImage and createImageWithModifiers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32396>
(cherry picked from commit 105fcb9cfd)
This reverts commit c49a71c03c.
It broke radeonsi.
GBM can't set __DRI_IMAGE_USE_BACKBUFFER if gbm itself doesn't use it as
a back buffer by rendering to it and calling SwapBuffers. If another
library uses it as a back buffer, that library should set
__DRI_IMAGE_USE_BACKBUFFER, not GBM. A different flag could be added
to indicate the behavior that the original commit expected.
Fixes: c49a71c03c - gbm: mark surface buffers as explicit flushed
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11996
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32345>
(cherry picked from commit 1a7c54b840)
This allows `gbm_bo_get_offset()` to return the correct offset for e.g.
the second plane of a resource with the NV12 format. Crucially this
fixes direct scanout / hardware plane usage in Mutter and possibly other
clients.
While at it also add support for stride, modifier and n_planes queries.
The latter two should not change in behavior and just save a few CPU
cycles. The stride query support in theory fixes queries for multi-plane
formats, however in practice most/all currently used formats such as NV12,
P010 and YUV420 use the same stride for all planes.
Cc: mesa-stable
Acked-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32282>
(cherry picked from commit 379de4cdce)
After the breaking commit, gbm_bo_create_with_modifiers({LINEAR}) returns
a BO with gbm_bo_get_modifier() = INVALID. This restores the functionality
and fixes, most notably, hardware cursors for cards without modifiers.
Fixes #12039.
Fixes: 361f362258 ("dri: Unify createImage and createImageWithModifiers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31725>
(cherry picked from commit 7d1a32fafd)
In certain cases, the hardware fails to properly process a mipmap level
of these special stencil and depth formats. This happens at width=16.
This change adds a software workaround.
Modifying the corresponding mipmap nblk_x, and the other related
values, could make the tests below work. However, this method
generates regressions.
This change was tested on palm and cayman and fixes the following tests:
spec/arb_framebuffer_object/framebuffer-blit-levels read stencil: fail pass
spec/arb_depth_buffer_float/fbo-clear-formats stencil/gl_depth32f_stencil8: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31957>
(cherry picked from commit ac78692be4)
This situation is happening, for instance, when the hardware is
using the type FMT_8_8_8_8 (4 bytes) while the software was
requesting a 3 bytes type. The width should be adjusted to the
expected hardware size; otherwise, the last vertex is lost.
Note: The rv770 didn't behave like this. This is definitely
a hardware change between these gpus.
This change was tested on palm and cayman. Here are the tests fixed:
spec/!opengl 2.0/gl-2.0-vertexattribpointer-size-3: fail pass
deqp-gles2/functional/draw/random/62: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_byte3_vec4_dynamic_draw_quads_1: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_short3_vec4_dynamic_draw_quads_1: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_short3_vec4_dynamic_draw_quads_256: fail pass
deqp-gles3/functional/draw/random/117: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/byte/buffer_stride32_components3_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/short/buffer_stride32_components3_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/short/buffer_stride32_components3_quads256: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32184>
(cherry picked from commit 81889f4d5c)
This change fixes the evergreen nonconformity issue on non-mipmap
textures when the minification and the magnification are not in
the same state.
This modification disables 5278436d67 when the minification and
the magnification are different. This fixes the nonconformity
without new regressions. Anyway, I was unable to reproduce
the issue described by 5278436d67 on palm and cayman.
This change was tested on cayman and palm. It fixes 84 deqp-gles2
tests and 128 deqp-gles3 tests:
deqp-gles2/functional/texture/filtering/2d/linear_nearest_*
deqp-gles2/functional/texture/filtering/2d/nearest_linear_*
deqp-gles2/functional/texture/filtering/cube/linear_nearest_*
deqp-gles2/functional/texture/filtering/cube/nearest_linear_*
deqp-gles2/functional/texture/vertex/2d/filtering/linear_nearest_*
deqp-gles2/functional/texture/vertex/2d/filtering/nearest_linear_*
deqp-gles2/functional/texture/vertex/cube/filtering/linear_nearest_*
deqp-gles2/functional/texture/vertex/cube/filtering/nearest_linear_*
deqp-gles3/functional/texture/filtering/2d/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/2d/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/2d_array/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/2d_array/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/3d/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/3d/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/cube/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/cube/combinations/nearest_linear_*
deqp-gles3/functional/texture/vertex/2d/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/2d/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/2d_array/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/2d_array/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/3d/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/3d/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/cube/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/cube/filtering/nearest_linear_*
Fixes: 5278436d67 ("r600: force LOD range to be only one value when mip.min filter is NONE")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32185>
(cherry picked from commit 4d24995adb)
These fields are only valid in certain formats, so set them accordingly.
Note the check if !is_send is used because FORMAT_BASIC is reused for
SEND/SENDS in some platforms. If we start to see more cases like that,
we can create a new FORMAT for it.
The cond_modifier is trickier because on top of that, it is not valid
for 64-bit immediates in some platforms. Found when EU validation
complained about moving 64-bit immediates with higher bits.
Fixes: e4440df2d8 ("intel/brw: Add pred/cmod/sat to brw_hw_decoded_inst")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32287>
(cherry picked from commit 1d704af515)
When we create a new swapchain to replace the one currently presenting on
a surface, we need to reset all these timing variables. Otherwise we can
lose track of corrections that were made for the old swapchain when we
delete undelivered presentation feedback results.
Also, we use these variables when queuing a presentation, but we also use
them in the dispatch code that can be called by WaitForPresent from another
thread. We need to protect these variables against concurrent usage.
This is all much easier to do when they're stored as part of the swapchain
instead of the surface, so just move them there and adjust the locking.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32121>
(cherry picked from commit 2e49448a43)
When we start using timestamps, the current code will generate an event
stream like:
feedback
set barrier
wait barrier
commit
feedback
set timestamp
set barrier
commit
wait barrier
commit
The second content update can cause the feedback request from the first to
send a discarded event if the timestamp is in the past.
Be less clever and just put waits in both our content updates.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32121>
(cherry picked from commit b9c8afae33)
When occluded, the current math always rounds down to 0 cycles and leads
to improperly throttled frame delivery.
Improve the comment, and use a formula that leads to generating future
times even when occluded.
Also remove some dead code, as we can never get here with a period of 0.
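A minimal sketch of the rounding problem, not the actual WSI code. The
names cycles_rounded_down(), next_target_ns(), base_ns, now_ns and
period_ns are hypothetical. It assumes the issue is a truncating integer
division: when less than one full period has elapsed the quotient is 0,
so the computed target is not in the future. Adding one cycle to the
quotient always yields a time strictly after now_ns.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating division: yields 0 cycles whenever elapsed_ns < period_ns. */
uint64_t cycles_rounded_down(uint64_t elapsed_ns, uint64_t period_ns)
{
   assert(period_ns > 0);
   return elapsed_ns / period_ns;
}

/* Hypothetical helper: first multiple of period_ns after base_ns that is
 * strictly later than now_ns. Assumes now_ns >= base_ns and period_ns > 0. */
uint64_t next_target_ns(uint64_t base_ns, uint64_t now_ns, uint64_t period_ns)
{
   assert(period_ns > 0 && now_ns >= base_ns);
   uint64_t cycles = (now_ns - base_ns) / period_ns + 1;
   return base_ns + cycles * period_ns;
}
```

With a 16 ns period and a base of 100, a "now" of 105 gives a target of
116 instead of staying at 100.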
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32121>
(cherry picked from commit ed2bb692f7)
We assume everywhere that RGB is not planar, so sampling
and color space conversions will not work correctly with RGBP.
Drivers can still support RGBP, but processing entrypoint with
shaders doesn't support it.
Fixes: bdb7f36aa8 ("frontends/va: add support for RGBP rt_format")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32252>
(cherry picked from commit 6c83f3c3bb)
Currently if you try to probe the virtio ICD on a non-virtio system
it will fail in CreateInstance which causes the loader to spit on the
screen.
However, instance creation shouldn't fail; the driver should just
not enumerate any devices in this case. It's a bit tricky to ensure
this, but return the instance, then handle instance destruction
and fail device enumeration.
Cc: mesa-stable
Reviewed-by: Ryan Neph
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32266>
(cherry picked from commit 25b8f4f714)
Instead of using unreliable polling to wait for foz db updater to parse
and load from the dynamic list, also use inotify to wait for foz db
updater to close the list file after it's done updating.
Fixes: 4dfd306454 ("disk_cache: Disable the "List" test for RO disk cache.")
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32237>
(cherry picked from commit cbb3bb5c7b)
For a TU_PIPELINE_GRAPHICS_LIB we were taking a ref to the descriptor
set layout but never releasing on VK_PIPELINE_COMPILE_REQUIRED.
Since VK_PIPELINE_COMPILE_REQUIRED is technically an error, the user
doesn't call vkDestroyPipeline() for it so the descriptor sets
referenced were never getting freed.
Addresses:
```
Direct leak of 304 byte(s) in 1 object(s) allocated from:
#0 0x7fa5a93ee0 in __interceptor_malloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7fa44bac84 in vk_default_alloc ../src/vulkan/util/vk_alloc.c:26
#2 0x7fa32ea5d8 in vk_alloc ../src/vulkan/util/vk_alloc.h:48
#3 0x7fa32ea60c in vk_zalloc ../src/vulkan/util/vk_alloc.h:56
#4 0x7fa32ea750 in vk_descriptor_set_layout_zalloc
../src/vulkan/runtime/vk_descriptor_set_layout.c:49
#5 0x7fa306fc98 in tu_CreateDescriptorSetLayout(VkDevice_T*,
VkDescriptorSetLayoutCreateInfo const*, VkAllocationCallbacks
const*, VkDescriptorSetLayout_T**)
../src/freedreno/vulkan/tu_descriptor_set.cc:161
```
and
```
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f9b4b3ee0 in __interceptor_malloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7f9925e900 in ralloc_size ../src/util/ralloc.c:118
#2 0x7f9925e8d4 in ralloc_context ../src/util/ralloc.c:105
#3 0x7f98b4b214 in tu_pipeline_builder_build<(chip)7>
../src/freedreno/vulkan/tu_pipeline.cc:3898
#4 0x7f98b46bd8 in tu_graphics_pipeline_create<(chip)7>
../src/freedreno/vulkan/tu_pipeline.cc:4203
#5 0x7f98b22588 in VkResult
tu_CreateGraphicsPipelines<(chip)7>(VkDevice_T*,
VkPipelineCache_T*, unsigned int, VkGraphicsPipelineCreateInfo const*,
VkAllocationCallbacks const*, VkPipeline_T**)
../src/freedreno/vulkan/tu_pipeline.cc:4234
```
seen in:
dEQP-VK.pipeline.pipeline_library.shader_module_identifier.pipeline_from_id.graphics.4_variants.no_spec_constants.no_pipeline_cache.all_zeros_id.no_exec_properties.vert_tesc_tese_frag
Cc: mesa-stable
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32188>
(cherry picked from commit 21baf2f6c1)
Also ensure that 'needs_scratch_reg' is always true if SCC might be overwritten.
Few changes, because some p_split_vector get SCC as scratch reg assigned,
and thus, can inhibit some postRA optimizations.
Totals from 3 (0.00% of 79395) affected shaders: (Navi31)
Instrs: 10501 -> 10500 (-0.01%); split: -0.02%, +0.01%
CodeSize: 51580 -> 51520 (-0.12%); split: -0.12%, +0.01%
Latency: 84166 -> 84174 (+0.01%)
InvThroughput: 13109 -> 13111 (+0.02%)
SALU: 859 -> 860 (+0.12%)
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
(cherry picked from commit a04e096339)
We updated maxImageDimension2D etc to report the actual max size, but we
forgot to update GetPhysicalDeviceImageFormatProperties in the same way.
Let's do that to make things consistent.
This fixes the following CTS test-case:
dEQP-VK.wsi.wayland.swapchain.create.image_extent
Fixes: d5ed77800e ("panvk: Fix GetPhysicalDeviceProperties2() to report accurate info")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32221>
(cherry picked from commit 9c1de5c6b3)
On modern Mali GPUs, we can have 16 bits for the X and Y sizes, which
already overflows the 32-bit barrier even with a single layer of
byte-sized formats.
So let's make sure we have enough bits to avoid overflows here.
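The arithmetic can be sketched as follows. This is an illustration, not
the panvk code: image_size_u32() and image_size_u64() are hypothetical
helpers, and the example assumes a maximum dimension of 65536 (2^16) in
both X and Y.

```c
#include <assert.h>
#include <stdint.h>

/* Size computed entirely in 32 bits: wraps around once the product
 * reaches 2^32. */
uint32_t image_size_u32(uint32_t width, uint32_t height,
                        uint32_t layers, uint32_t bytes_per_pixel)
{
   assert(bytes_per_pixel > 0);
   return width * height * layers * bytes_per_pixel;
}

/* Same computation widened to 64 bits before multiplying. */
uint64_t image_size_u64(uint32_t width, uint32_t height,
                        uint32_t layers, uint32_t bytes_per_pixel)
{
   assert(bytes_per_pixel > 0);
   return (uint64_t)width * height * layers * bytes_per_pixel;
}
```

For a 65536x65536 single-layer image with a 1-byte format, the 32-bit
version wraps to 0 while the 64-bit version returns 2^32.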
Fixes: d5ed77800e ("panvk: Fix GetPhysicalDeviceProperties2() to report accurate info")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32221>
(cherry picked from commit 00b25ec769)
If we have NIR such as:
32x4 %48 = @load_vulkan_descriptor (%47) (desc_type=SSBO)
32x4 %76 = deref_cast (tint_symbol_11 *)%48 (ssbo tint_symbol_11) (ptr_stride=0, align_mul=4, align_offset=0)
32x4 %77 = deref_struct &%76->tint_symbol_10 (ssbo int) // &((tint_symbol_11 *)%48)->tint_symbol_10
A single nir_rematerialize_deref_in_use_blocks() will rematerialize the
deref_struct and then its deref_cast. However,
nir_foreach_instr_reverse_safe is not safe if the next iteration's
instruction is removed. This can result in the instruction loop exiting
and the load_vulkan_descriptor never having an LCSSA phi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 439e8c42cc ("nir/lcssa: Fix rematerializing derefs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11770
(cherry picked from commit 65a54b4ec4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
If an application doesn't send any attributes to describe the format,
we would always use driver preferred format (NV12). This is wrong
for any RT format other than the driver preferred (YUV420).
Driver doesn't have a choice here, we must use the matching format.
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
(cherry picked from commit c8a893becd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
The code wants the number of components used by the variable in the
current attribute slot, not the total number of components.
For e.g. a 4x3 matrix, glsl_get_components() returns 12, leading to the
following error reported by AddressSanitizer:
```
Test case 'dEQP-VK.tessellation.shader_input_output.cross_invocation_per_patch_mat4x3'..
../src/compiler/nir/nir_lower_io_to_vector.c:265:16: runtime error: index 4 out of bounds for type 'nir_variable *[4]'
```
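A simplified model of the difference, not the NIR pass itself. The
helpers total_components(), components_in_slot() and fits_slot_array()
are hypothetical, and the model assumes each matrix column occupies one
attribute slot of at most 4 components (no 64-bit types).

```c
#include <assert.h>

#define SLOT_COMPONENTS 4 /* size of the per-slot array being indexed */

/* Total number of components of a columns x rows matrix. */
unsigned total_components(unsigned columns, unsigned rows)
{
   assert(rows >= 1 && rows <= 4);
   return columns * rows;
}

/* Components used in a single slot: one column of the matrix. */
unsigned components_in_slot(unsigned rows)
{
   assert(rows >= 1 && rows <= 4);
   return rows;
}

/* True if [first, first + count) stays inside the per-slot array. */
int fits_slot_array(unsigned first_component, unsigned count)
{
   return first_component + count <= SLOT_COMPONENTS;
}
```

For a mat4x3, the total is 12, which overruns a 4-entry array, while the
per-slot count is 3 and fits.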
Fixes: 5ef2b8f1f2 ("nir: Add a pass for lowering IO back to vector when possible")
(cherry picked from commit ba5c65f10b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
The game aliases two images. It binds a memory object to two different
images, the first one being an image with 4 mips and the second with
only one mip but the bind offset is incorrect. It's like it queried
the first image size with different usage flags, so that DCC was
disabled.
Force disabling DCC for mips fixes the incorrect rendering and doesn't
hurt performance.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10200
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2f13723c0a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
In two functions implementing resource discard, rebind_resource is called
on the resource before its track record is reset. This prevents the update
of dirty_resource or dirty_shader_resource because of conditions in
needs_dirty_resource. With rsc->track reset and the dirty_resource bits
missing, further calls to transfer_map will not try to reallocate
resource storage when needed.
A way to reproduce the issue in both functions is by executing at least
3 draws modifying bound texture or VBO each time. This patch fixes those
cases and some related piglit tests on a5xx and should fix it on other
GPUs. Also it fixes rendering in Firefox and vsraytrace (except vertical
line at right edge).
Fixes: 0a62a874fc ("freedreno: Re-work dirty-resource tracking")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10374
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 6d14cad330)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
Many debug flags influence shader codegen but are currently not included
in the hash key. This causes surprising effects as cache lookups may
return shaders compiled with different debug flags than currently in
effect. This patch fixes this by including all debug flags in the
shader hash key.
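The idea can be sketched as below. This is not the ir3 cache code: the
key fields (stage, robustness, debug_flags) and the use of FNV-1a are
assumptions chosen for illustration. The point is only that every input
that affects codegen is fed into the hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a byte range into a running 64-bit FNV-1a hash. */
uint64_t hash_bytes(uint64_t h, const void *data, size_t size)
{
   assert(data != NULL || size == 0);
   const unsigned char *p = data;
   for (size_t i = 0; i < size; i++) {
      h ^= p[i];
      h *= 0x100000001b3ull; /* FNV-1a 64-bit prime */
   }
   return h;
}

/* Hypothetical shader cache key: debug flags are hashed along with the
 * other inputs, so a change in flags yields a different key. */
uint64_t shader_cache_hash(uint32_t stage, uint32_t robustness,
                           uint64_t debug_flags)
{
   uint64_t h = 0xcbf29ce484222325ull; /* FNV-1a 64-bit offset basis */
   h = hash_bytes(h, &stage, sizeof(stage));
   h = hash_bytes(h, &robustness, sizeof(robustness));
   h = hash_bytes(h, &debug_flags, sizeof(debug_flags));
   return h;
}
```

Two lookups that differ only in debug_flags now produce different keys,
so a shader compiled under other flags is not returned from the cache.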
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: c323848b0b ("ir3, tu: Plumb through support for per-shader robustness")
(cherry picked from commit d8c90806e4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
This fixes a corner case of the LNL sub-dword integer restrictions
that wasn't being detected by has_subdword_integer_region_restriction(),
specifically:
> if(Src.Type==Byte && Dst.Type==Byte && Dst.Stride==1 && W!=2) {
> // ...
> if(Src.Stride == 2) && (Src.UniformStride) && (Dst.SubReg%32 == Src.SubReg/2 ) { Allowed }
> // ...
> }
All the other restrictions that require agreement between the SubReg
number of source and destination only affect sources with a stride
greater than a dword, which is why
has_subdword_integer_region_restriction() was returning false except
when "byte_stride(srcs[i]) >= 4" evaluated to true, but as implied by
the pseudocode above, in the particular case of a packed byte
destination, the restriction applies for source strides as narrow as
2B.
The form of the equation that relates the subreg numbers is consistent
with the existing calculations in brw_fs_lower_regioning (see
required_src_byte_offset()), we just need to enable lowering for this
corner case, and change lower_dst_region() to call lower_instruction()
recursively, since some of the cases where we break this restriction
are copy instructions introduced by brw_fs_lower_regioning() itself
trying to lower other instructions with byte destinations.
This fixes some Vulkan CTS test-cases that were hitting these
restrictions with byte data types.
Fixes: 217d412360 ("intel/fs/gfx20+: Implement sub-dword integer regioning restrictions.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
We were accidentally doing a signed integer comparison here for ult32,
or a sign-extending shift for ushr.
One notable bit of fallout was that load_global_uniform_block_intel
address calculations broke on platforms that don't have native 64-bit
integer support, as the iadd64 lowering for "do I need to carry?" was
using ult32...and performing the wrong comparison. We spotted this in
Borderlands 3 on Alchemist once we turned on other optimizations.
Thanks to Lionel Landwerlin for helping spot the problem!
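To illustrate why the comparison must be unsigned, here is a sketch of a
64-bit add lowered to 32-bit halves. It is plain C, not the brw code;
ult32(), ilt32_signed(), ushr32() and iadd64_lowered() are hypothetical
names.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned less-than, as ult32 is meant to behave. */
int ult32(uint32_t a, uint32_t b)
{
   return a < b;
}

/* The buggy variant: same bits compared as signed integers. */
int ilt32_signed(uint32_t a, uint32_t b)
{
   return (int32_t)a < (int32_t)b;
}

/* Logical right shift, as ushr is meant to behave (no sign extension). */
uint32_t ushr32(uint32_t value, unsigned shift)
{
   assert(shift < 32);
   return value >> shift;
}

/* 64-bit add built from 32-bit halves; the carry test relies on an
 * unsigned comparison of the low result against one low operand. */
uint64_t iadd64_lowered(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
   uint32_t lo = a_lo + b_lo;
   uint32_t carry = (uint32_t)ult32(lo, a_lo);
   uint32_t hi = a_hi + b_hi + carry;
   return ((uint64_t)hi << 32) | lo;
}
```

Adding 0x80000000 to itself gives a low half of 0. The unsigned test
"0 < 0x80000000" is true and produces the carry, while the signed test
sees 0x80000000 as negative and drops it.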
Fixes: c7b312ad45 ("brw: factor out source extraction for rematerialization")
Fixes: 339630ab05 ("brw: enable A64 loads source rematerialization")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5848035443)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
The BROADCOM_SAND128 modifier is usually used with an extra parameter
to pass in the stride via a side channel. Quoting from drm_fourcc.h:
> The pitch between the start of each column is set to optimally
> switch between SDRAM banks. This is passed as the number of lines
> of column width in the modifier (we can't use the stride value due
> to various core checks that look at it , so you should set the
> stride to width*cpp).
So apparently this is just a workaround for limitations in some kernel
APIs. DRM modifiers, however, are arguably a bad fit for extra
parameters that aren't known in advance. In the Wayland/KMS ecosystem
many components depend on being able to treat modifiers as opaque, e.g.
for negotiations etc. In practice the current approach requires various
software components to manually use the
`DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT()` macro - using the
`DRM_FORMAT_MOD_BROADCOM_SAND128` modifier directly with formats like
`NV12` results in a rejection in the KMS driver and corrupted output
in Mesa (because we'd bail out early in `v3d_sand8_blit()`).
Fortunately the stride check limitations mentioned above don't seem to
apply to Mesa though. Thus we can just add support for the base modifier
and stride (coming from V4L2), allowing various toolkits, Wayland
compositors and V4L2 decoder implementations to support e.g.
`NV12` + `DRM_FORMAT_MOD_BROADCOM_SAND128` (`NC12` in V4L2) in a generic
way.
Notes:
1. Wayland compositors trying to offload composition to KMS will still
fail when doing a test commit.
2. There is another limitation - in the V4L2 MPLANE API - that
requires userspace to know the correct offset of the second plane. That's
a known API limitation though and only affects V4L2 decoder implementations.
Cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
(cherry picked from commit 758941ab0c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
Since the shader parameters are passed as inline data, push constants
are no longer used and so, not actually set on dispatch. But the
nr_params = 4 was still making the shader emit the code to load them,
causing page faults on simulation, and would also on HW if we didn't
always have a scratch page set.
The uses_inline_data parameter will be set from brw_compile_cs(), called
shortly after this point, so we don't need it here.
The subgroup_size is misleading, as we don't actually require that size
and the code that checks for it isn't even running for this shader.
Fixes: 97b17aa0b1 ("brw/nir: rework inline_data_intel to work with compute")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12152
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d32a26b3e6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
As per the code comment added in this commit, the nir produced from
glsl to nir doesn't always keep function declarations before the
code that calls them, e.g. calls from within other function
implementations. The change in this commit works around this problem by
first cloning all function declarations in a first pass, then cloning
the implementations in a second pass once we have filled the remap
table.
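The two-pass scheme can be sketched on a toy model. This is not the
linker code: functions are reduced to indices, callee[i] is the function
called by function i (or NO_CALL), and clone_single_pass() and
clone_two_pass() are hypothetical helpers that return the number of
calls left unresolved.

```c
#include <assert.h>

#define NO_CALL (-1)
#define MAX_FUNCS 16

/* Clone declaration and body together: a call to a function that appears
 * later has no remap entry yet and stays unresolved. */
int clone_single_pass(const int *callee, int n, int first_new_id,
                      int *out_callee)
{
   assert(n >= 0 && n <= MAX_FUNCS);
   int remap[MAX_FUNCS];
   int unresolved = 0;
   for (int i = 0; i < n; i++)
      remap[i] = NO_CALL;
   for (int i = 0; i < n; i++) {
      remap[i] = first_new_id + i;
      out_callee[i] = callee[i] == NO_CALL ? NO_CALL : remap[callee[i]];
      if (callee[i] != NO_CALL && out_callee[i] == NO_CALL)
         unresolved++;
   }
   return unresolved;
}

/* Pass 1 fills the remap table for every declaration, pass 2 clones the
 * bodies and resolves each call through the now complete table. */
int clone_two_pass(const int *callee, int n, int first_new_id,
                   int *out_callee)
{
   assert(n >= 0 && n <= MAX_FUNCS);
   int remap[MAX_FUNCS];
   int unresolved = 0;
   for (int i = 0; i < n; i++)
      remap[i] = first_new_id + i;
   for (int i = 0; i < n; i++) {
      out_callee[i] = callee[i] == NO_CALL ? NO_CALL : remap[callee[i]];
      if (callee[i] != NO_CALL && out_callee[i] == NO_CALL)
         unresolved++;
   }
   return unresolved;
}
```

With function 0 calling function 1, the single-pass version leaves one
call unresolved while the two-pass version resolves it.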
Fixes: cbfc225e2b ("glsl: switch to a full nir based linker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12115
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 59b2549279)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
KernelCI jobs have priority 44 and are very long-running jobs (and
there might be an issue with the KernelCI that makes it create hundreds
of jobs, @sergi is looking into that).
While bumping to 45+ would be enough to allow Mesa release staging
pipelines to run despite the KernelCI, during the CI meeting with @sergi
and @mupuf it was determined that the Mesa releases are an important
enough operation to warrant being a higher priority than user forks
pipelines, so priority 70 was picked (still under the 75 of Marge
pipelines).
Cc: mesa-stable
(cherry picked from commit 50f9bec3ce)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Because dyn_start and dyn_end are indices into
nvk_root_descriptor_table->dynamic_buffers, we would need to offset
cbuf->dynamic_idx by
nvk_root_descriptor_table->set_dynamic_buffer_start[cbuf->desc_set]
in order to do those comparisons correctly.
We could do that, but it's simpler and no less precise to simply
re-use the same comparison that we do in the other cases here.
This fixes a rendering artifact in Baldur's Gate 3 (Vulkan), which
regressed with the commit listed below.
Fixes: 091a945b57 ("nvk: Be much more conservative about rebinding cbufs")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit dc12c78235)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
The Early-Z optimization is disabled when there is a discard
instruction in the shader used in the draw call.
But if discard is the only reason to disable Early-Z, and at
draw call time the updates in the draw call are disabled we
can enable Early-Z using a shader variant.
If there are occlusion queries active we also need to disable
Early-Z optimization.
So this patch enables Early-Z in this scenario.
The performance improvement is significant when running gfxbench
benchmark showing an average improvement of 11.15%
fps_avg helped: gl_gfxbench_aztec_high.trace: 3.13 -> 3.73 (19.13%)
fps_avg helped: gl_gfxbench_aztec.trace: 4.82 -> 5.68 (17.88%)
fps_avg helped: gl_gfxbench_manhattan31.trace: 5.10 -> 6.00 (17.59%)
fps_avg helped: gl_gfxbench_manhattan.trace: 7.24 -> 8.36 (15.52%)
fps_avg helped: gl_gfxbench_trex.trace: 19.25 -> 20.17 ( 4.81%)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable
(cherry picked from commit 5b951bcdd7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
We can end up calling vk_multialloc_alloc with 0 size when
`attachment_count` is 0 and `clearValueCount` is 0.
Addressed:
```
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x7faf033ee0 in __interceptor_malloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7fada5cc10 in vk_default_alloc ../src/vulkan/util/vk_alloc.c:26
#2 0x7fac50b270 in vk_alloc ../src/vulkan/util/vk_alloc.h:48
#3 0x7fac555040 in vk_multialloc_alloc
../src/vulkan/util/vk_alloc.h:234
#4 0x7fac555040 in void
tu_CmdBeginRenderPass2<(chip)7>(VkCommandBuffer_T*,
VkRenderPassBeginInfo const*, VkSubpassBeginInfo const*)
../src/freedreno/vulkan/tu_cmd_buffer.cc:4634
#5 0x7fac900760 in vk_common_CmdBeginRenderPass
../src/vulkan/runtime/vk_render_pass.c:261
```
seen in:
dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_texel_buffer.no_fmt_qual.len_252.samples_1.1d.frag
Fixes: 4cfd021e3f ("turnip: Save the renderpass's clear values in the cmdbuf state.")
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
(cherry picked from commit c923eff742)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Non-trivial collects (i.e., ones that will introduce moves because the
sources don't line-up with the destination) may cause source intervals
to get implicitly moved when they are inserted as children of the
destination interval. Since we don't support moving intervals in shared
RA, this may cause illegal register allocations. Prevent this by
creating a new top-level interval for the destination so that the source
intervals will be left alone.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
(cherry picked from commit b36a7ce0f1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Otherwise anv_descriptor_set is accessed through an unaligned pointer,
which is undefined behavior in C.
```
anv_descriptor_set.c:1620:17: runtime error: member access within misaligned address 0x61900002c2b5
for type 'struct anv_descriptor_set', which requires 8 byte alignment 0x61900002c2b5
```
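One common way to avoid this kind of misalignment is to round offsets up
to the required alignment before placing a structure in a larger
allocation. The sketch below is generic C, not the anv change itself;
align_up() and is_aligned() are hypothetical helpers and the alignment
is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round offset up to the next multiple of alignment (a power of two). */
uint64_t align_up(uint64_t offset, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (offset + alignment - 1) & ~(alignment - 1);
}

/* True if addr is a multiple of alignment (a power of two). */
int is_aligned(uint64_t addr, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (addr & (alignment - 1)) == 0;
}
```

The address from the report above, 0x61900002c2b5, is not 8-byte
aligned; rounding it up gives 0x61900002c2b8.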
Fixes: 2570a58bcd ("anv: Implement descriptor pools")
(cherry picked from commit a2c4a34303)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Since $RESULTS_DIR is now centrally defined in setup-test-env.sh it's no
longer necessary to manually add a hard-coded results directory for the
b2c-test job results.
This keeps the results directory consistent between b2c-test jobs and lava.
Fixes: 9b6d14aed1 ("ci: Always create results dir from init")
(cherry picked from commit 276447ef81)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack
address calculation, what the HW actually uses is
NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform,
while the latter is fixed to 2048 for all current platforms.
Fixes: 6c84cbd8c9 ("intel/dev/xe: Set max_eus_per_subslice using topology query")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit aee04bf4fb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Copy propagation would incorrectly occur in this code
mov(16) v4+2.0:UW, u0<0>:UW NoMask
...
mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0
to create
mov(16) v4+2.0:UW, u0<0>:UW NoMask
...
mov(8) v6+2.0:UD, u0<0>:UD NoMask group0
This has different behavior. I think I just made a mistake when I
changed this condition in e3f502e007.
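To make the difference concrete, here is a small C model of the two sequences. It is only a sketch under assumptions: registers are modelled as plain arrays with a little-endian layout, `u0<0>` is treated as a broadcast of the first element of the uniform, and the function names are made up for illustration rather than taken from the compiler.

```c
#include <stdint.h>

/* mov(16) dst:UW, u0<0>:UW
 * Broadcast the low 16 bits of the uniform to 16 word channels. */
static void
mov16_uw_broadcast(uint16_t dst[16], uint32_t uniform)
{
   uint16_t w = (uint16_t)uniform;
   for (int i = 0; i < 16; i++)
      dst[i] = w;
}

/* mov(8) dst:UD, src:UD
 * Reinterpret the 16 words written above as 8 dwords
 * (little-endian pairing of adjacent words). */
static void
mov8_ud_from_words(uint32_t dst[8], const uint16_t src[16])
{
   for (int i = 0; i < 8; i++)
      dst[i] = (uint32_t)src[2 * i] | ((uint32_t)src[2 * i + 1] << 16);
}

/* mov(8) dst:UD, u0<0>:UD
 * What the bad propagation produced: the full 32-bit uniform broadcast. */
static void
mov8_ud_broadcast(uint32_t dst[8], uint32_t uniform)
{
   for (int i = 0; i < 8; i++)
      dst[i] = uniform;
}
```

In this model, a uniform of 0x12345678 gives 0x56785678 per channel for the original sequence and 0x12345678 for the propagated one. The two only agree when the high and low halves of the uniform happen to be equal.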
It seems like this condition could be relaxed to cover cases like (note
the change of destination stride):
    mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0
I'm not sure it's worth it.
No shader-db or fossil-db changes on any Intel platform. Even the code
for the test case mentioned in the original commit did not change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e3f502e007 ("intel/fs: Allow copy propagation between MOVs of mixed sizes")
Closes: #12116
(cherry picked from commit 80a5d158ae)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled
on this while trying some stuff with !31852.
v2: Don't be lazy. Add proper assertions for all the things on all the
platforms. Based on a suggestion by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 7bed11fbde ("intel/brw: Allow immediates in the BFE instruction on Gfx12+")
(cherry picked from commit c1c09e3c4a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
This is required, otherwise we regress latency in cases where
applications are using FIFO without explicit KHR_present_wait.
This is an unacceptable regression.
The fix is to normalize the behavior to X11 WSI.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: d052b0201e ("vulkan/wsi/wayland: Use fifo protocol for FIFO")
(cherry picked from commit 5f70858ece)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
echo "$LLVM_APT_REPO" | tee /etc/apt/sources.list.d/llvm.list
fi