This change is inspired from iris_destroy_context().
For instance, this issue is triggered with
"piglit/bin/glsl-1.50-gs-max-output -scan 1 20 -auto -fbo":
Direct leak of 320 byte(s) in 2 object(s) allocated from:
#0 0x7f34fc769987 in calloc (/usr/lib64/libasan.so.6+0xb1987)
#1 0x7f34f4fa168a in bo_calloc ../src/gallium/drivers/crocus/crocus_bufmgr.c:288
#2 0x7f34f4fa168a in alloc_fresh_bo ../src/gallium/drivers/crocus/crocus_bufmgr.c:350
#3 0x7f34f4fa168a in bo_alloc_internal ../src/gallium/drivers/crocus/crocus_bufmgr.c:419
#4 0x7f34f4fe50a9 in crocus_get_scratch_space ../src/gallium/drivers/crocus/crocus_program.c:2678
#5 0x7f34f55e8954 in crocus_upload_dirty_render_state ../src/gallium/drivers/crocus/crocus_state.c:6871
#6 0x7f34f55e8954 in crocus_upload_render_state ../src/gallium/drivers/crocus/crocus_state.c:7812
#7 0x7f34f5d9f680 in crocus_simple_draw_vbo ../src/gallium/drivers/crocus/crocus_draw.c:332
#8 0x7f34f5d9f680 in crocus_draw_vbo ../src/gallium/drivers/crocus/crocus_draw.c:438
#9 0x7f34f1d2eeba in tc_call_draw_single ../src/gallium/auxiliary/util/u_threaded_context.c:3735
#10 0x7f34f1d12e03 in batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:394
#11 0x7f34f1d12e03 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:445
#12 0x7f34f1d22c9a in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:680
#13 0x7f34f1d238f8 in tc_texture_map ../src/gallium/auxiliary/util/u_threaded_context.c:2754
#14 0x7f34f120b9d9 in pipe_texture_map_3d ../src/gallium/auxiliary/util/u_inlines.h:579
#15 0x7f34f120b9d9 in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:530
#16 0x7f34f10d7355 in read_pixels ../src/mesa/main/readpix.c:1178
#17 0x7f34f10d7355 in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1195
#18 0x7f34f10d7e10 in _mesa_ReadPixels ../src/mesa/main/readpix.c:1210
Fixes: f3630548f1 ("f3630548f1da crocus: initial gallium driver for Intel gfx 4-7")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23019>
(cherry picked from commit 6ee0bba3ae)
some resources may not be destroyed immediately and may instead be
queued for deletion onto the current batch state, so ensure that the
current state is the last one to be destroyed so that all deferred resources
are also destroyed
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23033>
(cherry picked from commit 58532057c5)
The helper was creating input locations for some builtin bariables.
This caused validation errors in zink because those builtins can't be
used as input.
Fixes: d0342e28b3 ("nir: Add helper to create passthrough GS shader")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22871>
(cherry picked from commit 83692bfe30)
With the shader cache enabled, intel_clc attempts to write to ~/.cache.
Many distributions' build systems limit file-system access, and will
kill the process thus causing the build to fail.
Fixes: 639665053f ("anv/grl: Build OpenCL kernels")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22968>
(cherry picked from commit 435a607909)
The 'struct drm_amdgpu_cs_chunk_fence' is processed as
'struct drm_amdgpu_cs_chunk_data' which is a union.
This change ensures the proper alignment for this structure
to be processed as 'struct drm_amdgpu_cs_chunk_data'.
The presence of __u64 as one member of
'struct drm_amdgpu_cs_chunk_data' makes the
whole structure expected to be 64-bit aligned.
This is a minor issue detected by the gcc sanitizer (ubsan), for instance at the libdrm library:
../amdgpu/amdgpu_cs.c:937:26: runtime error: member access within misaligned address 0x63100001484c for type 'struct drm_amdgpu_cs_chunk_data', which requires 8 byte alignment
0x63100001484c: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
Fixes: ae7e4d7619 ("amd: rename ring_type --> amd_ip_type and match the kernel enum values")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22920>
(cherry picked from commit acdd6a2a6c)
if a sampler is never used (no derefs) then its binding will never be
applied here, leaving it with binding=0. this will clobber the real binding=0
sampler in driver backends, leading to errors, so try to iterate using
the same criteria as above and apply bindings in the same way
fixes#8974
cc: mesa-stable
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22902>
(cherry picked from commit ccbfcf3933)
Otherwise accesses to non-0 views of input attachments may be considered
out-of-bounds and return 0. This should've been removed when enabling
multiview for GMEM, not sure how it was missed.
Fixes: def56b531c ("tu: Support GMEM with layered rendering and multiview")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20304>
(cherry picked from commit f6902bf425)
With the following test :
dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.no_out_of_bounds_load
There is a :
shader_start:
... <- no control flow
g0 = some_alu
g1 = fbl
g2 = broadcast g3, g1
g4 = get_buffer_size g2
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
eliminate_find_live_channel will remove the fbl/broadcast because it
assumes lane0 is active at get_buffer_size :
shader_start:
... <- no control flow
g0 = some_alu
g4 = get_buffer_size g0
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
But then the instruction scheduler will move the get_buffer_size after
the halt :
shader_start:
... <- no control flow
halt <- on some lanes
g0 = some_alu
g4 = get_buffer_size g0
g5 = send <surface>, g4
get_buffer_size pulls the surface index from lane0 in g0 which could
have been turned off by the halt and we end up accessing an invalid
surface handle.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20765>
(cherry picked from commit 9471ffa70a)
Fixes:
```
[829/1646] Compiling C object src/panfrost/vulkan/libpanvk_v6.a.p/panvk_vX_meta_clear.c.o
In function 'panvk_meta_clear_zs_img',
inlined from 'panvk_v6_CmdClearDepthStencilImage' at ../src/panfrost/vulkan/panvk_vX_meta_clear.c:457:7:
../src/panfrost/vulkan/panvk_vX_meta_clear.c:415:26: warning: storing the address of local variable 'view' in '((struct pan_fb_info *)((char *)commandBuffer + 144))[23].zs.view.zs' [-Wdangling-pointer=]
415 | fbinfo->zs.view.zs = &view;
| ~~~~~~~~~~~~~~~~~~~^~~~~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c: In function 'panvk_v6_CmdClearDepthStencilImage':
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'view' declared here
393 | struct pan_image_view view = {
| ^~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'commandBuffer' declared here
[844/1646] Compiling C object src/panfrost/vulkan/libpanvk_v7.a.p/panvk_vX_meta_clear.c.o
In function 'panvk_meta_clear_zs_img',
inlined from 'panvk_v7_CmdClearDepthStencilImage' at ../src/panfrost/vulkan/panvk_vX_meta_clear.c:457:7:
../src/panfrost/vulkan/panvk_vX_meta_clear.c:415:26: warning: storing the address of local variable 'view' in '((struct pan_fb_info *)((char *)commandBuffer + 144))[23].zs.view.zs' [-Wdangling-pointer=]
415 | fbinfo->zs.view.zs = &view;
| ~~~~~~~~~~~~~~~~~~~^~~~~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c: In function 'panvk_v7_CmdClearDepthStencilImage':
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'view' declared here
393 | struct pan_image_view view = {
| ^~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'commandBuffer' declared here
```
Cc: mesa-stable
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22829>
(cherry picked from commit 2b4ce498ee)
Indeed, these references are not freed.
For instance, this issue is triggered on an evergreen card with
"piglit/bin/shader_runner tests/spec/arb_shader_atomic_counter_ops/execution/all_touch_test.shader_test -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 06993e4ee3 ("r600: add support for hw atomic counters. (v3)")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22798>
(cherry picked from commit 4ca8be82d5)
Indeed, the objects are not freed when the function returns NULL.
"psurf->texture = tex;" is redundant with
"pipe_resource_reference(&psurf->texture, tex);".
For instance, this issue is triggered with
"piglit/bin/ext_texture_array-compressed teximage pbo -fbo -auto"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22799>
(cherry picked from commit d615dfca40)
We track fences in a global list and have a per context "current" fence
which we randomly attach things to. If we take such a fence and emit it
without also creating a new fence for future tasks we can get out of sync
leading to random failures.
Some of our queries could trigger such cases and even though this issues
appears to be triggered by the MT rework, I'm convinced that this was only
made more visible by those fixes and we had this bug lurking for quite a
while.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7429
Fixes: df0a4d02f2 ("nvc0: make state handling race free")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Acked-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22722>
(cherry picked from commit 37c6c5c624)
Instead of forcing vertex buffer stride to be 4 byte aligned only,
DX10 actually allows the stride to be non 4-byte aligned but the
alignment of an element must be the nearest power of 2 greater or equal to the
width of the element's format, or 4, whichever is less. So the requirement is
better met with PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY which if set to
TRUE, the sum of vertex element offset + vertex buffer offset + vertex buffer
stride must be aligned to the vertex attributes component size.
Note: PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY cannot be set
with other alignment-requiring CAPs, so we have to return 0 for all the
other alignement CAPs.
This avoids some unnecessary software vertex translate fallback.
cc: mesa-stable
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22689>
(cherry picked from commit c661f38342)
Indeed, the function nir_to_tgsi() returns an ureg_get_tokens() allocated
object which is assigned locally. The ureg_get_tokens() allocated object
should be freed.
For instance, this issue is triggered with a llvm enabled lima,
"piglit/bin/gl-1.0-rendermode-feedback -auto -fbo":
Direct leak of 512 byte(s) in 1 object(s) allocated from:
#0 0x7faeaa4500 in __interceptor_realloc (/usr/lib64/libasan.so.6+0xa4500)
#1 0x7fa4a88f1c in tokens_expand ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:239
#2 0x7fa4a88f1c in get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:262
#3 0x7fa4a900f4 in copy_instructions ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2079
#4 0x7fa4a900f4 in ureg_finalize ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2129
#5 0x7fa4a91dfc in ureg_get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2206
#6 0x7fa4b20a2c in nir_to_tgsi_options ../src/gallium/auxiliary/nir/nir_to_tgsi.c:4011
#7 0x7fa4a0c914 in draw_create_vertex_shader ../src/gallium/auxiliary/draw/draw_vs.c:77
Fixes: b5e782f5f4 ("aux/draw: use nir_to_tgsi for draw shader in llvm path")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21924>
(cherry picked from commit 6a8e6716ac)
The VK pipeline cache passes a NULL bytes with a nonzero size to a
NULL-data blob to set up the size of the blob. In this case, we don't
actually execute the memcpy, so the non-existent "bytes" doesn't need to
have defined contents. Avoids a valgrind warning:
==972858== Unaddressable byte(s) found during client check request
==972858== at 0x147F4166: blob_write_bytes (blob.c:165)
==972858== by 0x147F4166: blob_write_bytes (blob.c:158)
==972858== by 0x14695FFF: vk_pipeline_cache_object_serialize (vk_pipeline_cache.c:240)
[...]
==972858== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Fixes: 591da98779 ("vulkan: Add a common VkPipelineCache implementation")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22617>
(cherry picked from commit ae2784b832)
Workarounds for defects in Intel silicon have been manually
implemented:
- consult defect database for the current platform
- add workaround code behind platform ifdef or devinfo->ver checks
Some bugs have occurred due to the manual process. Typical failure
modes:
- defect database is updated after a platform is enabled
- version checks are overly broad (eg gfx11+) for defects that were
fixed (eg in gfx12)
- version checks are too narrow for defects that were extended to
subsequent platforms.
- missed workarounds
This commit automates workaround handling:
- Internal automation queries the defect database to collate and
summarize defect documentation in json.
- mesa_defs.json describes all public defects and impacted platforms.
Defects which are extended to subsequent platforms are listed under
the original defect.
- gen_wa_helpers.py generates workaround helpers to be called
in place of version checks:
- NEEDS_WORKAROUND_{ID} provides a compile time check suitable for
use in genX routines.
- intel_device_info_needs_wa() provides a more precise runtime
check, differentiating platforms within a generation and
platform steppings.
Internal automation will generate new mesa_defs.json as needed.
Workarounds enabled with these helpers will apply correctly based on
updated information in Intel's defect database.
Reviewed-by: Dylan Baker <dylan@pnwbakers>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(back ported from commits 3c9a8f7a6d52c71cf9598c78dd63208eceff48cd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22351>
Before testing the waddr for SFU we should first validate this
is indeed a valid (not NOP) magic write. Use the helper we have for
this which gets this right.
total instructions in shared programs: 12898957 -> 12850958 (-0.37%)
instructions in affected programs: 4328937 -> 4280938 (-1.11%)
helped: 19974
HURT: 439
Instructions are helped.
total max-temps in shared programs: 2211503 -> 2210893 (-0.03%)
max-temps in affected programs: 12924 -> 12314 (-4.72%)
helped: 509
HURT: 20
Max-temps are helped.
total sfu-stalls in shared programs: 22233 -> 21975 (-1.16%)
sfu-stalls in affected programs: 722 -> 464 (-35.73%)
helped: 297
HURT: 54
Sfu-stalls are helped.
total inst-and-stalls in shared programs: 12921190 -> 12872933 (-0.37%)
inst-and-stalls in affected programs: 4337977 -> 4289720 (-1.11%)
helped: 20015
HURT: 404
Inst-and-stalls are helped.
total nops in shared programs: 333743 -> 305911 (-8.34%)
nops in affected programs: 86902 -> 59070 (-32.03%)
helped: 14545
HURT: 76
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22593>
(cherry picked from commit 18a3a0d915)
The buffer max_index value in translate_generic struct is relevant for
indexed draw only. So do not clamp the element index in generic_run() as it
is called for non-indexed draw only.
This patch passes index_size to the common generic_run_one function
so index clamping is only performed when a non-zero index_size is specified.
This fixes a text selection bug with kitty terminal emulator running on ARM
when it falls back to the generic translate path for unsigned byte vertex
array.
cc: mesa-stable
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22568>
(cherry picked from commit 13e885842a)
It appears to be possible that IDLE is observed before COMPLETE.
In this case, an application may access present_id in subsequent
QueuePresentKHR and race against the fence worker reading present_id.
Solve this by adding a separate signal_present_id that is used when
completing to avoid the race.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 32f7ff2c20)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22638>
This fixes a performance regression between 22.2 and 23.0. A different
fix is present in the main branch, but for 23.0 it was decided to revert
the offending change rather then the more complete fix.
This reverts commit 582bf4d9f7.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8489
Currently we add all patches that apply, including patches that apply to
nominated but unmerged patches. We want those patches, but we really
only want them if we first merge the patch they fix/revert.
Right now we have three different locks, and it's not clear what they're
all meant to cover, as well as blocking each other in cases they don't
need to. This fixes that by reducing the number of locks to 2, one for
git operations, and one for state operations.
-1 is an invalid buffer index which breaks app expectations, specifically
apitrace, which checks for return value of 0 from checking buffer bindings
to determine whether to inject user vertex buffer bindings and create functional
traces
this should fix capturing traces with drivers using glthread
fixes#8383
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22293>
(cherry picked from commit a17317d2a0)
Conflicts:
src/mesa/main/glthread_bufferobj.c
To implement some of the features of the layer, we need to enable some
of the feature bits at device/command_buffer creation. To do so, we
need to edit some of the structures coming from the application. Most
of those are const so we need to clone them before edition.
This change disables some of the layer features if we run into a
situation where one of the structure we need to clone is unknown such
that we can't make a copy of it (since we don't know its size).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7677
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19897>
(cherry picked from commit b30a75a195)
Indeed, the reference was overwritten.
For instance, this issue is triggered with:
"piglit/bin/shader_runner tests/spec/arb_shader_image_load_store/execution/write-to-rendered-image.shader_test -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: a6b3792843 ("r600: add core pieces of image support.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22394>
(cherry picked from commit 4f42d3b843)
This is the analogous of
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9490 but for
r600.
Discoloration of NV12 video frames was observed in Chrome/ChromeOS and
the problem was tracked down to the fact that Mesa was following the
PIPE_FORMAT_R8_G8B8_420_UNORM/lower_yuv_external() path. The symptom is
that (for an unknown reason) the YUV-to-RGB conversion is using the
value of Y as the value of Y, U, and V. So, for example, if the input
value is YUV = (50, 120, 130), then what actually gets converted to RGB
is YUV = (50, 50, 50).
Considering that PIPE_FORMAT_R8_G8B8_420_UNORM was introduced for
freedreno
(https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6693) and it
is already being reported as unsupported for radeonsi, it's reasonable
to assume that GPUs targeted by r600 don't support this path either.
Note: I tested this patch with an AMD Palm device which follows the
evergreen_is_format_supported() path. I did not have access to a device
to test the r600_is_format_supported() path.
v2: Changed >= 2 to > 1.
Fixes: 826a10255f ("st/mesa: Add NV12 lowering to PIPE_FORMAT_R8_G8B8_420_UNORM")
Tested-by: Andres Calderon Jaramillo <andrescj@chromium.org>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22511>
(cherry picked from commit 4405e8a9e1)
This fixes some CTS compiler tests where they relied on the cl_kernel
object to be released in time so it can recompile a program without
throwing CL_INVALID_OPERATION due to still having active kernel objects.
Fixes: 47a80d7ff4 ("rusticl/event: proper eventing support")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22510>
(cherry picked from commit 60cfe15d79)
For instance, this issue is triggered with: "piglit/bin/useprogram-flushverts-2 -auto -fbo" or
"piglit/bin/primitive-restart-draw-mode line_loop -auto"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 27dcb46629 ("gallium: add take_ownership param into set_vertex_buffers to eliminate atomics")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22395>
(cherry picked from commit 28cb33fada)
When a graphics pipeline library is created with only the vertex input
state, the driver binds this state at pipeline bind time. Though the
vertex binding stride is not necessarily dynamic, in this case the
pipeline stride should be used.
This fixes GPU hangs with recent
dEQP-VK.pipeline.fast_linked_library.vertex_input.*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22285>
(cherry picked from commit 9085c9d43e)
6148e3aae7 ("mesa: Fix ctx->Texture.CubeMapSeamless") introduced a hack, where
seamless cube maps would be requested even for GLES2 contexts despite the spec,
on the assumption that GLES2 gallium drivers would ignore the bit. But that
requires Gallium drivers to know what GLES version they advertise, which is a
horrible layering violation. When the commit was written 8 years ago, there were
classic drivers to contend with so it made sense as a fix to get GLES 3.0 up and
running. With classic drivers gone, it's time to sunset the hack and restore the
intended behaviour by setting ctx->Texture.CubeMapSeamless only once we know the
version.
In addition to fixing a semantic issue in the Gallium contract and preventing a
regression from the next commit, this fixes cube maps on Mali-T720 under
Panfrost. In general, Panfrost supports GLES3 (and honours the seamless flag
everywhere) but on T720 we only advertise GLES2 due to missing MRT support on
older Midgard devices, so we need the flag set properly to distinguish these
cases.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21978>
(cherry picked from commit f2506780c8)
Conflicts:
src/mesa/main/texstate.c
this is almost certainly a failure case, but drivers still shouldn't
get xfb info if there are no outputs
affects:
spec@glsl-1.50@execution@interface-blocks-api-access-members
cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22448>
(cherry picked from commit a86c710ce5)
Conflicts:
src/gallium/drivers/zink/ci/zink-anv-tgl-fails.txt
src/gallium/drivers/zink/ci/zink-radv-navi10-fails.txt
src/gallium/drivers/zink/ci/zink-radv-vangogh-fails.txt
draw->index_bounds_valid tells drivers that the values of min_index/max_index
are set correctly and can be used e.g. to allocate memory for varyings. If set
incorrectly, the GL promises badness.
But, with primconvert, we go mucking with index buffers and then never update
the bounds. So it doesn't matter if the original index bounds were valid, we
can't promise the original bounds are *still* valid. If we were trying to
optimize CPU overhead, we could try to preserve the new min/max index but seeing
as only older Mali cares about this flag, and if you're using primconvert you're
already screwed, I'm not too inclined to go rework primconvert.
Fixes* page faults in primitive-restart-draw-mode on Mali-G52 for GL_QUAD_STRIPS
and GL_POLYGON, which hit the primconvert path. The full dmesg splat looks like:
[ 5438.811727] panfrost ffe40000.gpu: Unhandled Page fault in AS0 at VA 0x000000100A16BAC0
Reason: TODO
raw fault status: 0x25002C1
decoded fault status: SLAVE FAULT
exception type 0xC1: TRANSLATION_FAULT_1
access type 0x2: READ
source id 0x250
Notice that a high bit is randomly set in the address, this is trying to read
a varying from the actual varying buffer in the vicinity of 0xa16bac0. What's
actually happening is that we're trying to read index #0 despite promising the
driver a minimum index of 2, causing an integer underflow as we try to read
index -2, or as the hardware sees, 4294967294.
As long as we stop lying to panfrost about the bounds being correct, panfrost is
able to calculate the real (post-primconverted) bounds on its own, fixing the
test.
* Alternatively, maybe Panfrost should just ignore this bit, in which I don't
know why we have it in Gallium, since it's probably not conformant to fault on
out-of-range glDrawRangeElements.
Fixes: 72ff53098c ("gallium: add pipe_draw_info::index_bounds_valid")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21891>
(cherry picked from commit 62497d4860)
Conflicts:
src/panfrost/ci/panfrost-g52-fails.txt
on pipeline bind with dynamic state, depth_clip_near needs to either be set by
* applying the dynamic state
* using the pipeline state
the previous code always used the pipeline state
fixes:
dEQP-VK.pipeline.*.extended_dynamic_state.between_pipelines.depth_clamp_enable
Fixes: 650880105e ("vulkan,lavapipe: Use a tri-state enum for depth clip enable")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21814>
(cherry picked from commit c9e757c61e)
Conflicts:
src/gallium/frontends/lavapipe/ci/lvp-fails.txt
Indeed, the unnecessary drmDevice objects were not freed.
For instance, this issue could be triggered with: "piglit/bin/egl_ext_platform_device -auto -fbo":
SUMMARY: AddressSanitizer: 2796 byte(s) leaked in 12 allocation(s).
Fixes: e39d72aec2 ("egl: only take render nodes into account when listing DRM devices")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22408>
(cherry picked from commit f9401a515a)
The instructions manipulation cr0 use the default mask on lane0. So if
for some reason that lane is disabled in some of the dispatchs, we can
end up not executing the instructions.
Fixes flakyness in dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.uniform_float_32_to_16.uniform_matrix_float_rtz_frag
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22314>
(cherry picked from commit daa8003e45)
There's no guarantee that this is a SSA value. Use the helper to handle
both SSA values and register correctly. Otherwise we read trash when we
encounter a register and make bad decisions on types, possibly leading
to our destination being UQ typed when the VGRF is only 32-bit.
Fixes compilation with -Dintel-clc=enabled since 7f6491b76d
(nir: Combine if_uses with instruction uses) but the bug is much older
than that, circa 2017. We were just getting lucky before.
Fixes: 069bf7c907 ("i965/fs: Match destination type to size for ballot")
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22374>
(cherry picked from commit 98bcf650f1)
Sanitizes properties returned through GetPhysicalDeviceFormatProperties2.
Bit-31 is not valid to return in the original
vkGetPhysicalDeviceFormatProperties{2,}. Sanitize the bit returned from the
internal to ensure invalid bits aren't return to the application.
Falls in line with the other vulkan drivers.
Based on original commit by Ryan Houdek.
Closes: #8733
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22217>
(cherry picked from commit fd9c69218a)
If vertex does not complete a primitive, it should not set the odd
flag which miss lead liveness check when culling is enabled.
For example, if odd flag is set regardless of complete flag, when
culling is enabled, 3 vertices of a triangle's init prim flag:
[0x00 0x04 0x01]
then after culling, this triangle has been culled, their prim flag:
[0x00 0x04 0x00]
the second vertex is miss treat as live because its odd flag (code
check prim_flag!=0 for liveness).
Fixes: 1bdeb961bd ("ac/nir/ngg: add gs culling")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8725
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22304>
(cherry picked from commit c082cdacae)
Cache coherent UMA implies that the GPU is reading data through the
CPU caches. Using write-combined CPU pages for such a system would
be bad, since the GPU would then be reading uncached data. One
example of such a system is WARP. This significantly improves WARP's
performance for some apps (including the CTS).
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
(cherry picked from commit eaa8c8097c)
We are seeing endless DRM_IOCTL_SYNCOBJ_WAIT ioctl when system memory is
under pressured.
Commit f9d8d9acbb ("iris: Avoid abort() if kernel can't allocate
memory") avoids the abort() on ENOMEM by resetting the batch. However,
when there's an ongoing OpenGL query, resetting the batch will make the
snapshots_landed never be flipped, so iris_get_query_result() gets stuck
in the while loop forever.
Since there's no guarantee that the next batch after resetting won't hit
ENOMEM, so instead of resetting the batch, be patient and wait until kernel has
enough memory. Once the batch is submiited and snapshots_landed gets
flipped, iris_get_query_result() can proceed normally.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6851
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21829>
the spec for vkGetDrmDisplayEXT says:
"If there is no VkDisplayKHR corresponding to the connectorId on the
physicalDevice, the returning display must be set to VK_NULL_HANDLE.
The provided drmFd must correspond to the one owned by the physicalDevice.
If not, the error code VK_ERROR_UNKNOWN must be returned. (...)
The given connectorId must be a resource owned by the provided drmFd.
If not, the error code VK_ERROR_UNKNOWN must be returned"
We were only setting the display pointer to VK_NULL_HANDLE if the provided
drmFd was valid, however, there are CTS tests checking that it is also set
to NULL when it is not.
Fixes the following test on all drivers exposing EXT_acquire_drm_display
(tested with Intel and V3DV):
dEQP-VK.wsi.acquire_drm_display.acquire_drm_display_invalid_fd
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22283>
(cherry picked from commit 1bbbdbe666)
For instance, with "piglit/bin/arb_shader_image_load_store-host-mem-barrier --quick -auto -fbo":
==18549==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61200000a059 at pc 0x7f65d8937b80 bp 0x7fff6ed19a00 sp 0x7fff6ed199f8
READ of size 1 at 0x61200000a059 thread T0
#0 0x7f65d8937b7f in evergreen_set_shader_images ../src/gallium/drivers/r600/evergreen_state.c:4277
#1 0x7f65d6b471b8 in st_bind_images ../src/mesa/state_tracker/st_atom_image.c:172
#2 0x7f65d6b76b26 in st_validate_state ../src/mesa/state_tracker/st_util.h:129
#3 0x7f65d6b76b26 in prepare_draw ../src/mesa/state_tracker/st_draw.c:88
#4 0x7f65d6b77c8a in st_draw_gallium ../src/mesa/state_tracker/st_draw.c:141
#5 0x7f65d72698a2 in _mesa_draw_arrays ../src/mesa/main/draw.c:1202
Fixes: a6b3792843 ("r600: add core pieces of image support.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22273>
(cherry picked from commit e0ed2b29f4)
vstate draws bind their own vertex buffers unrelated to the bound
gallium buffers, so any draw occurring after a vstate draw must
rebind vertex buffers to ensure the correct ones are bound
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22116>
(cherry picked from commit 4be5caba67)
vulkan resolves only provide "extents" instead of src and dst regions like
GL, which means vk resolves can't be used to downscale images, as such
operations will instead just crop the image
fixes#8655
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22195>
(cherry picked from commit db582e5e7d)
Previosuly it was assumed that primitives where always converted to
triangles if the driver did not support all primitives, however that's
not true for a driver that supports quads but not quad strips.
Fixes piglit spec@!opengl 1.1@dlist-fdo3129-01 on Panfrost
Fixes: dcbf2423d2 ("vbo/dlist: add vertices to incomplete primitives")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21987>
(cherry picked from commit d4a6c97779)
Many of the `V3D_DIRTY_*` flags are above 32 bits, but for now the only
one used here is V3D_DIRTY_SSBO.
`shader->uniform_dirty_bits`, where `dirty` ends up, is already 64 bits.
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22019>
(cherry picked from commit a7c051b5ac)
if a feedback loop hasn't yet been added for an image with both
descriptor and fb binds, queue a check for that to avoid mismatch
affects godot-tps-gles3-high.trace
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21906>
(cherry picked from commit 63f425c7d2)
This fixes the behavior of register coalesce in cases where the source
of a copy is written elsewhere in the program by a force_writemask_all
instruction, which could cause the overwrite to be executed for an
inactive channel under non-uniform control flow, causing
can_coalesce_vars() to give incorrect results. This has been reported
in cases like:
> while (true) {
> x = imageSize(img);
> if (non_uniform_condition()) {
> y = x;
> break;
> }
> }
> use(y);
Currently the register coalesce pass would coalesce x and y in the
example above, which is invalid since in the example above imageSize()
is implemented as a force_writemask_all SEND message, whose result is
broadcast to all channels, so when a given channel executes 'y = x'
and breaks out of the loop, another divergent channel can execute a
subsequent iteration of the loop overwriting 'x' with a different
value, hence coalescing y and x into the same register changes the
behavior of the program.
Note that this is a regression introduced by commit a4b36cd3dd. In
order to avoid the problem without reverting that patch, we prevent
register coalesce if there is an overwrite of the source with
force_writemask_all behavior inconsistent with the copy and this
occurs anywhere in the intersection of the live ranges of source and
destination, even if it occurs lexically before the copy, since it
might be physically executed after the copy under divergent loop
control flow.
Fixes: a4b36cd3dd ("intel/fs: Coalesce when the src live range is contained in the dst")
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21351>
(cherry picked from commit 76b4255cd8)
tc fence lifetimes can exceed the lifetimes of their parent contexts,
which means they can be destroyed after mfence->fence has been destroyed
to avoid invalid memory access on a destroyed fence, store all the assigned
tc fences into an array on the real fence and then use that to unset fence
pointers on any outstanding tc fences
fixes flakiness in dEQP-EGL.functional.sharing.gles2.multithread.random_egl_sync.images.texsubimage2d.12
in caselist:
dEQP-EGL.functional.query_context.get_current_surface.rgba4444_pbuffer
dEQP-EGL.functional.create_surface.platform_window.rgba5551_depth_no_stencil
dEQP-EGL.functional.query_surface.simple.pbuffer.rgb888_depth_no_stencil
dEQP-EGL.functional.color_clears.multi_context.gles2.rgb888_pixmap
dEQP-EGL.functional.color_clears.multi_context.gles1_gles2.rgba8888_window
dEQP-EGL.functional.color_clears.multi_context.gles1_gles2_gles3.rgb888_window
dEQP-EGL.functional.render.multi_thread.gles2_gles3.rgba5551_pbuffer
dEQP-EGL.functional.sharing.gles2.multithread.random_egl_sync.buffers.buffersubdata.3
dEQP-EGL.functional.sharing.gles2.multithread.random_egl_sync.programs.link.6
dEQP-EGL.functional.sharing.gles2.multithread.random_egl_sync.images.texsubimage2d.12
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21843>
(cherry picked from commit 9df68c633e)
Rounding up the polygon list BO can waste large amounts of memory. In a common
case I observed, it rounded up 11MB to 16MB, wasting 5MB. That adds up quickly
across processes, especially on the 2GB machines.
This only applies to Midgard. On Bifrost and newer, the driver does not
explicitly allocate this data structure. Cc stable because this rounding is
incorrect and the increase in RAM usage can cause real problems (especially
given how slow the shrinker is).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21831>
(cherry picked from commit f2617944bf)
The logic that updates `ctx->gfx_pipeline_state.final_hash` assumed that
the program is replaced. It is supposed to xor `final_hash` with the
hash first and then with the new hash however when the program is
updated it end up xor-ing the new hash twice so it does nothing.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: 15450d2c2e ("zink: incrementally hash all pipeline component hashes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21925>
(cherry picked from commit 1538a28803)
nir->info.subgroup_size can be set to an enum :
SUBGROUP_SIZE_VARYING = 0
SUBGROUP_SIZE_UNIFORM = 1
SUBGROUP_SIZE_API_CONSTANT = 2
SUBGROUP_SIZE_FULL_SUBGROUPS = 3
So compute the API subgroup size value and compare it to the dispatch
size to determine whether we need some bound checking.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9ac192d79d ("intel/fs: bound subgroup invocation read to dispatch size")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21856>
(cherry picked from commit 56474fae93)
For instance, to load uniform data with the LSC we usually rely on
tranpose messages which have to execute in SIMD1. Those end up being
considered as partial writes so within loops their life span spread to
the whole loop, increasing register pressure.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21867>
(cherry picked from commit efde1917c9)
If both, any-hit and intersection shader, use scratch vars,
it could happen that they end up in the same location and
overwrite each other.
Found by inspection.
Fixes: c3d82a9622 ('radv: Add pass to lower anyhit shader into an intersection shader.')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21863>
(cherry picked from commit 481f78ab93)
The check in alloc_bo_from_cache() was skiping any try to get a bo
from cache but after use a protected bo was still being put in some
cache bucket and could be used for cases that don't require a
protected bo.
Using a protected bo in cases that don't require it can have
performance implications.
So here returning NULL when trying to get a cache bucket for a
protected bo, this will cause bo->real.reusable to be set to false
avoiding the bo to be reused.
Fixes: 9402ac8023 ("iris: handle protected BO creation")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21824>
(cherry picked from commit 266d961fdc)
The previous solution breaks potential loop header phis.
Move the dummy-break to the bottom of the loop.
Fixes: dEQP-VK.reconvergence.subgroup_uniform_control_flow_ballot.*
Fixes: a9c4a31d8d ('aco: handle NIR loops without breaks')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21736>
(cherry picked from commit cd1e5b1858)
the recording rp_info may be a pointer to a member of the array being
reallocated, so test for this and re-set it to avoid invalid memory
access
found with this caselist:
KHR-GL46.texture_gather.offset-gather-unorm-2darray
KHR-GL46.texture_view.view_sampling
cc: mesa-stable
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21729>
(cherry picked from commit 6ee5337d94)
Seems to fix a hang in the following titles :
- Age of Empire 4
- Monster Hunter Rise
where the HW is hung on a PIPE_CONTROL after a GPGPU_WALKER but no
MEDIA_INTERFACE_DESCRIPTOR_LOAD was emitted since the switch from 3D
to GPGPU.
This would happen in the following case :
vkCmdBindPipeline(COMPUTE, cs_pipeline);
vkCmdDispatch(...);
vkCmdBindPipeline(GRAPHICS, gfx_pipeline);
vkCmdDraw(...);
vkCmdDispatch(...);
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17247>
(cherry picked from commit 11bc2bde83)
Modifying pitch for all LINEAR surface isn't correct;
the original change that modified surf_pitch was only
intended for YUV textures.
This fixes vkGetImageSubresourceLayout rowPitch return value
for VK_FORMAT_BC3_UNORM_BLOCK + VK_IMAGE_TILING_LINEAR.
Fixes: fcc499d5 (ac/surface: adjust gfx9.pitch[*] based on surf->blk_w)
v2: add check for UYVY format (Pierre-Eric)
v3: move blk_w division to above if check (Pierre-Eric)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21595>
(cherry picked from commit 347a5b79f9)
this otherwise may have been a surface that was never drawn to or
already had its layout corrected, in which case a deferred barrier
is not only unnecessary, it might be broken
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21739>
(cherry picked from commit e0dfe058c4)
reason:
some h264 streams have some strange pictures, from
vaapi input these pictures don't have a reference frame,
however, they are not intra only pictures, in MB layer
these pictures are looking for some references, if they
cannot find it. It could cause PF.
when reference pictures exist, it will need to set used_for
reference_flags, therefore if that is set, however the
number of reference frames is zero, which is odd, it
should be avoided.
solution:
In the above case, to scan the ref list so that it will
make at least one reference available to avoid crash, since
this is not accurate enough, it could cause some artifacts.
And in that case, it will need to be checked individually
for another solution.
closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1462
closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8401
Cc: mesa-stable
Tested-by: llyyr <llyyr.public@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21732>
(cherry picked from commit 0f3370eede)
vulkancts test dEQP-VK.wsi.direct_drm.surface.create_simulate_oom is failing
because in wsi_display_alloc_connector() function memory allocation for
connector is not checked for return NULL. create_simulate_oom test simulates
out of memory, hence memory allocation fails for connector and later when
tried to dereference connector program will segfault.
This patch fixes the dEQP-VK.wsi.direct_drm.surface.create_simulate_oom test
segfault issue by checking if connector is NULL afer memory allocation.
Cc: mesa-stable
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21701>
(cherry picked from commit af953616a1)
The issue here is that for draw indirect count variants, we want to
jump after the last generated draw call to the next location where
commands are. But if we have more than 67M draws (8k * 8k chunks), we
only know the location once we've generated each of the 8k * 8k
chunks.
This change adds a CPU side pointer in the push constant struct so
that we can create a single linked list of chunks to edit and go
through to write the correct jump address after all the generated
space has been allocated.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c950fe97a0 ("anv: implement generated (indexed) indirect draws")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20497>
(cherry picked from commit 63fa6d9f49)
Using texelFetch to read samples from an 8xMSAA fast cleared image on
Haswell can read transparent black pixels around triangles from where
there should be none. This issue isn't present when using sample
shading, resolving the image using vkCmdResolveImage or in a copy the
image. The easiest way to fix this is by just disabling non-zero fast
clears for 8xMSAA images.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7587
Cc: mesa-stable
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21444>
(cherry picked from commit e509afacf3)
I'm not planning to stand mesa-swrast back up until we get Kata set up, so
turn the testing back on at a reduced fraction on so that
venus/llvmpipe/etc. dev can still get some coverage.
I haven't turned lavapipe back on, because it is now unstable in memory
model / atomics tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21880>
(cherry picked from commit 5bb9ab896c)
stable:
- add additional suites to the shared runner list
Register writes from the pre-header might not be correct for any but
the first loop iteration because they can be clobbered inside the loop.
Foz-DB Navi21:
Totals from 18 (0.01% of 134913) affected shaders:
CodeSize: 251384 -> 251508 (+0.05%)
Instrs: 47644 -> 47664 (+0.04%)
Latency: 801801 -> 801852 (+0.01%)
InvThroughput: 177579 -> 177593 (+0.01%)
Copies: 4752 -> 4771 (+0.40%)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8376
Fixes: d3b0f78110 ("aco/optimizer_postRA: Initialize loop header with preheader information")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21540>
(cherry picked from commit e1eabab6fe)
Now that all color blend bits are dynamic, emit_cb_state() is doing
almost nothing and half of that is wrong.
In the case that color write enable is dynamic, at the time the pipeline
state is emitted, it sees all the color attachments as having write
disabled and stores the WriteDisabled bit for each channel.
When all dynamic state is flushed, we have the right values already but
the values recorded into the command buffer get ORed with the ones
stored in the pipeline, and so WriteDisabled tag along when they
shouldn't.
Since all disabled color attachments are handled already when dynamic
state is flushed, there's no point in doing so at pipeline creation
time too. And since the only other thing done by emit_cb_state() is
writing three hardcoded values, they might as well be taken care of in
the same place as everything else.
Fixes CTS from the future:
dEQP-VK.pipeline.*.extended_dynamic_state.*.color_blend_equation_*dynamic*
dEQP-VK.pipeline.*.extended_dynamic_state.*.color_blend_all_*
Fixes: fc3fd7c69e (anv: dynamic color write mask)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21509>
(cherry picked from commit b71957635f)
The workaround address is used as a source for push constants when
there's no resource available, that address must be 32B aligned.
This fixes invalid address being used for buffers in
3DSTATE_CONSTANT_* packets.
Now that intel_debug_write_identifiers() already add the padding,
there's no need to include extra "+ 8" to the offset.
Thanks to Xiaoming Wang that contributed to find and fix this issue.
Fixes: 2a4c361b06 ("iris: add identifier BO")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21479>
(cherry picked from commit a4a0417263)
Fixes: 5b205ef413
r600: Store nir shaders serialized to save memory
Direct leak of 4096 byte(s) in 1 object(s) allocated from:
#0 0x7faf89c3bb48 in __interceptor_realloc (/usr/lib64/libasan.so.6+0xb1b48)
#1 0x7faf7be5981d in grow_to_fit ../src/util/blob.c:67
#2 0x7faf7be5a538 in grow_to_fit ../src/util/blob.c:49
#3 0x7faf7be5a538 in blob_reserve_bytes ../src/util/blob.c:177
#4 0x7faf7be5a538 in blob_reserve_uint32 ../src/util/blob.c:190
#5 0x7faf7d248a8c in nir_serialize ../src/compiler/nir/nir_serialize.c:2109
#6 0x7faf7df4fdbb in r600_pipe_shader_create ../src/gallium/drivers/r600/r600_shader.c:401
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21443>
(cherry picked from commit 9b4eb73907)
This also removes the loop so opt_remove_cast_cast() will only optimize
cast(cast(x)) and not cast(cast(cast(x))). However, since nir_opt_deref
walks instructions top-down, there will almost never be a tripple cast
because the parent cast will have opt_remove_cast_cast() run on it.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
(cherry picked from commit af9212dd82)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21467>
When mesh shader is spawned on a different slice than the originating
task shader, then input task urb handle can come from a different
slice, so masking this information off will load data from the current
slice, instead of the one where real data are.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21007>
(cherry picked from commit dd9bf86725)
When establishing a texture transfer map, if there is any pending changes on the
texture, instead of trying direct map with DONTBLOCK first, just
use the upload buffer path.
Fixes piglit tests gen-teximages, arb_copy_images-formats
Cc: mesa-stable
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
(cherry picked from commit 1b9b060f0e)
When an EGLImage is created from a 2D texture and used for texture sharing,
the texture surface might not have been created with the SHARED bind flag.
To allow these surfaces for sharing, this patch sets the USAGE SHARED bit
for surfaces that can be potentially used for sharing even when the SHARED
bind flag is not originally set. Instead of unconditionally enabling the
SHARED bind flag for all surfaces and unnecessarily bypass the surface cache
optimization, this patch only enables the USAGE SHARED bit for surfaces
that also have the RENDER TARGET bind flag.
When the surface handle is inquired and if the surface is currently
marked as cachable, we will need to unset the cachable bit so
the surface handle will not be recycled again.
This patch fixes an assertion in svga_resource_get_handle() when the
EGL_MESA_image_dma_buf_export extension is used in webgl benchamrk running
from firefox in Fedora 37.
Cc: mesa-stable
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
(cherry picked from commit 75b7296fc3)
This change is wrong for two reasons:
- it hangs most of the time maybe, because changing PSTATE when the
application is running is broken somehow
- it increases the time between triggering and generating the capture
considerably, because there is a delay for changing PSTATE
This restores previous logic where PSTATE is set to profile_peak at
logical device creation. Though, it also re-introduces an issue with
multiple logical devices (kernel returns -EBUSY) but this will be
fixed in the next commit.
This fixes GPU hangs when trying to record RGP captures on my NAVI21.
Note that profile_peak is only required for some RDNA2 chips (including
VanGogh).
Cc: mesa-stable
This reverts commit 923a864d94.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21222>
(cherry picked from commit 663877e894)
We loop, effectively, over two stacks: ready and to_do and finish only
when both are empty. In the case where ready is empty, we pull one off
of to_do, add a copy to a temporary, and push it onto the ready stack.
Previously, we assumed that we would never get to the temporary copy
case if to_do has exactly one entry because that would imply that there
was only one copy left which means there can't possibly be a cycle to
break. This was true until c7fc44f9eb ("nir/from_ssa: Respect and
populate divergence information") which changed things such that
temporary copies sometimes get added in the case where a convergent
value is copied both to convergent and divergent destinations.
This patch adjusts our loop iteration to always attempt to clear the
ready stack before checking if there's anything left on the to_do stack.
I also added an assert to make the exit condition more clear.
Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 4e09d37f3b)
There is an optimization in the parallel copy algorithm where, after a
copy has been performed, we can treat the destination as the new source
for future copies of the same source. In particular, consider the
following parallel copy: A -> B, C -> A, A -> C. In this case, after we
have done the A -> B copy, we can make note that the value in A is now
in B and emit the sequence: A -> B, C -> A, B -> C. This allows us to
resolve the swap cycle between A anc C without allocating a temporary
register because we know B is also a copy of A.
When one of the registers involved is convergent and the other is
divergent, this optimization is problematic because, while convergent to
divergent copies are fine, we can't re-use the divergent copy in later
copies if any of those copies are to a convergent variable. We could,
but it would require a read_first_invocation which would get messy. In
In c7fc44f9eb ("nir/from_ssa: Respect and populate divergence
information"), we attempted to deal with this by limiting the rename
optimization to the case where the divergence matched.
The problem is that we did the re-name part whenever the divergence
matched but only marked it as ready if the thing being copied was a
destination. (We actually left two instances of loc[a] = b, one which
always happened and one which only happened if we also wanted to flag
the source as being ready to use as a destination.) While this
technically doesn't cause any problems, it may result in more inter-mov
dependencies which hurts instruction scheduling. For example, if we had
the parallel copy A -> B, A -> C, A -> D, we now end up emitting the
sequence A -> B, B -> C, C -> D which has many more data hazards between
instructions caused by the constant shuffling.
This commit restores the original logic in which we only perform the
rename optimization if the rename would free up a register we will later
use as a destination. This isn't entirely optimal as it still doesn't
prove that there is a cycle involved first, but it should lead to a
reduction in unnecessary dependencies.
No shader-db changes on SKL or DG2
Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 5afba073c6)
This reverts commit 2dfebf3487.
It causes GPU hangs in piglit tests like
spec@glsl-1.20@execution@clipping@vs-clip-vertex-enables, for reasons I'm
totally unclear on. The commit was not necessary, because the frontend
lowering already handles disabled clip planes by storing 0.0 to the
corresponding clipdist array element in that shader variant. Add a note
to that effect.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21298>
(cherry picked from commit 5c246e21b7)
Conflicts:
src/freedreno/ci/freedreno-a530-fails.txt
base_mip_width is used in si_compute_copy_image when the
SI_IMAGE_ACCESS_BLOCK_FORMAT_AS_UINT flag is used.
width = tex->surface.u.gfx9.base_mip_width;
This will be incorrect if we don't adjust it. For instance,
with a 260x256 image, surf_pitch and base_mip_width are
320 before surf_pitch is updated to be 192.
Both need to match, or computing the width from base_mip_width
leads to incorrect result.
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21253>
(cherry picked from commit affa8a9fb2)
Before dropping this function, handle the two callers of this function:
* The call in iris_blorp.c is redundant. The required cache flushes are
already handled by the callers of blorp functions. Delete this.
* The call in iris_resolve.c is still providing a benefit because it
calls iris_emit_buffer_barrier_for internally. Inline the needed
barrier.
Cc: 23.0 <mesa-stable>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21303>
(cherry picked from commit 5d24682aae)
Memory accesses can get corrupted when there's a disagreement between:
* the aux-mode of existing cache lines for a surface and
* the aux-usage in that surface's RENDER_SURFACE_STATE object
We have already prevented hardware from seeing this conflict for
rendering operations, but due to how the L3 is shared among multiple
clients in gfx12 (e.g., sampler engine, render engine, etc.), we need to
expand the scope of the existing solution. Now, before any access of a
compressible resource, we make sure to flush the prior aux-mode from the
caches.
The majority of changes here refactor things for use in a new function,
flush_previous_aux_mode. The remaining change calls that function from
within iris_resource_prepare_access.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6558
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7625
Cc: 23.0 <mesa-stable>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21303>
(cherry picked from commit 7c367bef0d)
Commit introduces a fix that allows for gbm to be built with an empty
backend. There are situation especially in a Yocto/OE cross compilation
environment where you want to build with an empty backend. The particular
situation is as such:
The mesa-gl recipe is the preferred provider for virtual/libgbm, virtual/libgl,
virtual/mesa, etc... But the x11 DISTRO_FEATURE in't included this leads to build
errors such as:
| /../../../ld: src/gbm/libgbm.so.1.0.0.p/main_backend.c.o: in function `find_backend':
| backend.c:(.text.find_backend+0xa4): undefined reference to `gbm_dri_backend'
| /../../../ld: src/gbm/libgbm.so.1.0.0.p/main_backend.c.o:(.data.rel.ro.builtin_backends+0x4):
undefined reference to `gbm_dri_backend'
| collect2: error: ld returned 1 exit status
Issue should be replicable by setting -Ddri3=disabled and -Dgbm=enabled
Add fix to bypasses compilation issue by excluding gbm dri backend. If
HAVE_DRI || HAVE_DRIX not specified.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Signed-off-by: Vincent Davis Jr <vince@underview.tech>
(cherry picked from commit 842ca28465)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20821>
non-strictLine Vulkan drivers use either parallelogram or bresenham
rasterization for default line modes.
This method of rasterisation produces close enough results that it
in practice is GL/GLES spec compliant (at least cts wise).
Don't emit a feature missing warning for this case.
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20985>
(cherry picked from commit f7b2dbb2bd)
Pointed out by GCC:
In function ‘load_text_file’,
inlined from ‘standalone_compile_shader’ at ../src/compiler/glsl/standalone.cpp:491:38,
inlined from ‘main’ at ../src/compiler/glsl/main.cpp:98:45:
../src/compiler/glsl/standalone.cpp:358:17: error: ‘free’ called on pointer ‘block_195’ with nonzero offset 48 [-Werror=free-nonheap-object]
358 | free(text);
| ^
In function ‘ralloc_size’,
inlined from ‘load_text_file’ at ../src/compiler/glsl/standalone.cpp:352:31,
inlined from ‘standalone_compile_shader’ at ../src/compiler/glsl/standalone.cpp:491:38,
inlined from ‘main’ at ../src/compiler/glsl/main.cpp:98:45:
../src/util/ralloc.c:117:18: note: returned from ‘malloc’
117 | void *block = malloc(align64(size + sizeof(ralloc_header),
| ^
Fixes: a9696e79fb ("main: Close memory leak of shader string from load_text_file.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21215>
(cherry picked from commit 49a6bdde8e)
Drop the unused ctx parameter, to match the main Mesa code.
Fixes ODR violation flagged by -Wodr with LTO enabled:
../src/mesa/main/shaderobj.h:74:1: error: ‘_mesa_reference_shader_program_data’ violates the C++ One Definition Rule [-Werror=odr]
74 | _mesa_reference_shader_program_data(struct gl_shader_program_data **ptr,
| ^
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: type mismatch in parameter 1
76 | _mesa_reference_shader_program_data(struct gl_context *ctx,
| ^
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: ‘_mesa_reference_shader_program_data’ was previously declared here
../src/compiler/glsl/standalone_scaffolding.cpp:76:1: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
Fixes: 717a720e9c ("mesa: drop unused context parameter to shader program data reference.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21215>
(cherry picked from commit bf67f32d4b)
A stackless (or at least using allocated memory for stack) version
might be nice but for now this works around some games compiling
large shaders and hitting stack overflows.
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21231>
(cherry picked from commit 0a17c3afc5)
Draw states were not disabled after a dynamic renderpass which
spans several command buffers, the next renderpass if started in
the same command buffer wouldn't emit the full draw state,
since TU_CMD_DIRTY_DRAW_STATE was not set by previous renderpass.
The issue could be observed when corrupting all regs at cmdbuf start in:
dEQP-VK.dynamic_rendering.primary_cmd_buff.random.seed7_geometry
Fixes: cb0f414b2a
("tu: Add support for suspending and resuming renderpasses")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21148>
(cherry picked from commit 2d20564a6a)
Vulkan spec 18.8. Primitives Generated Queries:
When a generated primitive query for a vertex stream is active,
the primitives-generated count is incremented every time a
primitive emitted to that stream reaches the transform feedback
stage, whether or not transform feedback is active.
We can see the order of stages in chapter 27 Fixed-Function
Vertex Post-Processing, which shows that the transform feedback
stage is before rasterization (and therefore culling).
Conclusion is that culled primitives should be included
in the primitives generated query.
This commit makes sure to emit the primitives generated query
code before culling and uses the input primitive count passed
to the current wave instead of the exec mask after culling.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21037>
(cherry picked from commit 3a819bd22e)
(cherry picked from commit 18f0d33423)
Point sprite coordinates in general need to be inverted,
not just the texcoords converted to point sprite.
Move point coord y inversion out to its own pass.
Fixes GTF-GL46.gtf21.GL2FixedTests.point_sprites.point_sprites
with FBO dEQP surface.
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21050>
(cherry-picked from commit 782f1e9e01)
There was a VUID-VkImageViewCreateInfo-image-04739 in the Vulkan 1.3
spec that said:
If image was created with the
VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT flag and format is a
non-compressed format, viewType must not be VK_IMAGE_VIEW_TYPE_3D
That VUID has since been removed, and when a view of a 3D image is
created, with put the depth into the array_len, so it won't be always 1.
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20803>
(cherry picked from commit 58ababdee6)
failing to set this yields patterns like
* bind fs
* bind samplerviews
* draw
* bind fs2
* ~~unbind samplerviews~~ (eliminated)
* draw
the eliminated unbinding of samplerviews between draws also eliminates a descriptor update,
triggering various artifacts in certain corner cases (like DOOM2016 shadows)
it's possible to manage the updating during shader binding, but the detection is a bit more
complex, and the cpu overhead from maintaining the current codepath with an
extra pipe_context::set_sampler_views (et al) isn't high enough to warrant further investigation
at this time
fixes#8252
Fixes: 153af03b94 ("gallium: Add cap to request state validation for all dirty state")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21176>
(cherry picked from commit b9181c3218)
A few lines earlier uni_offsets is accessed with ubo scaled by
PIPE_MAX_CONSTANT_BUFFERS:
if (uni_offsets[ubo * PIPE_MAX_CONSTANT_BUFFERS + i] == offset)
Found by inspection.
Looking at the before and after NIR code for
dEQP-VK.graphicsfuzz.cov-int-initialize-from-multiple-large-arrays,
using the correct indexing appears to enable the pass to inline an
additional uniform. My guess is that when a uniform is used more than
once, the first loop wouldn't find the offset recored in the table
because it was recorded at the wrong location.
Fixes: d23a9380dd ("lavapipe: implement extreme uniform inlining")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21144>
(cherry picked from commit a7696a4d98)
this was including the generated tcs bits, which was likely to be wrong
and thus break optimal key hashing, requiring more pipelines
it also wasn't setting the optimal key value correctly during precompile,
which meant the wrong hash value was used and the precompiled libs were never
actually accessible
cc: mesa-stable
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21169>
(cherry picked from commit 4cf54e2ed2)
GL Spec says that imageLoad from incomplete images must return 0.
This is not really spec compliant as for proper behavior nullDescriptor
and robustImageAccess2 is needed.
A workaround for lack of either of these requires a shader variant.
Clearing the null surface and hoping the app doesn't write to the image
is closer to spec, while avoiding a shader recompile.
KHR-GL46.shader_image_load_store.incomplete_textures tests this.
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21135>
(cherry picked from commit 22e91af1a7)
Conflicts:
src/gallium/drivers/zink/zink_context.c
u_vector_add() don't keep the returned pointers valid.
After the initial size allocated in u_vector_init() is reached it will
allocate a bigger buffer and copy data from older buffer to the new
one and free the old buffer, making all the previous pointers returned
by u_vector_add() invalid and crashing the application when trying to
access it.
This is reproduced when running
dEQP-VK.synchronization.signal_order.timeline_semaphore.* in DG2 SKUs
that has 4 CCS engines, INTEL_COMPUTE_CLASS=1 is set and of course
perfetto build is enabled.
To fix this issue here I'm moving the storage/allocation of
struct intel_ds_queue to struct anv_queue/iris_batch and using
struct list_head to maintain a chain of intel_ds_queue of the
intel_ds_device.
This allows us to append or remove queues dynamically in future if
necessary.
Fixes: e760c5b37b ("anv: add perfetto source")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20977>
(cherry picked from commit 8092bc2158)
Conflicts:
src/intel/ds/intel_driver_ds.cc
src/intel/vulkan/anv_utrace.c
src/intel/vulkan_hasvk/anv_utrace.c
Currently, we postpone binning syncs until we record draw calls
and can validate if any of them require accessing protected
resources in the binning stage, however, if the draw calls are
recorded in a secondary command buffer and the barriers have
been recorded in the primary command buffer, we won't apply the
binning sync in the secondary when we record the draw calls
and so we must apply it when we execute the secondary in the
primary.
Fixes flakyness in:
dEQP-VK.api.command_buffers.record_many_draws_secondary_2
cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21162>
(cherry picked from commit fec15a225f)
The CLE parser in the sim will read this many bytes for each instruction
in a CL, so we should ensure we have at least that many bytes available
in the BO when reading the last instruction, otherwise we can trigger
a GMP violation. It is not clear whether this behavior applies to real
hardware too.
cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21162>
(cherry picked from commit c2601f0690)
Fixes: 46b099e3
("meson: Ignore unused variables in release builds")
46b099e3 has some issues:
- it doesn't enable unused variables warning on release builds
with assertions enabled;
- it doesn't disable unused variables warning on debug builds
with assertions disabled;
- it doesn't disable unused variables warning when building
with MSVC and assertions are disabled regardless of buildtype,
see #8147. 3/4 regressions reported there have this limitation
alone as root cause.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21154>
(cherry picked from commit 4347072443)
according to spec, the maximum number of acquired images can be calculated with
swapchain_size - VkSurfaceCapabilitiesKHR::minImageCount + 1
the previous calculation was both wrong and occurring in the wrong place,
so this corrects both issues
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21095>
(cherry picked from commit 89cf0a3bdc)
simple way to reproduce this is to run these 4 together:
KHR-GL46.gpu_shader5.images_array_indexing
KHR-GL46.shader_image_load_store.advanced-allMips
KHR-GL46.shader_image_load_store.advanced-sso-simple
KHR-GL46.shader_image_load_store.incomplete_textures
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21134>
(cherry picked from commit 104040b5c7)
apparently an actual null descriptor is illegal here, and it's wasted cpu
anyway, so just cache the dummy surface on init and use that data when
fbfetch isn't active but the layout requires it
Fixes: 7ab5c5d36d ("zink: use EXT_descriptor_buffer with ZINK_DESCRIPTORS=db")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21100>
(cherry picked from commit 813bb9e442)
We first do an incomplete check for whether the linker supports
--gc-sections, then potentially add C and C++ arguments assuming that it
works, then later do a complete check to see if it actually works and
use --gc-sections. This means we can end up putting functions and data
in separate sections when we can't gc them.
Combine the checks, do less work, and be more accurate.
fixes: f51ce21e4e
("meson: Drop adding -Wl,--gc-sections to project c/cpp arguments.")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21083>
(cherry picked from commit fd9b50aa1c)
etna_resource_from_handle() is called for each plane of a multiplanar
resource, so there is no point in looping over all planes to do the
renderonly scanout import. In fact that will cause us to lose track
of the scanout imports from later planes when the earlier planes are
redoing the import, overwriting the pointer to the allocated
renderonly_scanout struct.
Drop the loop and just do the import for the current plane.
Fixes: 826f95778a ("etnaviv: always try to create KMS side handles for imported resources")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20993>
(cherry picked from commit 175732bb51)
compute programs can be reused across contexts, which means storing any
pointers directly like this is going to lead to desync and crash
instead, make this like regular descriptor templates and calculate the offset
from the current context to ensure that everything works as it should
fixes#8201
Fixes: 7ab5c5d36d ("zink: use EXT_descriptor_buffer with ZINK_DESCRIPTORS=db")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21020>
(cherry picked from commit 68e914a4ca)
Complicated CFG and lots of SALU can cause this to take an extremely long
time to finish.
Fixes
dEQP-VK.graphicsfuzz.cov-value-tracking-selection-dag-negation-clamp-loop
and Monster Hunter Rise demo compile times.
fossil-db (gfx1100):
Totals from 57 (0.04% of 134574) affected shaders:
Instrs: 170919 -> 171165 (+0.14%)
CodeSize: 860144 -> 861128 (+0.11%)
Latency: 961466 -> 961505 (+0.00%)
InvThroughput: 127598 -> 127608 (+0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8153
Fixes: 5806f0246f ("aco/gfx11: workaround VALUPartialForwardingHazard")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20941>
(cherry picked from commit bfd4ac4581)
To fix a hypothetical issue:
v0 = start_linear_vgpr
if (...) {
} else {
use_linear_vgpr(v0)
}
v0 = phi
We need a p_end_linear_vgpr to ensure that the phi does not use the same
VGPR as the linear VGPR.
This is also much simpler.
fossil-db (gfx1100):
Totals from 1195 (0.89% of 134574) affected shaders:
Instrs: 4123856 -> 4123826 (-0.00%); split: -0.00%, +0.00%
CodeSize: 21461256 -> 21461100 (-0.00%); split: -0.00%, +0.00%
Latency: 62816001 -> 62812999 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 9339049 -> 9338564 (-0.01%); split: -0.01%, +0.00%
Copies: 304028 -> 304005 (-0.01%); split: -0.02%, +0.01%
PreVGPRs: 115761 -> 115762 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621>
(cherry picked from commit bbc5247bf7)
ac/surface puts the raw pip_bank_xor there, which needs the extra
shift for the actual tile_swizzle.
(I think long term we should refactor this in ac/surface but for
now lets fix like radeonsi to avoid race conditions.)
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20979>
(cherry picked from commit b0a9772cc6)
list_is_linked() isn't the right function to use in order to check if
the BO is on a cache bucket or the zombie list, as this checks if the
next pointer of the list isn't NULL. This is always the case with the
BO list item as it's always initialized, so the next pointer points to
the list head itself when the BO isn't on any list.
Use list_is_empty() to check if the BO is actually linked into one
of the deferred destroy lists.
Fixes: 1b1f8592c0 ("etnaviv: drm: properly handle reviving BOs via a lookup")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20940>
(cherry picked from commit 196882a147)
This game seems to incorrectly set the render area and since we switched
to full dynamic rendering, the framebuffer dimensions is no longer used.
Forcing the render area to be the framebuffer dimensions restore the
previous logic and it fixes rendering issues.
Fixes: c7d0d328d5 ("radv: Set the window scissor to the render area, not framebuffer")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20900>
(cherry picked from commit 1a93cd1556)
The JIT generated FS shader has logic to obey sample mask when:
multisample is enabled, or multisample is disabled but FS writes sample
mask and pipe_rasterizer_state::no_ms_sample_mask_out.
However it did not handle the case where multisample was disabled, FS
did not write sample_mask, and sample mask was zero. Instead it relied
upon the setup to discard the primitives, but that went away with commit
da5840f3.
We could restore the discard on zero mask behavior, but we would again
blurring the semantics of rasterization discard. Instead this change
adds logic to primitive setup to cull the primitives when sample mask is
zero.
Fixes: da5840f3 ("llvmpipe: Faithfully honour pipe_rasterizer_state::rasterizer_discard flag")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20934>
(cherry picked from commit 77092ca8f4)
RRGetScreenInfo re-probes connector status, which may result in an EDID
transfer for every output, which according to Adam Jackson can be on the
order of 100ms for a single EDID block. So our previous implementation
of this eglGetMscRateANGLE was blocking for excessive periods of time
instead of being a quick query of the refresh rate like users expect.
This changes our eglGetMscRateANGLE implementation from using
RRGetScreenInfo to RRGetScreenResourcesCurrent and RRGetCrtcInfo.
This obtains the same monitor info without re-probing connectors.
Fixes a severe performance regression in Chromium WebGL performance.
While we're re-implementing the extension, we also implement proper
multi-monitor support: if there are multiple active CRTCs, we determine
which contains the largest portion of the surface, as specified in the
EGL_ANGLE_sync_control_rate extension.
We also now report fractional refresh rates correctly rather than
rounding to the nearest Hz.
Fixes: 4752655649 ("egl/x11: implement ANGLE_sync_control_rate")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6996
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7038
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20665>
(cherry picked from commit 41d5f0ee09)
add a special tracker here to set the state only when necessary
Fixes: 659c39fafb ("zink: rework primitive rasterization type logic")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20886>
(cherry picked from commit 06a125942b)
Conflicts:
src/gallium/drivers/zink/ci/zink-lvp-validation-settings.txt
Stable:
- removed CI file taht doesn't exist in 23.0 branch
The shifting was off-by-one compared to how it is done in the kernel. Also, excess_length needs to be casted to uint64_t to prevent zeroing everything except the 5 LSBs.
Fixes: abf3bcd6 ("radv: Add RMV resource tracking")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20820>
(cherry picked from commit e07729e8de)
The ACP entries created by copy propagation to track the implied
copies of LOAD_PAYLOAD instructions don't model the behavior of
LOAD_PAYLOAD correctly, since (as of 41868bb682) header
moves are implicitly retyped to UD and the destination of non-header
copies implicitly uses the same type as the corresponding source, even
though the ACP entries created for such copies could incorrectly
represent a type conversion, which can lead to mis-optimization of the
program.
According to Marcin, this fixes the func.mesh.ext.workgroup_id.task.q0
crucible test.
Fixes: 41868bb682 ("i965/fs: Rework the fs_visitor LOAD_PAYLOAD instruction")
Reported-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18980>
(cherry picked from commit 7b5e933629)
ChromeOS still uses Python 3.6, but the glsl2spirv script uses module
'__future__.annotations', introduced in Python 3.7. Fix the build by
removing module, but otherwise preserve the type annotations.
Fixes: 949c3b55db ("util/glsl2spirv: add type annotations")
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20237>
(cherry picked from commit bca22a6578)
`count` should not be incremented before the check, because it causes
the modifiers array to be filled starting from position 1 instead of 0.
This bug causes one less format modifier to be available than would
otherwise be expected, which could then lead to a dmabuf query failing
in situations where a supported modifier wouldn't be advertised.
It also causes garbage data to be advertised as a modifier in position 0
of the array, although this is not very likely to cause issues.
Fixes: 2a1217513 ("panfrost: Implement panfrost_query_dmabuf_modifiers")
Cc: mesa-stable
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20879>
(cherry picked from commit 6c446377ff)
gfx11 renamed PRIM_GRP_SIZE to VERTS_PER_SUBGRP but another change was
was missed.
Update our code based on PAL's UniversalCmdBuffer::CalcGeCntl function
(especially useVgtOnchipCntlForTess being false for gfx11).
Fixes: 25a66477d0 ("radeonsi/gfx11: register changes")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20728>
(cherry picked from commit f73cdda983)
This is needed when switching away from GPGPU mode. See the previous
commit for anv. This is not likely to make a practical difference for
iris because it never switches back and forth between modes like anv.
Fixes: 172e0b0ebf ("iris: Update PIPELINE_CONTROL flush when switching pipeline mode in TGL+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20774>
(cherry picked from commit bd8e8d204d)
See the comments in emit_apply_pipe_flushes(). Flushing HDC is not
sufficient in GPGPU mode, and we need to set the untyped data port flush
bit as well.
Fixes many dEQP-VK failures with INTEL_COMPUTE_CLASS=1 on Alchemist.
Fixes: 1067ec90a5 ("anv: Update PIPELINE_CONTROL flush when switching pipeline mode in TGL+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20774>
(cherry picked from commit a8108f1d44)
Otherwise, the access chain you emitted last time may not dominate the
current use.
Fixes the following validation failure in
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.bits_unique_per_sample.multisample_texture_2:
UNASSIGNED-CoreValidation-Shader-InconsistentSpirv(ERROR / SPEC):
msgNum: 7060244 - Validation Error: [
UNASSIGNED-CoreValidation-Shader-InconsistentSpirv ] Object 0: handle =
0x55cf3cea2c60, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bbb14 |
SPIR-V module not valid: ID '67[%67]' defined in block '23[%23]' does
not dominate its use in block '31[%31]'
Fixes: 8899f6a198 ("zink: fix gl_SampleMaskIn spirv generation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20756>
(cherry picked from commit 4286633eec)
Fixes validation failures:
Test case 'dEQP-GLES31.functional.android_extension_pack.shaders.es32.extension_directive.oes_sample_variables'..
MESA: error: Validation Error: [
UNASSIGNED-CoreValidation-Shader-InconsistentSpirv ] Object 0: handle =
0x563a1838b790, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bbb14 |
SPIR-V module not valid: [VUID-StandaloneSpirv-Flat-04744] Fragment
OpEntryPoint operand 31 with Input interfaces with integer or float type
must have a Flat decoration for Entry Point id 4.
%gl_SampleId = OpVariable %_ptr_Input_uint Input
Test case 'KHR-GL46.shader_ballot_tests.ShaderBallotAvailability'..
MESA: error: Validation Error: [ UNASSIGNED-CoreValidation-Shader-InconsistentSpirv ] Object 0: handle = 0x5558e12f17e0, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bbb14 | SPIR-V module not valid: [VUID-StandaloneSpirv-Flat-04744] Fragment OpEntryPoint operand 28 with Input interfaces with integer or float type must have a Flat decoration for Entry Point id 4.
%gl_SubgroupLocalInvocationId = OpVariable %_ptr_Input_uint Input
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20756>
(cherry picked from commit 2a33d509ca)
I've been running into failures with tests like :
dEQP-VK.robustness.robustness2.bind.notemplate.rgba32i.unroll.nonvolatile.uniform_buffer_dynamic.no_fmt_qual.len_4.samples_1.1d.frag
With the load_global_const_block_intel NIR intrinsic, you can load a
vec8/vec16 with a predicate. The predicate is correctly uniformized to
feed into the SEND instruction's flag register.
The problem is that a series of optimization first remove the
find_live_channel and then changes the broadcast into a simple MOV
instruction, on the assumption that the first channel is always active
if there is not control flow. This is correct.
But after that the cmod optimzation will remove this instruction :
mov.nz.f0.0(16) null:D, vgrf16+0.0<0>:D NoMask
because it seems to be equivalent to :
cmp.g.f0.0(16) vgrf16:D, vgrf12:D, 63d
In this case vgrf16 is the predicate to the load block SEND
instruction. Since the execution mask is different between both, some
of the channels of the SEND instruction end up not being loaded or
loaded with the wrong predication and we end up with incorrect UBO
data.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20852>
(cherry picked from commit a50d2fdb46)
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT = 0x01 has been incidently the
correct memory type index, but isn't guaranteed to be, which is why it
hasn't caused issues yet
Fixes: f9515d93 ("zink: allocate/place memory using memoryTypeIndex directly")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20264>
(cherry picked from commit c71287e70c)
When this was last changed, the smoothing_enabled flag seems to have
been forgotten about, breaking line-smoothing (and probably also polygon
smoothing).
Fixes: 4147add280 ("radeonsi: update db_eqaa even if msaa is disabled")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20810>
(cherry picked from commit 9f4f131f2e)
Conflicts:
src/amd/ci/radeonsi-raven-fails.txt
src/amd/ci/radeonsi-stoney-fails.txt
unlocks were incorrectly added to paths using dri2_egl_display() as
well as those using dri2_egl_display_lock()
pthread_mutex_unlock() when unlocked is documented by posix as
being undefined behaviour. On OpenBSD pthread_mutex_unlock() will call
abort(3) if this happens.
Fixes: f1efe037df ("egl/dri2: Add display lock")
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20712>
(cherry picked from commit 0594b3c143)
It was wrong. For example, if the draw vertex count was 10 and the upload
vertex count was 150, u_vbuf wouldn't unroll the draw and would instead
memcpy 150 vertices. This fixes that case.
Fixes: 068a3bf0d7 - util: move and adjust the vertex upload heuristic equation from u_vbuf
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20824>
(cherry picked from commit 4f6e785876)
this is an artifact of very old code before the dynamic state was set
for all graphics pipelines
now the checks only cause blend constants to not be updated, which triggers
bugs and validation failures
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20799>
(cherry picked from commit b4d18f2ad1)
offset field in kgsl_command_object is NOT used by KGSL, so
we should offset directly to iova.
Fixes weird hangs on KGSL. E.g. fixes the hang in:
dEQP-VK.memory.pipeline_barrier.transfer_dst_storage_texel_buffer.1024
cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20795>
(cherry picked from commit 926f626b95)
For a depth-only-with-discard pipeline, spi_shader_col_format needs to be
fixed up to a single channel export, or otherwise discard will not work.
Since col_format can change depending on the dynamic state, precompute the
need for this workaround on pipeline creation and apply it when emitting
prolog states.
Fixes: eb07a11b8f ("radv: add support for compiling PS epilogs on-demand")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20704>
(cherry picked from commit 1617dac6c3)
Because anv_execbuf_add_bo_bitset() calls anv_execbuf_add_bo(), which
can fail if its memory allocations fail.
I have seen dEQP tests exercising memory allocation failures during
anv_execbuf_add_bo(), but I don't think the path coming from
add_bo_biset() was specifically exercised. Anyway, add the error check
just in case.
v2: Rebase.
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
stable:
reapplied to src/intel/vulkan/anv_batch_chain.c, as it has not been
moved to src/intel/vulkan/i915/ in 23.0
In anv_execbuf_add_syncobj(), we try to not create or use
exec->syncobj_values if we don't need to. But when we figure we're
going to need it (i.e., when timeline_value is not zero), then we
create exec->syncobj_values with vk_zalloc, which means every previous
value is set to zero, as it should be. This is all correct.
The problem starts when we add a 16th element. In this case we double
exec->syncobj_array_length and realloc the buffer by using vk_alloc
and copying the old array to the new one. After that, we write the
timeline_value to the array only if it's not zero, and that's the
problem: since we just used vkalloc and memcpy, we don't have any
guarantees that the new array will be zero after the 16th element, and
if timeline_value is zero we write nothing to that position.
Once we start using exec->syncobj_values we have to commit to using
it, so the "if (timeline_value)" check near the end of the function
has to be changed to "if (exec->syncobj_values)", so we actually set
elements after the 16th to zero when they need to be zero. Another
approach to fix this would be to memset the new elements once we
double syncobj_array_length.
In practice, I couldn't find any application or deqp test that used
more than 3 elements in exec->syncobj_array_length, and we need more
than 16 elements in order to be able to reproduce the bug, so I'm not
aware of any real-world bug that goes away with this patch. This issue
was found while reading code.
If we craft a little Vulkan program that submits a ton of timeline and
binary semaphores on vkQueueSubmit, then waits for them, we get the
following error without this patch:
MESA: error: ../../src/intel/vulkan/anv_batch_chain.c:1910: execbuf2 failed: Invalid argument (VK_ERROR_DEVICE_LOST)
v2: Rebase.
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
(cherry picked from commit ad6a036a68)
Conflicts:
src/intel/vulkan/i915/anv_batch_chain.c
Stable:
reapplied to src/intel/vulkan/anv_batch_chain.c, as this hasn't been
moved in the staging branch.
internal_format is the "real" format of a resource, and the "real" format
of imported resources is the external-facing format, not the pipe format
this ensures the correct format is available for internal ops, such as nplanes queries
Fixes: 2e2775c11b ("zink: fix PIPE_RESOURCE_PARAM_NPLANES with format modifier")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20753>
(cherry picked from commit 072e29a22e)
When the base array layer of the image view is > 0, addrlib computes
the offset (in HwlComputeSubResourceOffsetForSwizzlePattern) which is
then added to the base VA in RADV. But if the driver doesn't reset
the base array layer, the hw will compute incorrect addressing
(ie. base array will be added twice). This also matches AMDVLK.
This fixes a VM fault followed by a GPU hang on RDNA2 when trying
to join a multiplayer game with medium settings in Halo Infinite.
Fixes: 98ba1e0d81 ("radv: Fix mipmap views on GFX10+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20761>
(cherry picked from commit 8d191b2cfb)
The ideal case for performance is to have a single buffer for
all display list. The caveat is that large buffers are less
likely to be freed because they're refcounted: it only takes
1 user (diplay list) to keep it in VRAM.
This lowers VRAM usage when replaying the trace attached
of the trace attached to !6140 from 5.5 GB to about 1.8 GB.
Viewperf snx performance isn't affected.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6140
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20748>
(cherry picked from commit 0f5c8c3dc3)
grow_vertex_storage may call wrap_filled_vertex, which will
trigger the assert incorrectly because the new size will be
smaller than 'new_size' but it's correct because
'vertex_store->used' has been reset to 0.
Fixes: a08baaff97 ("vbo/dlist: fix indentation in vbo_save_api.c")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20748>
(cherry picked from commit 491f6b138e)
This avoids a violation of the Vulkan memory model that was leading to
intermittent failures of at least 8k test-cases of the Vulkan CTS
(within the group dEQP-VK.memory_model.*) on TGL and DG2 platforms.
In theory the issue may be reproducible on earlier platforms like IVB
and ICL, but the SYNC.ALLWR instruction is not available on those
platforms so a different (likely costlier) fix will be needed.
The issue occurs within the sequence we emit for a NIR memory barrier
with acquire semantics requiring the synchronization of multiple
caches, e.g. in pseudocode for a barrier involving the TGM and UGM
caches on DG2:
x <- load.ugm // Atomic read sequenced-before the barrier
y <- fence.ugm
z <- fence.tgm
wait(y, z)
w <- load.tgm // Read sequenced-after the barrier
In the example we must provide the guarantee that the memory load for
x is completed before the one for w, however this ordering can be
reversed with the intervention of a concurrent thread, since the UGM
fence will block on the prior UGM load and potentially take a long
time, while the TGM fence may complete and invalidate the TGM cache
immediately, so a concurrent thread could pollute the TGM cache with
stale contents for the w location *before* the UGM load has completed,
leading to an inversion of the expected memory ordering.
v2: Apply the workaround regardless of whether the NIR barrier
intrinsic specifies multiple storage classes or a single one,
since an acquire barrier is required to order subsequent requests
relative to previous atomic requests of unknown storage class not
necessarily specified by the memory scope information of the
intrinsic.
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20690>
(cherry picked from commit 4a2e7306dd)
Commit e07c5467 ("v3dv/format: use XYZ1 swizzle for three-component formats")
removes the only code that handled the clamp_to_transparent_black_border
variable. Therefore, the variable can be deleted, as it is not currently
being used.
Fixes: e07c5467 ("v3dv/format: use XYZ1 swizzle for three-component formats")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20746>
(cherry picked from commit 86c9bdcd9a)
We don't use a base workgroup ID for BLOCS. It needs to be lowered, or
else we'll assert fail when compiling the compute shader.
(Note for stable: this patch doesn't fix a bug in 4abdecce22
specifically, but rather is a missing patch that needed to go along with
the rest of MR 20068, on whichever branches it exists on.)
Fixes: 4abdecce22 ("iris: Lower load_base_workgroup_id to zero")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20750>
(cherry picked from commit a6c6a4ad04)
We cannot change this, as it has already been communicated to app
partners. Also this breaks chrome's GPU quirk matching (which in some
cases is non-gpu-related, but when all you have is a hammer, everything
looks like a nail).
Fixes: 9c1fbc076a ("Return 'Mesa' for GL_VENDOR for community drivers")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20757>
(cherry picked from commit 6f91a5ab07)
In the original change I noticed that missing robustness on swkms seemed
to be an oversight, since it was enabled on sw-non-kms, so I exposed the
ext based on the underlying pipe query. However it turns out that there
is a dri_screen flag for allowing robust contexts that exists to do error
checking for GLX, which was under an !swkms check. So we would expose the
ext, but then throw an error if you tried to create one.
Fixes: e6285ea55f ("egl: Replace the robustness DRI2 ext check with a pipe cap query.")
Closes: #8066
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20679>
(cherry picked from commit 6b0db6bf8b)
opaque should not be set when logicops are enabled, that needs blending
even on Bifrost. Fixes is for when I believe the bug became possible to hit.
The logical error is older.
Fixes Piglit logicop tests again.
Fixes: d849d9779a ("panfrost: Avoid blend shader when not blending")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20685>
(cherry picked from commit 41d99c10d1)
Unlike literally every other mesa/st emulation, for some inexplicable reason we
need to pretend to support the CAP and then set a different EMULATE cap instead
of the emulation keying off the lack of support for the CAP. Set the CAPs
accordingly so we get NV_primitive_restart (with emulation of non-fixed
indices).
This gets Mesa to advertise GL 3.1 on Mali-G57 as intended.
Fixes: 30c14f54cf ("panfrost: Disable PIPE_CAP_PRIMITIVE_RESTART on v9")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20702>
(cherry picked from commit fe4dc59e99)
Future changes to nir_lower_blend cause fsat(reg.yx) instructions to be
generated, which correspond to "FCLAMP.v2f16 x.h10" pseudoinstructions. These
get their swizzles lowered, but we forgot to clear the swizzle out, so we end up
with extra swap (cancelling out the intended swizzle).
Fix the lowering logic.
Fixes: ac636f5adb ("pan/bi: Use FCLAMP pseudo op for clamp prop")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20683>
(cherry picked from commit ed46c617b0)
We set the different data being freed to NULL after freeing it, and
checks for NULL before freeing it.
This fixes several double free crash with v3dv, when running OOM wsi
tests, like for example:
dEQP-VK.wsi.xlib.swapchain.simulate_oom.composite_alpha
Although note that only one person got those on a new fresh install of
the Raspbian OS, so this problem was rare.
Fixes: 5b13d74583 ("vulkan/wsi/drm: Break create_native_image in pieces")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20695>
(cherry picked from commit b27e42dcb5)
This isn't allowed for the same reason that AFBC of regular luminance-alpha
isn't allowed (and will raise DATA_INVALID_FAULTs). Reorder the checks to
ensure these formats are checked.
Fixes Piglit texwrap GL_EXT_texture_sRGB-s3tc.
Fixes: 476be5cb27 ("panfrost: Don't use texture format swizzles on v7")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20686>
(cherry picked from commit 5fdfd8044d)
If we pass a physical 2D texture descriptor we should also pass 2
coords. Otherwise it just uses the random content in the second
register which ends up funny sometimes.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20696>
(cherry picked from commit edca10e9c9)
If the previously emitted graphics pipeline uses the value A for
col_format_non_compacted and the new bound graphics pipeline uses B.
At bind time, radv_cmd_state::col_format_non_compacted will be set to
B and the rbplus flag will be dirtied. But if there is no draws and a
new graphics pipeline is bound with the same value as A, the next
draw will emit the rbplus state with B instead of A.
This can be basically triggered with meta operations after drawing
because the driver saves/restores the bound pipeline.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8073
Fixes: 11469f7553 ("radv: copy the non-compacted color format at pipeline bind time")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20692>
(cherry picked from commit 5b3fb44ecc)
I looked through the changes to features.txt since the 22.3 branchpoint
and added those to the new_features list as well. It was pointed out
that VK_KHR_present_wait wasn't in the list of new features.
2023-01-13 09:51:01 -08:00
9701 changed files with 898780 additions and 2211579 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.