some gnome tests are seeing this:
==4579== Invalid read of size 4
==4579== at 0x161B28FB: UnknownInlinedFun (simple_mtx.h:106)
==4579== by 0x161B28FB: st_save_zombie_sampler_view (st_context.c:213)
==4579== by 0x161D762A: st_texture_release_all_sampler_views.part.0 (st_sampler_view.c:272)
==4579== by 0x161D7CB6: st_texture_release_all_sampler_views (st_sampler_view.c:258)
==4579== by 0x161D7CB6: st_delete_texture_sampler_views (st_sampler_view.c:292)
==4579== by 0x16191B9E: _mesa_delete_texture_object (texobj.c:523)
==4579== by 0x16191CDC: _mesa_reference_texobj_ (texobj.c:637)
==4579== by 0x1619632F: _mesa_reference_texobj (texobj.h:92)
==4579== by 0x1619632F: _mesa_free_texture_data (texstate.c:1114)
==4579== by 0x162EEE1D: _mesa_free_context_data (context.c:1155)
==4579== by 0x161B3C0F: st_destroy_context (st_context.c:999)
==4579== by 0x160F155A: dri_destroy_context (dri_context.c:277)
==4579== by 0x1603C371: dri2_destroy_context (egl_dri2.c:1592)
==4579== by 0x1602EBF5: eglDestroyContext (eglapi.c:918)
==4579== by 0x502DF3C: gdk_gl_context_dispose (gdkglcontext.c:211)
==4579== Address 0x34dc9b08 is 7,016 bytes inside an unallocated block of size 10,336 in arena "client"
==4579==
It appears we destroy the mutex and zombie objects, but freeing
context data seems to add them back.
This might not be the complete answer.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28565>
(cherry picked from commit f8e48b561e)
We need invalidate CCHE when we optimize out an invalidation of UCHE,
for example a storage image write to texture read. We missed this
earlier because of the blob's tendency to always over-flush, but the
blob does use this when building acceleration structures.
Fixes: 95104707f1 ("tu: Basic a7xx support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28445>
(cherry picked from commit fb1c3f7f5d)
this breaks interfaces where the consumer reads its input indirectly
and only some of the components are written by clobbering all
the components, even if the unwritten components are never accessed
Fixes: 459b49a174 ("zink: add a new linker pass to handle mismatched i/o components")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28466>
(cherry picked from commit 460cd99ea5)
A last memory leak related to constants_remap_table is happening.
This memory leak is triggered by two deqp-gles2 tests.
For instance, this issue is triggered with
"deqp-gles2 --deqp-case=dEQP-GLES2.functional.uniform_api.random.13":
Direct leak of 336 byte(s) in 1 object(s) allocated from:
#0 0x7f1b4a5de7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f1b401a2cdf in rc_remove_unused_constants ../src/gallium/drivers/r300/compiler/radeon_remove_constants.c:101
#2 0x7f1b40185386 in rc_run_compiler_passes ../src/gallium/drivers/r300/compiler/radeon_compiler.c:476
#3 0x7f1b40185625 in rc_run_compiler ../src/gallium/drivers/r300/compiler/radeon_compiler.c:498
#4 0x7f1b401c14d2 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:172
#5 0x7f1b401b669a in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#6 0x7f1b401baf73 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:592
#7 0x7f1b40128db7 in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1071
#8 0x7f1b3e67799d in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1073
#9 0x7f1b3e680285 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1119
#10 0x7f1b3e6812fa in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1284
#11 0x7f1b3e6812fa in st_finalize_program ../src/mesa/state_tracker/st_program.c:1363
#12 0x7f1b3f13d501 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:754
#13 0x7f1b3f13d501 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:990
#14 0x7f1b3efeef75 in link_program ../src/mesa/main/shaderapi.c:1336
#15 0x7f1b3efeef75 in link_program_error ../src/mesa/main/shaderapi.c:1445
Fixes: 29df85788a ("r300: fix constants_remap_table memory leak")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28522>
(cherry picked from commit 24a5165cdf)
For BROADCAST and MOV_INDIRECT, required_exec_type was returning
brw_int_type(type_sz(t), false), which is an unsigned type. However,
get_exec_type(inst) returns the original type for either Q or UQ.
This meant that has_invalid_exec_type would detect a mismatch and
trigger lowering.
That lowering would insert new 64-bit MOVs, which would need to be
lowered on platforms which don't support Q/UQ. Except, we already
ran that lowering pass earlier. So, the unlowered Q/UQ MOVs would
reach the software scoreboarding pass, and trigger failures in the
inferred_exec_pipe() function, as no pipe is available to handle
64-bit integer operations.
It turns out that we don't need the region lowering pass to do
anything for these opcodes. The generator code for both BROADCAST
and MOV_INDIRECT already handle decomposing Q/UQ operations into
32-bit MOVs when they're not supported. And, it also implicitly
converts to integer types, even for floating point sources. The
inferred_exec_pipe function already special cases them to note
that they'll always be handled on the integer pipe, so that matches.
Just drop the region lowering code for these opcodes.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit 7a24f29fbb)
We are overriding the type to Q/UQ, so we need to split to two MOVs
if 64-bit integer math is not supported. For reference, Meteorlake
does support 64-bit floats but would still not work correctly here.
See also brw_broadcast(), which does similar indirects but correctly
checks has_64bit_int instead of has_64bit_float.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit a90edad9f7)
==134077== 96 bytes in 1 blocks are definitely lost in loss record 1 of 3
==134077== at 0x4840808: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==134077== by 0x6D6F690: vk_default_alloc (vk_alloc.c:26)
==134077== by 0x52EEEBE: vk_alloc (vk_alloc.h:48)
==134077== by 0x52EEEEE: vk_zalloc (vk_alloc.h:56)
==134077== by 0x52EF47E: xe_exec_process_syncs (anv_batch_chain.c:132)
==134077== by 0x52EF8F6: xe_execute_trtt_batch (anv_batch_chain.c:215)
==134077== by 0x5301670: anv_queue_submit_trtt_batch (anv_batch_chain.c:1697)
==134077== by 0x603D135: gfx125_write_trtt_entries (genX_cmd_buffer.c:6091)
==134077== by 0x5370B44: anv_sparse_bind_trtt (anv_sparse.c:595)
==134077== by 0x5370CFC: anv_sparse_bind (anv_sparse.c:629)
==134077== by 0x5370E6E: anv_init_sparse_bindings (anv_sparse.c:670)
==134077== by 0x5328037: anv_CreateBuffer (anv_device.c:5071)
Note to backporters: this is only for when xe.ko is being used and
ANV_SPARSE_USE_TRTT=1 is exported. This is not the regular code path.
Fixes: 18bd00c024 ("anv/trtt: don't wait/signal syncobjs using the CPU anymore")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28455>
(cherry picked from commit 38af7254e2)
Xe KMD don't refcount anything, so resources could be freed while they
are still in use if we don't wait for exec_queue to be idle.
This issue was found with Xe KMD error capture, VM was already
destroyed when it attemped to capture error state but it can also
happen in applications that did not hang.
This fixed the '*ERROR* GT0: TLB invalidation' errors when running
piglit all test list.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27500>
(cherry picked from commit 665d30b544)
When LLVM 18 is used, pass the no-verify-fixpoint option when
running the instcombine pass. Otherwise LLVM may abort with an
error.
The background here is that this option is enabled by default for
testing purposes, because instcombine is normally only explicitly
invoked like this inside tests. If it is used in an actual
production pipeline, the no-verify-fixpoint option needs to be
enabled.
This should fix the issue reported at
https://bugzilla.redhat.com/show_bug.cgi?id=2268800.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28101>
(cherry picked from commit 99f0449987)
When tu_shader object is destroyed through vk_pipeline_cache, the relevant
destroy callback should relay to the general tu_shader_destroy function
that will also clean up owned resources.
During shader creation, the ir3_shader object should be destroyed once the
shader variants are retrieved. Since those variants are owned by tu_shader
they should be freed up in tu_shader_destroy.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: a03525d8db ("tu: Split program draw state into per-shader states")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27847>
(cherry picked from commit 3ee81ffe14)
It was by pure luck that all sources (and the result) of nir_dpas_intel
had the same number of components. It is possible to support matrix
sizes where the accumlator matrix and the result matrix are larger
(e.g., 16x8 * 8x16 = 16x16).
This breaks all of the assumptions of NIR's infrastructure for code
generating intrinsics. Fix the by making the accumulator matrix be the
first source. The accumulator and the result will always have the same
dimensions (due to rules of matrix multiplication) and the same type
(due to restructions of the cooperative matrix extension). This forces
them to have the same number of components.
This doesn't fix all the potential problems. NIR expects that all
0-sized sources will have the same number of components. This just
ensures that the result has the correct number of components.
Fixes: 6b14da33ad ("intel/fs: nir: Add nir_intrinsic_dpas_intel")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
(cherry picked from commit a8115221e5)
If the destination was the accumulator but is no longer, having the flag
set is not correct. On Xe2 this also causes a validation error.
v2: Reword the comment to be more clear. Suggested by Jordan.
Fixes: efa4e4bc5f ("intel/fs: Introduce regioning lowering pass.")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
(cherry picked from commit be4fa59a72)
This reverts commit be8b7980e6.
Store the cache entry so that the fast path picks up the optimized
pipeline when its available from a background optimized_compile_job().
Observed traces where it would take the fast path back and forth using
an unoptimized pipeline and never pick up the optimized pipeline leading
to >50% fps drop.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28440>
(cherry picked from commit d6978b1af2)
opt_copy_propagation() can sometimes propagate FIXED_GRF sources into
SHADER_OPCODE_SENDs as the message payload. For example, GS input
reads, which simply take a URB handle and have the offset in the
descriptor. For non-VGRFs, there isn't a payload to split, so just
skip past such send messages.
Fixes: 589b03d02f ("intel/fs: Opportunistically split SEND message payloads")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28067>
(cherry picked from commit ba11127944)