In order to turn on/off through SNMP DuT under PoE switch, the SNMP key
in some vendors don't directly use the interface number, but a number
shifted a base number.
Define this base number as BM_POE_BASE environment in the runner.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29306>
(cherry picked from commit 90f8be9bda)
Macro values that define values for different HW generations should
use the V3DV_X helper instead of being defined under a V3D_VERSION #if
condition.
Without this change, the original V3D_CLE_READAHEAD and
V3D_CLE_BUFFER_MIN_SIZE definitions used were only working for 4.2 HW.
For the 7.1 HW (RPi5) the 4.2 definitions were applied.
The CLE MMU errors were hidden as they were reported at dmesg as
"MMU error from client PTB (1) at 0x1884200, pte invalid" instead of
client CLE. So fixes all v3d dmesg warnings for PTB MMU errors on RPi5.
With this change we really don't need different functions per HW generation,
so we rename back file v3dx_cl.c to v3d_cl.c. As before, we can use
only the packets definitions for 4.2 HW as they use the same opcode as 7.1 HW.
Fixes: 11dce2ac81 ("v3d: fix CLE MMU errors avoiding using last bytes of CL BOs.")
Fixes: e2c624e74e ("v3d: Increase alignment to 16k on CL BO on RPi5")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29496>
(cherry picked from commit f32a258503)
Macro values that define values for different HW generations should
use the V3DV_X helper instead of being defined under a V3D_VERSION #if
condition.
Without this change, the original V3D_CLE_READAHEAD and
V3D_CLE_BUFFER_MIN_SIZE definitions used were only working for 4.2 HW.
For the 7.1 HW (RPi5) the 4.2 definitions were applied.
The CLE MMU errors were hidden as they were reported at dmesg as
"MMU error from client PTB (1) at 0x1884200, pte invalid" instead of
client CLE. So fixes all v3dv dmesg warnings for PTB MMU errors on RPi5.
With this change we really don't need different functions per HW generation,
so we rename back file v3dvx_cl.c to v3dv_cl.c. As before, we can use
only the packets definitions for 4.2 HW as they use the same opcode as 7.1 HW.
It fixes also an indentation error introduced with 26c8a5cd72.
Fixes: bb77ac983e ("v3dv: Increase alignment to 16k on CL BO on RPi5")
Fixes: 26c8a5cd72 ("v3dv: fix CLE MMU errors avoiding using last bytes of CL BOs.")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29496>
(cherry picked from commit 07d3d55783)
The SamplerDescriptor structure has a field which describes how
floating point coordinates should be converted to fixed point.
Setting this to "true" (which causes round to nearest even) fixes
a failing CTS test.
The CTS test in question is:
dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_float_color
The OpenGL spec is somewhat vague about how rounding is to be
performed, so it appears both settings should be legal; this may
indicate a problem with the CTS. Nevertheless "round to nearest even"
is probably a better default and since it fixes the failing test we
may as well use it.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29464>
(cherry picked from commit d91d2c275e)
Currently for an "unavailable" query, if VK_QUERY_RESULT_PARTIAL_BIT is
set, anv will return (slot.end - slot.begin). This can cause underflow
because slot.end might still be at the initial value of 0.
This commit fixes the issue by returning 0 in that situation.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29447>
(cherry picked from commit f8ccf70c99)
The queueFlags of the associated queue may have more flags than just the
type of queue it is, based on what that queue supports, like sparse or
protected content. Check that the queue is a blitter engine instead.
Fixes a bunch of dEQP-VK.api.copy_and_blit.core.*_transfer on MTL with
ANV_SPARSE=0
Fixes: 17b8b2cffd ("anv: Add support for a transfer queue on Alchemist")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29336>
(cherry picked from commit 8d098ecfea)
The code for checking flow control did not realize that
`LD_TEX` and `LD_TEX_IMM` were memory accesses, and hence was
not inserting waits where these were necessary. This showed up
as flakes in KHR-GLES31.core.shader_image_load_store.basic-glsl-misc-fs
Cc: mesa-stable
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29363>
(cherry picked from commit 272dcaff01)
In the sort functions used to sort varyings in gl_nir_link_varyings,
we were only checking the first input for whether or not it is xfb.
Check both inputs, and also provide a definite order for the xfb vs.
non-xfb varyings (the xfb come last, as the initial sort established).
This fixes a problem encountered on panfrost, where qsort could
mix xfb and non-xfb varyings which started out separate.
Note that the sort is still not stable. We probably should make it
stable, but that is a more extensive change that's handled in a later
commit.
Cc: mesa-stable
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29178>
(cherry picked from commit 5102a922e7)
We can emit spill setup before RA if we use scratch. In that case
we have the same situation as during spilling, with the caveat that
we have already emitted the instructions so we need to find them
(they should be the only instructions ones before the instructions
accessing payload registers) and flag them as such.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29343>
(cherry picked from commit 865e682ad7)
We read our payload registers first in the shader so we generally don't have
to care about temps being allocated to them and stomping their value before
we can read them. Hoewer, spilling setup instructions are an exception since
these will be inserted first when there is any spilling in the program.
To fix this, we flag RA nodes involved with these instructions so we can
then try to avoid assiging these registers to them.
Fixes CTS failures with V3D_DEBUG=opt_compile_time, particularly:
dEQP-VK.binding_model.buffer_device_address.set0.depth2.basessbo.convertcheckuv2.nostore.single.std140.comp_offset_nonzero
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29343>
(cherry picked from commit cb83f25b39)
When compilation is required, we should return
VK_PIPELINE_COMPILE_REQUIRED. The spec prevents the application from
passing a module or SPIR-V code so we have nothing to compile if the
cache lookup fails :
VUID-VkPipelineShaderStageCreateInfo-stage-06844:
If a shader module identifier is specified for this stage, a
VkShaderModuleCreateInfo structure must not be present in the pNext
chain
VUID-VkPipelineShaderStageCreateInfo-stage-06848:
If a shader module identifier is specified for this stage, module
must be VK_NULL_HANDLE
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11208
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29340>
(cherry picked from commit 5f2288095b)
The GLES spec limits the valid combinations of format and type that
may be returned by queries and/or used by ReadPixel. The list of valid
combinations appears in table 8.2 of the GLES 3.2 spec. Our code for
reporting the type and format of the current framebuffer, however,
does not verify that the combination is legal for GLES. For example,
RGBA and UNSIGNED_SHORT_1_5_5_5_REV is not a valid GLES combination,
but it's what we were returning for a panthor 16 bit frame buffer.
We can fix this either by changing the format or type that we return
(internally we can handle any format/type combination). We advertise the
read_format_bgra extension, so we could return GL_BGRA for the format.
However, very few applications (including notably the Khronos CTS for GLES)
cope well with BGRA. So instead we change the type to a non-_REV one
so that the combination appears in the GLES spec table of legal values.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29144>
(cherry picked from commit 4d298673da)
As we are marking the last V3D_CLE_READAHEAD bytes as unusable we don't
need to reserve V3D_CL_MAX_INSTR_SIZE bytes for the CLE packet.
This reverts c2601f0690 ("v3dv: ensure at least V3D_CL_MAX_INSTR_SIZE
bytes in last CL instruction")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29023>
(cherry picked from commit 7afebc15ce)
We increase the alignment to 16k for BOs allocated for the CL on RPi5 HW.
So we have the same ratio of usable space because of HW readahead as
than on RPi4, as readahead has been increased from 256 to 1024 bytes on
RPi5.
We have also concluded that when the kernel is running with 16k pages
that is the default on Raspberry Pi 5 HW, BO allocations are aligned to
16k so this increase has no cost and we would be using memory more
efficiently.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29023>
(cherry picked from commit bb77ac983e)
We increase the alignment to 16k for BOs allocated for the CL on RPi5 HW.
So we have the same ratio of usable space because of HW readahead as
than on RPi4, as readahead has been increased from 256 to 1024 bytes on
RPi5.
We have also concluded that when the kernel is running with 16k pages
that is the default on Raspberry Pi 5 HW, BO allocations are aligned to
16k so this increase has no cost and we would be using memory more
efficiently.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29023>
(cherry picked from commit e2c624e74e)
The last V3D_CLE_READAHEAD bytes of the CLE buffer are unusable because
using them would prefetch the next readahead bytes of the CL that would
be outside the allocated BO. To guarantee that we can chain a BO to the
current CL we always reserve space for the BRANCH or
RETURN_FROM_SUB_LIST packets.
Not taking this into account has been generating kernel dmesg errors like
"MMU error from client CLE".
As V3D_CLE_READAHEAD is different from RPi4 (256 bytes) to RPi5 (1024 bytes).
So we needed to rename v3dv_cl.c to v3dvX_cl.c to have different objects per
V3D_VERSION.
Extra assertions have been included to validate that we don't write
packets over the usable size of the CL silently.
v2: - Do not declare unusable the space needed for the BRANCH packet,
but take it into account for all space reservations.
v3: - Squash here ("v3dv: Secondary CL needs also to handle CLE readahead")
- Remove spureous parenthesis (Iago Toral)
- Refactor to avoid checking for needs_return_from_sub_list inside
cl_alloc_bo adding unusable_space as new parameter.
v4: - Improved logic for chaining BOs moving it to cl_alloc_bo using
a new enum v3dv_cl_chain_type to identify the different kinds
of BO chaining. Now we increase the size of the BO just before
submitting the BRACH/RETURN_FROM_SUB_LIST packages.
v5: - Assert on BO size updates that we are within the BO size.
(Iago Toral)
v6: - Remove changes at cmd_buffer_end_render_pass_secondary as we
assumed that cl->bo was already allocated when ending the
secondary CL, but it can be NULL. And this was already handle
by current code.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29023>
(cherry picked from commit 26c8a5cd72)
The last V3D_CLE_READAHEAD bytes of the CLE buffer are unusable because
using them would prefetch the next readahead bytes of the CL that would
be outside the allocated BO. To guarantee that we can chain a BO to the
current CL we always reserve space for the BRANCH packet.
Not taking this into account has been generating kernel dmesg errors like
"MMU error from client CLE".
As V3D_CLE_READAHEAD is different from RPi4 (256 bytes) to RPi5 (1024 bytes).
So we needed to rename v3d_cl.c to v3dX_cl.c to have different objects per
V3D_VERSION.
Extra assertions have been included to validate that we don't write
packets over the usable size of the CL silently.
v2: - Remove spurious blank line (Iago Toral)
- Do not declare unusable the space needed for the BRANCH packet,
and take it into account for all reservations.
v3: - Handle BRANCH packet reserve only when CLE BO allocation is done.
v4: - Assert on BO size updates that we are within the BO size.
(Iago Toral)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29023>
(cherry picked from commit 11dce2ac81)
The original fix from
0f3370eede ("raseonsi/vcn: fix a h264 decoding issue")
would in some cases also trigger for I frames with interlaced streams.
Instead of checking used_for_reference_flags, use slice type and
only add one reference for P/B frames if needed.
This change still fixes playback of the sample from the original issue,
avoids the issue with interlaced streams and also fixes the case where
application provides no references at all.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11060
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29055>
(cherry picked from commit 5f4a6b5b00)
This change handles the case when "vertex_fetch_shader.cso" is null,
it implements the previous behavior in this specific case. This
situation is happening with clover.
For instance, this issue is triggered with "piglit/bin/cl-custom-buffer-flags":
==6467==ERROR: AddressSanitizer: SEGV on unknown address 0x00000000000c (pc 0x7ff92908fe6e bp 0x7ffe86ae5ad0 sp 0x7ffe86ae5a30 T0)
==6467==The signal is caused by a READ memory access.
==6467==Hint: address points to the zero page.
#0 0x7ff92908fe6e in evergreen_emit_vertex_buffers ../src/gallium/drivers/r600/evergreen_state.c:2123
#1 0x7ff92908444b in r600_emit_atom ../src/gallium/drivers/r600/r600_pipe.h:627
#2 0x7ff92908444b in compute_emit_cs ../src/gallium/drivers/r600/evergreen_compute.c:798
#3 0x7ff92908444b in evergreen_launch_grid ../src/gallium/drivers/r600/evergreen_compute.c:927
#4 0x7ff9349f9350 in clover::kernel::launch(clover::command_queue&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&) ../src/gallium/frontends/clover/core/kernel.cpp:105
#5 0x7ff9349c331d in std::function<void (clover::event&)>::operator()(clover::event&) const /usr/include/c++/11.4.0/bits/std_function.h:590
#6 0x7ff9349c331d in clover::event::trigger() ../src/gallium/frontends/clover/core/event.cpp:54
#7 0x7ff9349c82f1 in clover::hard_event::hard_event(clover::command_queue&, unsigned int, clover::ref_vector<clover::event> const&, std::function<void (clover::event&)>) ../src/gallium/frontends/clover/core/event.cpp:138
#8 0x7ff9348daa47 in create<clover::hard_event, clover::command_queue&, int, clover::ref_vector<clover::event>&, clEnqueueNDRangeKernel(cl_command_queue, cl_kernel, cl_uint, const size_t*, const size_t*, const size_t*, cl_uint, _cl_event* const*, _cl_event**)::<lambda(clover::event&)> > ../src/gallium/frontends/clover/util/pointer.hpp:241
#9 0x7ff9348daa47 in clEnqueueNDRangeKernel ../src/gallium/frontends/clover/api/kernel.cpp:334
Fixes: 659b7eb2 ("r600: better tracking for vertex buffer emission")
Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10079
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29163>
(cherry picked from commit f8a1d9f787)
There's a success path on found dmabuf while the iova won't be cleaned
up. This change defers iova alloc till lookup miss and also to prepare
for later racy dmabuf re-import fix.
Also documented a potential leak on error path due to unable to tell
whether a gem handle should be closed or not without refcounting.
Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29093>
(cherry picked from commit 6ca192f586)
If p_start_linear_vgpr allocates a VGPR that is already blocked, RA
will try moving the blocking VGPR somewhere else. If
p_start_linear_vgpr is inserted right before the branch, that move will
be inserted after exec has been overwritten, which might cause the move
to be skipped for some threads.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28041>
(cherry picked from commit 590ea76104)
For dma-buf, if the import and finish occur back-2-back for the same
dma-buf, zombie vma cleanup will unexpectedly close the re-imported
dma-buf gem handle. This change fixes it by trying to resurrect from
zombie vmas on the dma-buf import path.
Fixes: 63904240f2 ("tu: Re-enable bufferDeviceAddressCaptureReplay")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29173>
(cherry picked from commit a1392394ba)
Indeed, the object returned by LLVMCreatePassBuilderOptions()
was not freed.
For instance, this issue is triggered with "piglit/bin/cl-api-build-program":
Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x7f6b15abdf57 in operator new(unsigned long) (/usr/lib64/libasan.so.6+0xb2f57)
#1 0x7f6afff6529e in LLVMCreatePassBuilderOptions llvm-18.1.5/lib/Passes/PassBuilderBindings.cpp:83
#2 0x7f6b1186ee41 in optimize ../src/gallium/frontends/clover/llvm/invocation.cpp:521
#3 0x7f6b1186ee41 in clover::llvm::link_program(std::vector<clover::binary, std::allocator<clover::binary> > const&, clover::device const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) ../src/gallium/frontends/clover/llvm/invocation.cpp:554
#4 0x7f6b1150ce67 in link_program ../src/gallium/frontends/clover/core/compiler.hpp:78
#5 0x7f6b1150ce67 in clover::program::link(clover::ref_vector<clover::device> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, clover::ref_vector<clover::program> const&) ../src/gallium/frontends/clover/core/program.cpp:78
#6 0x7f6b11401a2b in clBuildProgram ../src/gallium/frontends/clover/api/program.cpp:283
Fixes: 2d4fe5f229 ("clover/llvm: move to modern pass manager.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29164>
(cherry picked from commit df39994d51)
Commit 7d9ea77b45 ("glx: add automatic zink fallback loading between hw and sw drivers")
added an automatic zink fallback even when the zink gallium is not
enabled at build time.
It leads to unexpected error log while loading drisw driver and
zink is not installed on the rootfs:
MESA-LOADER: failed to open zink: /usr/lib/dri/zink_dri.so
Fixes: 7d9ea77b45 ("glx: add automatic zink fallback loading between hw and sw drivers")
Signed-off-by: Romain Naour <romain.naour@smile.fr>
Reviewed-by: Antoine Coutant <antoine.coutant@smile.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27478>
(cherry picked from commit 02ab51a61e)
commit 1887368df4 ("glx/sw: check for modifier support in the kopper path")
added dri3_priv.h header and dri3_check_multibuffer() function in drisw that
can be build without dri3.
Commit 4477139ec2 added a guard around dri3_check_multibuffer()
function but not around dri3_priv.h header.
Add HAVE_DRI3 guard around dri3_priv.h header.
Fixes: 1887368df4 ("glx/sw: check for modifier support in the kopper path")
v2: Remove the guard around dri3_check_multibuffer() function.
Signed-off-by: Romain Naour <romain.naour@smile.fr>
Signed-off-by: Antoine Coutant <antoine.coutant@smile.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27478>
(cherry picked from commit 3163b65ba7)
We need them for the case where explicit sample locations are not
enabled. While we're at it, fix the case where rasterization_samples=0.
This can happen when rasterizer discard is enabled. This fixes MSAA
resolves with NVK+Zink. In particular, it fixes MSAA for the Unigine
Heaven and Valley benchmark.
This also fixes all of the spec@arb_texture_float@multisample-formats
piglit tests.
Fixes: 41d094c2cc ("nvk: Support dynamic state for enabling sample locations")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10786
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29147>
(cherry picked from commit a160c2a14e)
this otherwise leads to a mismatch where some types of screen may have
the mutex initialized while others don't, in which case dri_release_screen()
will attempt to destroy an uninitialized mutex
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29021>
(cherry picked from commit bc15c95c7a)
Using the same buffer without a barrier actually can lead to data races as
drivers might not properly synchronize the content. Using the stream
uploader is a neat fix which prevents us from having to use a barrier but
still keep high throughput when launching kernels back-to-back.
Fixes: 5ff33f9905 ("rusticl: use real buffer for cb0 for drivers prefering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27666>
(cherry picked from commit 8da8c6c2d8)
On pre-Valhall HW, the fragment shader metadata was part of the RSD
(renderer state descriptor), which was emitted at draw time, but
Valhall introduces a shader program descriptor containing only the
shader information, and this one is emitted at shader preparation
time.
If we don't add the FS state BO to batch, we might end up with a batch
being executed after the shader object has been destroyed, leading to
page faults when the GPU tries to access the shader program descriptor.
We make the panfrost_batch_add_bo() unconditional since it gracefully
handles the NULL case (which will happen on v7-).
Fixes: 087b63cb07 ("panfrost: Allow uploading fragment SPDs")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Antonino Maniscalco <antonino.maniscalco@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28926>
(cherry picked from commit 2cc317763c)
Currently ffmpeg has a bug in VAAPI AV1 decode that in some cases
it submits the same slice data buffer as many times as there is tiles.
However, in other cases it behaves correctly and all slice data buffers
contain different parts of bitstream to decode which this change breaks.
Now that the va frontend is passing correct offsets, this fixes decoding
AV1-TEST-VECTORS/av1-1-b8-22-svc-L1T2 with ffmpeg.
This reverts commit e6701f7231.
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28960>
(cherry picked from commit 88dfe04b08)
The slice parameter data offset refers to offset in the submitted data buffer.
When multiple slice data buffers are submitted, the offsets of all slices needs
to be adjusted to be based from the start of the first data buffer.
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28960>
(cherry picked from commit 6746d4df6e)
D16 rounds towards zero for fp32 -> fp16, but for fixed point it rounds to
nearest even in fp16. MIMG without D16 also rounds to nearest even, but in fp32.
This means D16 and f2f16_rtz(tex@32) can produce different results.
Sadly this also means we can never use d16 if fp16 rounding isn't undefined.
Cc: mesa-stable
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28730>
(cherry picked from commit 3a35522c8a)
Fixes fs-uint-to-float-of-extract-int8.shader_test and
fs-uint-to-float-of-extract-int16.shader_test added by piglit!883.
No shader-db or fossil-db changes on any Intel platform.
v2: Expand the comment explaining the potential problem. Suggested by
Caio.
Fixes: 29ce110be6 ("i965/fs: Remove extract virtual opcodes.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891>
(cherry picked from commit bf5d82654a)
When a job is submitted to the flush_queue the resource dt_idx is reset,
and if a readback is requested then we have to make sure that the
corresponding kopper_preset has finished before we can acquire the image
for readback, so wait for the according fence in this case.
This fixes the validation error UNASSIGNED-Threading-MultipleThreads-Write
triggered by piglit "read-front" lavapipe.
Fixes: 8ade5588e3
zink: add kopper api
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28127>
(cherry picked from commit 811ed62865)
if swapchain creation fails (e.g., insane cts swapchain configs), the
swapchain gets demoted to a non-window image that is still accessed by
the frontend. this image should not ever hit corresponding zink entrypoints
for swapchain-only images, which requires a flag to test swapchain-edness
cc: mesa-stable
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28904>
(cherry picked from commit a50c17802a)
optimized pipeline compile jobs may still be ongoing during ctx
destroy, and these must complete too or else crashes will occur
fixes shutdown crash with dEQP-EGL.functional.sharing.gles2.multithread.simple.images.texture_source.teximage2d_render
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28900>
(cherry picked from commit bd1a3921d1)
it's possible for a shader to be precompiling its separate shader variants
during destruction, which requires that the programs set be iterated
under lock in order to prune every new variant as it is created without
crashing
fixes crashes in spec@arb_separate_shader_objects@400 combinations.*
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28900>
(cherry picked from commit f18a1d3a31)
This array has 3 components, because it's meant to hold the X, Y and Z
components of the work-group size sysval. However, mir_pick_ubo assumes
vec4 for the push-uniforms, which ends up promoting this to 4
components.
So let's make sure we don't write that last component. It's not going to
do anything good.
In practice, this leads to the viewport descriptor being smashed, which
doesn't actually do any real-world harm, because this only happens in
compute batches where that descriptor is unused. However, writing
outside of arrays is undefined behavior, so we should fix it regardless.
Fixes: 5006167061 ("panfrost: Hook-up indirect dispatch support")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28856>
(cherry picked from commit 186f7fa915)
There're many cases in which the ring submissions must succeed. We don't
worry about real oom since things would fail earlier. For simulated oom
from random intentional allocs, there isn't robust way to fail those
must succeeds. e.g. the commands that don't have return codes or valid
error return struct defaults. So real oom propagation is still at best
effort.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28914>
(cherry picked from commit 3e16d25d1a)
We were accidentally leaving XY_BLOCK_COPY_BLT's Source and Destination
MOCS fields set to 0 (Error: Reserved for Non-Use) on Gfx12.0 systems.
This was causing assert fails in debug builds, since we try to ensure
that we don't do that. In theory, MOCS 0 is supposed to be equivalent
to MOCS 2 (all the caching), but...we probably ought to use MOCS 3
(uncached). Every Gfx12.5+ platform requires it, so although there
isn't a note about Gfx12.0 needing that, it's possible that it does.
We're currently only using the blitter for DRI PRIME blits on Gfx12.0,
anyway, and I think we're flushing all the caches regardless.
This bug was somewhat obscure to hit:
- You need a hybrid graphics system with Gfx12.0 and some other GPU
- You have to be using "reverse PRIME", i.e. rendering on the integrated
GPU and displaying on the discrete one. This is not the common case.
- You have to be using a debug build.
No observable performance delta in GfxBench5 Car Chase (an arbitrary
program) when rendering on Alderlake GT1 and displaying on an Arc A770.
Fixes: 194afe8416 ("anv/iris/blorp: use the right MOCS values for each engine")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28894>
(cherry picked from commit e6fb3ba037)
This was missing and this caused test failures for formats different
than VK_FORMAT_R8_UINT which is the only one supported for FSR.
Fixes recent
dEQP-VK.api.info.unsupported_image_usage.*.fragment_shading_rate_attachment.*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28893>
(cherry picked from commit e8d94536d2)
Bspec 57023: RENDER_SURFACE_STATE:: Shader Channel Select Red
"Render Target messages do not support swapping of colors with
alpha. The Red, Green, or Blue Shader Channel Selects do not
support SCS_ALPHA. The Shader Channel Select Alpha does not support
SCS_RED, SCS_GREEN, or SCS_BLUE."
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28791>
(cherry picked from commit 2d8686ccd5)
We track stencil clears and writes to optimize them. Unfortunately, the
code for doing this tracks the whole resource, not individual layers or
levels within the resource, which can result in incorrect output when
different levels or layers are accessed. Modified to optimize only the first
layer/level; this will handle the common case of a single stencil texture
while allowing arrays or mipmaps to still work (albeit slightly slower).
The original optimization was introduced in a2463ec271 ("panfrost:
Constant stencil buffer tracking") but the code has been reformatted
since then, so this change won't apply as-is that far back (although it's
fairly obvious how to apply it by hand).
Fixes: a2463ec271 ("panfrost: Constant stencil value tracking")
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28832>
(cherry picked from commit dae6b6a23d)
Since the group is created from the onset, we have to make
sure that four or eight src values don't have a readport
conflict, so force a pre-loading of the values to registers
evenly distributed over the channels and let copy-propagation
take care of cleaning up un-neccesary moves.
Fixes: 79ca456b48
r600/sfn: rewrite NIR backend
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28840>
(cherry picked from commit 07995b98a8)
For integer types, the signedness is determined by flags on the muladd
instruction. The types of the sources play no role. Previously we were
using the signedness of the type and ignoring the mask.
Adjust the types passed to the dpas_intel intrinsic to match.
Fixes various
dEQP-VK.compute.*.cooperative_matrix.khr_*.matrixmuladd_cross.* tests on
different Intel platforms. Some platforms had failing tests, and some
platforms failed EU validation before the tests could fail.
Fixes: 6b14da33ad ("intel/fs: nir: Add nir_intrinsic_dpas_intel")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28822>
(cherry picked from commit 2ce558d928)
The pointer "xfb" is allocated with a clone of "so->nir" and lost
without further processing.
The function panfrost_shader_compile() was the one processing "xfb".
The call of this function was removed with the commit 40372bd720.
This makes "xfb" not required anymore.
For instance, this issue is triggered on a Mali-T820 with
"piglit/bin/arb_transform_feedback2-change-objects-while-paused -auto":
Indirect leak of 32776 byte(s) in 1 object(s) allocated from:
#0 0xf78f30a6 in malloc (/usr/lib/libasan.so.6+0x840a6)
#1 0xee9cd4ee in ralloc_size ../src/util/ralloc.c:118
#2 0xee9cf7ae in create_slab ../src/util/ralloc.c:801
#3 0xee9cf7ae in gc_alloc_size ../src/util/ralloc.c:840
#4 0xef74ab82 in nir_undef_instr_create ../src/compiler/nir/nir.c:888
#5 0xef76212c in clone_ssa_undef ../src/compiler/nir/nir_clone.c:328
#6 0xef76212c in clone_instr ../src/compiler/nir/nir_clone.c:438
#7 0xef7642d8 in clone_block ../src/compiler/nir/nir_clone.c:501
#8 0xef7642d8 in clone_cf_list ../src/compiler/nir/nir_clone.c:555
#9 0xef7657dc in clone_function_impl ../src/compiler/nir/nir_clone.c:632
#10 0xef766cb8 in nir_shader_clone ../src/compiler/nir/nir_clone.c:743
#11 0xf007673e in panfrost_create_shader_state ../src/gallium/drivers/panfrost/pan_shader.c:434
#12 0xeeb6766c in st_create_common_variant ../src/mesa/state_tracker/st_program.c:781
#13 0xeeb71d1c in st_get_common_variant ../src/mesa/state_tracker/st_program.c:834
#14 0xeeb72ea2 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1320
#15 0xeeb72ea2 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1421
#16 0xef3806ec in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:748
#17 0xef3806ec in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:984
#18 0xef2992f6 in link_program ../src/mesa/main/shaderapi.c:1336
#19 0xef2992f6 in link_program_error ../src/mesa/main/shaderapi.c:1445
Fixes: 40372bd720 ("panfrost: Implement a disk cache")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28743>
(cherry picked from commit 4f5e9a21c5)
panfrost_initialize_surface is called when a surface is written to,
and marks that surface as valid. If the surface is a depth buffer
with a separate stencil, that separate stencil should also be marked
as valid; otherwise, readpixel will skip reading it (and hence the
stencil data will be read as uninitialized). This only affects
DEPTH32F_STENCIL8 formats.
Cc: mesa-stable
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28738>
(cherry picked from commit c939111f3f)
Remove a premature optimization. When PIPE_MAP_DISCARD_WHOLE_RESOURCE
is set we were setting create_new_bo, and then if that was set we skipped
a set of tests which if passed would cause a panfrost_flush_writer.
In fact we need that flush in some cases (e.g. when any batch is
reading the resource). Moreover, we should sometimes copy the resource
(set the copy_resource flag) and that again was being skipped if
create_new_bo was initially true due to PIPE_MAP_DISCARD_WHOLE_RESOURCE
being set.
Cc: mesa-stable
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28406>
(cherry picked from commit e3d123b7a6)
We take advantage of the needs_quad_helper_invocation information to
only enable the PER_QUAD TMU access on Fragment Shaders when it is needed.
PER_QUAD access is also disabled on stages different to fragment shader.
Being enabled was causing MMU errors when TMU was doing indexed by vertexid
reads on disabled lanes on vertex stage. This problem was exercised by some
shaders from the GTK new GSK_RENDERER=ngl that were accessing a constant buffer
offset[6], but having PER_QUAD enabled on the TMU access by VertexID was
doing hidden incorrect access to not existing vertex 6 and 7 as TMU was
accessing the full quad.
cc: mesa-stable
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28740>
(cherry picked from commit 97f5721bfc)
The device->using_sparse variable is only used at cmd_buffer_barrier()
to decide if we need to apply the heavier-weight flushes that are only
applicable to sparse resources. The big problem here is that we need
to apply the flushes to the non-image and non-buffer memory barriers,
so we were trying to limit those only to applications that ever submit
a sparse resource to the sparse queue.
The reason why we were applying this only to devices that ever
submitted sparse resources is that dxvk games have this thing where
during startup they create and then delete tiny sparse resources, so
switching device->using_sparse to true at resource creation would make
basically every dxvk game start applying the heavier-weight
workaround.
The problem with all that is that even if an application creates a
sparse resource but doesn't ever bind them, the resource should still
behave as an unbound resource (because they are bound with a NULL
bind), so the flushes affecting them should happen. This case is
exercised by vkd3d-proton/test_buffer_feedback_instructions_sm51.
In order to satisfy all the above cases and only really apply the
heavier-weight flushes to applications actually using sparse
resources, let's just count the number of sparse resources that
currently exist and then apply the workaround only if it's not zero.
That covers the dxvk case since dxvk deletes the resources as soon as
they create, so num_sparse_resources goes back to 0.
Testcase: vkd3d-proton/test_buffer_feedback_instructions_sm51
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10960
Fixes: 6368c1445f ("anv/sparse: add the initial code for Sparse Resources")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
(cherry picked from commit 95dc34cd97)
Since DGC preprocessing for IBO is supported, the driver generates
an indexed indirect draw but SQTT markers were missing and this
introduced complete non-sense in RGP captures.
Fixes: e59a16bbb8 ("radv: use an indirect draw when IBO isn't updated as part of DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28710>
(cherry picked from commit 4586451b2d)
I'm not sure why zink would want dri2 here instead of kopper,
I'm sure it's some side effect of something else, let zink
use the kopper paths.
This fixes:
dEQP-GL45-ES3.info.vendor
on zink on nvk with GL cts using EGL.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: 12a47b84b7 ("egl/dri2: trigger drawable invalidation from surface queries for zink")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28707>
(cherry picked from commit 223aedfa5d)
Indeed, the pointer processed by r300_upload_index_buffer() was not the right one.
This is the reason why "deqp-gles2 --deqp-case=dEQP-GLES2.functional.draw.draw_elements.indices.user_ptr.index_byte"
was failing (the logs are below). This change corrects this issue and makes the related deqp tests work properly.
This change considers that r300_upload_index_buffer() sets indexBuffer to NULL. The indexBuffer resource
should be properly freed once the buffer is processed. This is required to avoid another refcnt imbalance
(another kind of memory leak).
==9962==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000721f at pc 0x7fd57b54a9a0 bp 0x7fffd2c39290 sp 0x7fffd2c38a40
READ of size 30 at 0x60200000721f thread T0
#0 0x7fd57b54a99f in __interceptor_memcpy (/usr/lib64/libasan.so.6.0.0+0x3c99f)
#1 0x7fd570d10528 in u_upload_data ../src/gallium/auxiliary/util/u_upload_mgr.c:333
#2 0x7fd57114142b in r300_upload_index_buffer ../src/gallium/drivers/r300/r300_screen_buffer.c:44
#3 0x7fd57113943c in r300_draw_elements ../src/gallium/drivers/r300/r300_render.c:632
#4 0x7fd57113bbc4 in r300_draw_vbo ../src/gallium/drivers/r300/r300_render.c:840
#5 0x7fd570d212e2 in u_vbuf_draw_vbo ../src/gallium/auxiliary/util/u_vbuf.c:1487
#6 0x7fd56fceb873 in _mesa_validated_drawrangeelements ../src/mesa/main/draw.c:1709
#7 0x7fd56fcf28c5 in _mesa_DrawElementsBaseVertex ../src/mesa/main/draw.c:1852
Fixes: 330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28523>
(cherry picked from commit 2b6993cb71)
Workaround states that if Destination Alpha Blend
Factor==BLENDFACTOR_ZERO, instead use BLENDFACTOR_CONST_ALPHA with the
constant alpha set to 0.
We had typo while setting the DestinationAlphaBlendFactor, use
BLENDFACTOR_CONST_ALPHA instead of BLENDFACTOR_CONST_COLOR.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28640>
(cherry picked from commit 7cc604ed1b)
As we were checking the bind count after decreasing the ref count,
we could hit a situation where the worker thread decreases the bind count
just after the ref count is decreased, but before the bind count is checked
which would lead to both threads calling the resource dtor.
Cc: mesa-stable
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28232>
(cherry picked from commit d6044cf857)
The BSpec page "Flush Types" (46213) says the following about the Tex
Invalidate bit:
"Requires stall bit ([20] of DW) set for all GPGPU Workloads."
For newer platforms, this is documented in the description of the
texture invalidation bit in the PIPE_CONTROL page (56551):
"CS Stall bit in PIPE_CONTROL command must be always set for GPGPU
workloads when Texture Cache Invalidation Enable bit is set"
Iris had it only for GFX_VER 9 and 11, while Anv had it missing for
everything.
Please notice that this patch includes a revert of 397e728ef4.
Fixes: 397e728ef4 ("iris: Drop GPGPU Tex Invalidate restriction for TGL+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28608>
(cherry picked from commit cf7e1f3817)
The new GTK+ GL renderer is extensively using instanced rendering. SVGA
driver was incorrectly detecting the instanced draws by only checking
whether the instance count was greater than 1. Base instance has to
be also checked to make sure that the draw correctly offsets the vertex
buffer.
Fix instanced draw detection by checking both the instance count and
the base instance. Fixes the new GTK+ 4 GL renderer.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: ccb4ea5a43 ("svga: Add GL4.1(compatibility profile) support in svga driver")
Reviewed-by: Neha Bhende <neha.bhende@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28616>
(cherry picked from commit 955444e068)
When trying to increase the height alignment to unlock multi-pipe resolve for
better performance we need to be careful to not overstep the source dimensions
as this would cause the blit to be rejected.
Do so and also rearrange the code a bit to make it more obvious what is being
done.
Fixes: 797454edfc ("etnaviv: rs: fix blits with insufficient alignment for dual pipe operation")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28598>
(cherry picked from commit 2964812aac)
copy_image calls blit now for multisampled images, including
depth. But blit explicitly uses only PIPE_MASK_RGBA, so it is
incapable of copying depth buffers.
This patch checks the destination format and uses PIPE_MASK_ZS if
it is a depth or stencil. Ideally we would simply use PIPE_MASK_RGBAZS
always, but not all drivers actually handle getting this mask
(they probably should, but that's another story).
The change to copy_image was in 5027b5aa2, so in some sense this
patch "fixes" that. In fact though the issue wasn't in the copy_image
change, it was always latent in blit().
Fixes: 5027b5aa28 ("gallium: stop calling resource_copy_region for multisampled copy_image")
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28585>
(cherry picked from commit 0cb852050d)
When has_vm_control is supported it takes a different code path and
creates one context per engine and in this code path we were not
setting the protected context flag.
The lack of this is not causing any test to fail in our CI but it is
better do what we are supposed to do.
Fixes: fd40134487 ("anv: allow protected GEM context creation")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28299>
(cherry picked from commit 77c004f7ca)
some gnome tests are seeing this:
==4579== Invalid read of size 4
==4579== at 0x161B28FB: UnknownInlinedFun (simple_mtx.h:106)
==4579== by 0x161B28FB: st_save_zombie_sampler_view (st_context.c:213)
==4579== by 0x161D762A: st_texture_release_all_sampler_views.part.0 (st_sampler_view.c:272)
==4579== by 0x161D7CB6: st_texture_release_all_sampler_views (st_sampler_view.c:258)
==4579== by 0x161D7CB6: st_delete_texture_sampler_views (st_sampler_view.c:292)
==4579== by 0x16191B9E: _mesa_delete_texture_object (texobj.c:523)
==4579== by 0x16191CDC: _mesa_reference_texobj_ (texobj.c:637)
==4579== by 0x1619632F: _mesa_reference_texobj (texobj.h:92)
==4579== by 0x1619632F: _mesa_free_texture_data (texstate.c:1114)
==4579== by 0x162EEE1D: _mesa_free_context_data (context.c:1155)
==4579== by 0x161B3C0F: st_destroy_context (st_context.c:999)
==4579== by 0x160F155A: dri_destroy_context (dri_context.c:277)
==4579== by 0x1603C371: dri2_destroy_context (egl_dri2.c:1592)
==4579== by 0x1602EBF5: eglDestroyContext (eglapi.c:918)
==4579== by 0x502DF3C: gdk_gl_context_dispose (gdkglcontext.c:211)
==4579== Address 0x34dc9b08 is 7,016 bytes inside an unallocated block of size 10,336 in arena "client"
==4579==
It appears we destroy the mutex and zombie objects, but freeing
context data seems to add them back.
This might not be the complete answer.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28565>
(cherry picked from commit f8e48b561e)
We need invalidate CCHE when we optimize out an invalidation of UCHE,
for example a storage image write to texture read. We missed this
earlier because of the blob's tendency to always over-flush, but the
blob does use this when building acceleration structures.
Fixes: 95104707f1 ("tu: Basic a7xx support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28445>
(cherry picked from commit fb1c3f7f5d)
this breaks interfaces where the consumer reads its input indirectly
and only some of the components are written by clobbering all
the components, even if the unwritten components are never accessed
Fixes: 459b49a174 ("zink: add a new linker pass to handle mismatched i/o components")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28466>
(cherry picked from commit 460cd99ea5)
A last memory leak related to constants_remap_table is happening.
This memory leak is triggered by two deqp-gles2 tests.
For instance, this issue is triggered with
"deqp-gles2 --deqp-case=dEQP-GLES2.functional.uniform_api.random.13":
Direct leak of 336 byte(s) in 1 object(s) allocated from:
#0 0x7f1b4a5de7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f1b401a2cdf in rc_remove_unused_constants ../src/gallium/drivers/r300/compiler/radeon_remove_constants.c:101
#2 0x7f1b40185386 in rc_run_compiler_passes ../src/gallium/drivers/r300/compiler/radeon_compiler.c:476
#3 0x7f1b40185625 in rc_run_compiler ../src/gallium/drivers/r300/compiler/radeon_compiler.c:498
#4 0x7f1b401c14d2 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:172
#5 0x7f1b401b669a in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#6 0x7f1b401baf73 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:592
#7 0x7f1b40128db7 in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1071
#8 0x7f1b3e67799d in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1073
#9 0x7f1b3e680285 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1119
#10 0x7f1b3e6812fa in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1284
#11 0x7f1b3e6812fa in st_finalize_program ../src/mesa/state_tracker/st_program.c:1363
#12 0x7f1b3f13d501 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:754
#13 0x7f1b3f13d501 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:990
#14 0x7f1b3efeef75 in link_program ../src/mesa/main/shaderapi.c:1336
#15 0x7f1b3efeef75 in link_program_error ../src/mesa/main/shaderapi.c:1445
Fixes: 29df85788a ("r300: fix constants_remap_table memory leak")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28522>
(cherry picked from commit 24a5165cdf)
For BROADCAST and MOV_INDIRECT, required_exec_type was returning
brw_int_type(type_sz(t), false), which is an unsigned type. However,
get_exec_type(inst) returns the original type for either Q or UQ.
This meant that has_invalid_exec_type would detect a mismatch and
trigger lowering.
That lowering would insert new 64-bit MOVs, which would need to be
lowered on platforms which don't support Q/UQ. Except, we already
ran that lowering pass earlier. So, the unlowered Q/UQ MOVs would
reach the software scoreboarding pass, and trigger failures in the
inferred_exec_pipe() function, as no pipe is available to handle
64-bit integer operations.
It turns out that we don't need the region lowering pass to do
anything for these opcodes. The generator code for both BROADCAST
and MOV_INDIRECT already handle decomposing Q/UQ operations into
32-bit MOVs when they're not supported. And, it also implicitly
converts to integer types, even for floating point sources. The
inferred_exec_pipe function already special cases them to note
that they'll always be handled on the integer pipe, so that matches.
Just drop the region lowering code for these opcodes.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit 7a24f29fbb)
We are overriding the type to Q/UQ, so we need to split to two MOVs
if 64-bit integer math is not supported. For reference, Meteorlake
does support 64-bit floats but would still not work correctly here.
See also brw_broadcast(), which does similar indirects but correctly
checks has_64bit_int instead of has_64bit_float.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit a90edad9f7)
==134077== 96 bytes in 1 blocks are definitely lost in loss record 1 of 3
==134077== at 0x4840808: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==134077== by 0x6D6F690: vk_default_alloc (vk_alloc.c:26)
==134077== by 0x52EEEBE: vk_alloc (vk_alloc.h:48)
==134077== by 0x52EEEEE: vk_zalloc (vk_alloc.h:56)
==134077== by 0x52EF47E: xe_exec_process_syncs (anv_batch_chain.c:132)
==134077== by 0x52EF8F6: xe_execute_trtt_batch (anv_batch_chain.c:215)
==134077== by 0x5301670: anv_queue_submit_trtt_batch (anv_batch_chain.c:1697)
==134077== by 0x603D135: gfx125_write_trtt_entries (genX_cmd_buffer.c:6091)
==134077== by 0x5370B44: anv_sparse_bind_trtt (anv_sparse.c:595)
==134077== by 0x5370CFC: anv_sparse_bind (anv_sparse.c:629)
==134077== by 0x5370E6E: anv_init_sparse_bindings (anv_sparse.c:670)
==134077== by 0x5328037: anv_CreateBuffer (anv_device.c:5071)
Note to backporters: this is only for when xe.ko is being used and
ANV_SPARSE_USE_TRTT=1 is exported. This is not the regular code path.
Fixes: 18bd00c024 ("anv/trtt: don't wait/signal syncobjs using the CPU anymore")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28455>
(cherry picked from commit 38af7254e2)
Xe KMD don't refcount anything, so resources could be freed while they
are still in use if we don't wait for exec_queue to be idle.
This issue was found with Xe KMD error capture, VM was already
destroyed when it attemped to capture error state but it can also
happen in applications that did not hang.
This fixed the '*ERROR* GT0: TLB invalidation' errors when running
piglit all test list.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27500>
(cherry picked from commit 665d30b544)
When LLVM 18 is used, pass the no-verify-fixpoint option when
running the instcombine pass. Otherwise LLVM may abort with an
error.
The background here is that this option is enabled by default for
testing purposes, because instcombine is normally only explicitly
invoked like this inside tests. If it is used in an actual
production pipeline, the no-verify-fixpoint option needs to be
enabled.
This should fix the issue reported at
https://bugzilla.redhat.com/show_bug.cgi?id=2268800.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28101>
(cherry picked from commit 99f0449987)
When tu_shader object is destroyed through vk_pipeline_cache, the relevant
destroy callback should relay to the general tu_shader_destroy function
that will also clean up owned resources.
During shader creation, the ir3_shader object should be destroyed once the
shader variants are retrieved. Since those variants are owned by tu_shader
they should be freed up in tu_shader_destroy.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: a03525d8db ("tu: Split program draw state into per-shader states")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27847>
(cherry picked from commit 3ee81ffe14)
It was by pure luck that all sources (and the result) of nir_dpas_intel
had the same number of components. It is possible to support matrix
sizes where the accumlator matrix and the result matrix are larger
(e.g., 16x8 * 8x16 = 16x16).
This breaks all of the assumptions of NIR's infrastructure for code
generating intrinsics. Fix the by making the accumulator matrix be the
first source. The accumulator and the result will always have the same
dimensions (due to rules of matrix multiplication) and the same type
(due to restructions of the cooperative matrix extension). This forces
them to have the same number of components.
This doesn't fix all the potential problems. NIR expects that all
0-sized sources will have the same number of components. This just
ensures that the result has the correct number of components.
Fixes: 6b14da33ad ("intel/fs: nir: Add nir_intrinsic_dpas_intel")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
(cherry picked from commit a8115221e5)
If the destination was the accumulator but is no longer, having the flag
set is not correct. On Xe2 this also causes a validation error.
v2: Reword the comment to be more clear. Suggested by Jordan.
Fixes: efa4e4bc5f ("intel/fs: Introduce regioning lowering pass.")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
(cherry picked from commit be4fa59a72)
This reverts commit be8b7980e6.
Store the cache entry so that the fast path picks up the optimized
pipeline when its available from a background optimized_compile_job().
Observed traces where it would take the fast path back and forth using
an unoptimized pipeline and never pick up the optimized pipeline leading
to >50% fps drop.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28440>
(cherry picked from commit d6978b1af2)
opt_copy_propagation() can sometimes propagate FIXED_GRF sources into
SHADER_OPCODE_SENDs as the message payload. For example, GS input
reads, which simply take a URB handle and have the offset in the
descriptor. For non-VGRFs, there isn't a payload to split, so just
skip past such send messages.
Fixes: 589b03d02f ("intel/fs: Opportunistically split SEND message payloads")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28067>
(cherry picked from commit ba11127944)
External images translate to 2D images in ntv, so we will have to emit
OpImageQuerySizeLod instead of OpImageQuerySize (thanks Faith for
pointing that out). This quells
VUID-VkShaderModuleCreateInfo-pCode-08737
Image must have either 'MS'=1 or 'Sampled'=0 or 'Sampled'=2
%32 = OpImageQuerySize %v2int %31
triggred by piglit
spec@oes_egl_image_external_essl3@oes_egl_image_external_essl3
on Zink.
Fixes: 3f783a3c50
zink: omit Lod image operand in ntv when not using an image texture dim
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28389>
(cherry picked from commit b6c1390354)
These are both handled by inserting them directly at the top of the
nir_function_impl. However, if the cursor is already at the top, it
never gets updated so we end up inserting other stuff after the newly
inserted undef or decl_reg. It's an odd edge case to be sure but I hit
it with my new NIR CF pass for NAK.
Fixes: 1be4c61c95 ("nir/builder: Add a helper for creating undefs")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>
(cherry picked from commit a782809f81)
When shaders might read metadata (DCC) this must be flushed.
VK_ACCESS_2_MEMORY_READ_BIT includes all READ bits that are relevant.
I think this issue has been uncoverd since vkd3d-proton d1425ee4
("vkd3d: Use VK_ACCESS_MEMORY_{READ,WRITE}_BIT where appropriate")
because RADV used to be missing VK_ACCESS_2_MEMORY_{READ,WRITE} in the
past and vkd3d-proton added a special workaround that has been removed.
This fixes some DCC corruption in WWE 2K24.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10774
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28332>
(cherry picked from commit 585b4c5a01)
in a sequence like:
* resize A
* clear
* resize B
* clear
* resize C
* clear
for a swapchain resource, the geometry for a given op after the resize
may desync for the op with which it was executed, but this is fine
since the underlying swapchain object will have to be re-created anyway
fixes#10827
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28214>
(cherry picked from commit ee13512a62)
For instance, this issue is triggered with
"piglit/bin/glslparsertest tests/spec/arb_bindless_texture/compiler/images/arith-bound-image.frag pass 3.30 GL_ARB_bindless_texture GL_ARB_shader_image_load_store":
Direct leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe9a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7f84ba7e0801 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4391
#2 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#3 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#4 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#5 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#6 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#7 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#8 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Direct leak of 136 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbff57 in operator new(unsigned long) (/usr/lib64/libasan.so.6+0xb2f57)
#1 0x7f84b1a5f749 in LLVMCreateBuilderInContext (/usr/local/lib64/libLLVM-17.so+0xc84749)
#2 0x7f84ba7817b0 in ac_llvm_context_init ../src/amd/llvm/ac_llvm_build.c:54
#3 0x7f84ba542b7a in si_llvm_context_init ../src/gallium/drivers/radeonsi/si_shader_llvm.c:120
#4 0x7f84ba542b7a in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:832
#5 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#6 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#7 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#8 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#9 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b9fee in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f84b81b9fee in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f84b81b05c7 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f84b81b05c7 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f84ba7e06ae in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4381
#7 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#8 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#9 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#10 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#11 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#12 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#13 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b9fee in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f84b81b9fee in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f84b81b05c7 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f84b81b05c7 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f84ba7e06e4 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4382
#7 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#8 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#9 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#10 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#11 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#12 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#13 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b046c in _mesa_hash_table_create ../src/util/hash_table.c:182
#3 0x7f84ba7e06e4 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4382
#4 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#5 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#6 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#7 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#8 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#9 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#10 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b046c in _mesa_hash_table_create ../src/util/hash_table.c:182
#3 0x7f84ba7e06ae in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4381
#4 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#5 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#6 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#7 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#8 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#9 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#10 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
SUMMARY: AddressSanitizer: 920 byte(s) leaked in 6 allocation(s).
Fixes: d92d35c9db ("ac/llvm: add a return value to ac_nir_translate")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28099>
(cherry picked from commit 0fd907fc7b)
The vma_samplers vma heap is initialized unconditionally. Don't use
device->physical->indirect_descriptors as a condition on whether to
free it or not.
From my TGL machine:
==373617== 32 bytes in 1 blocks are definitely lost in loss record 1 of 1
==373617== at 0x48459F3: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==373617== by 0x6926DC0: util_vma_heap_free (vma.c:339)
==373617== by 0x6925ED3: util_vma_heap_init (vma.c:53)
==373617== by 0x5334EDA: anv_CreateDevice (anv_device.c:3404)
==373617== by 0x685593A: vk_tramp_CreateDevice (vk_dispatch_trampolines.c:78)
==373617== by 0x48A6D56: terminator_CreateDevice (loader.c:5833)
==373617== by 0x9C2293F: vulkan_layer_chassis::CreateDevice(VkPhysicalDevice_T*, VkDeviceCreateInfo const*, VkAllocationCallbacks const*, VkDevice_T**) (chassis.cpp:497)
==373617== by 0x48B0690: loader_create_device_chain (loader.c:4937)
==373617== by 0x48B1327: loader_layer_create_device (loader.c:4317)
==373617== by 0x48B8D79: vkCreateDevice (trampoline.c:1004)
==373617== by 0x10CC7A: MyApp::MyApp(int, bool) (sparse.cpp:608)
==373617== by 0x1201E8: main (sparse.cpp:6025)
Fixes: 7c76125db2 ("anv: use 2 different buffers for surfaces/samplers in descriptor sets")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28303>
(cherry picked from commit 6ec1e322f0)
The hand rolled etnaviv conversion functions were able to handle
negative input values when converting to fixpoint. By replacing
them with U_FIXED all negative values are clamped to zero, which
breaks usages where negative inputs are valid, like lodbias.
Fixes: 8bce68edf5 ("etnaviv: switch to U_FIXED(..) macro")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28224>
(cherry picked from commit 4b8981e471)
Fixes validation error:
VUID-VkShaderModuleCreateInfo-pCode-08737
AtomicFAddEXT: expected Pointer to point to a value of type Result
Type
%51 = OpAtomicFAddEXT %float %49 %uint_1 %uint_0 %50
when running
spec@nv_shader_atomic_float@execution@ssbo-atomicadd-float
Fixes: 9f6be8effb
zink: store and use alu types for ntv defs
v2: Fix commit message (Mike)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28243>
(cherry picked from commit 50a6c5d5fa)
Since we split misaligned attributes, we could overwrite one of these
VGPRs in the middle of loading the attribute.
For example:
v_add_u32_e32 v4, vcc, s7, v1
s_waitcnt lgkmcnt(0)
buffer_load_dword v4, v4, s[32:35], 0 idxen
buffer_load_dword v5, v4, s[32:35], 0 idxen offset:4
can overwrite the vertex index in the load of the first component.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27920>
(cherry picked from commit ec892c4d2b)
The stride given in the shader is in number of elements of the of the
type pointed by the given pointer, which may not match the matrix own
element type.
Since we cast the pointer to match the element type, the stride needs to
be ajusted accordingly.
v2:
- Fix mismatching bit-width in matrix element type and pointer type (Caio)
- Do the stride calculation in one place
Fixes dEQP-VK.compute.pipeline.cooperative_matrix.khr_*.multicomponent.*
Fixes: 3a35f8b29b ("intel/cmat: Lower cmat_load and cmat_store")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10820
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27903>
(cherry picked from commit 446f652cde)
The `stride` and `offset` attributes are meaningful for the "virtual"
register files (VGRFs, UNIFORMs and ATTRs). Accumulator is an ARF so
validation should check `hstride` (part of the <V,W,H> triple) and `subnr`
instead.
Fixes: 12d7aaf2b8 ("intel/compiler: add more validation for acc register usage")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28059>
(cherry picked from commit e324fbbe68)
This ensure the region triple <V,W,H> is set correctly, in this case the
desired region is a sequential like <8,8,1>. Without the helper the
sequence we get is <0,1,0> -- which the generator currently partially
adjusts when emitting code, but is not sufficient when doing validation
earlier.
The code generated code is slightly modified. From crucible test
func.shader.subtractSaturate.uint in the fragment shader for SIMD8, the
diff looks like
```
mov(8) acc0<1>UD g21<8,8,1>UD { align1 1Q $0.dst };
-add.sat(8) g22<1>UD -acc0<0,1,0>UD g16<8,8,1>UD { align1 1Q @1 $0.dst };
+add.sat(8) g22<1>UD -acc0<8,8,1>UD g16<8,8,1>UD { align1 1Q @1 $0.dst };
```
Note that without the patch generator adjusted the hstride for acc0 used
as destination (see brw_set_dest), but kept the src region as is. For
the source, it is not clear to me why the <0,1,0> would work correctly
here since it is a scalar, but using <8,8,1> it is correct.
Fixes: 58907568ec ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28059>
(cherry picked from commit db8022dc4d)
It was correct for the parameters that the driver was using, but incorrect
for other parameters.
1. The address computation must multiply the workgroup size (wave size)
by num_mem_ops to fix the case when num_dwords_per_thread > 4.
2. nir_load_ssbo shouldn't set the number of components to 4 when
num_dwords_per_thread < 4.
Fixes: 6584088cd5 - radeonsi: "create_dma_compute" shader in nir
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28119>
(cherry picked from commit e99765df08)
this should yield more consistent results and avoid weird cases where
various formats are queried for things they don't support and won't use
Fixes: 9a412c10b7 ("zink: set all usage flags when querying sparse features")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28115>
(cherry picked from commit 8fa413fef0)
only certain formats are required to have the storage bit, so be more
tolerant of failure in the case where drivers actually check flags
and reject storage usage when it's actually unsupported
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28115>
(cherry picked from commit 61e5b6ad9d)
the existing guesswork during format selection for teximage is
accurate most of the time, but it's not accurate all of the time.
GL/ES each have a set of sized formats that are required to be
color renderable, and so any time one of these is allocated as a
texture, it MUST have the rendertarget usage bit attached so that
it can later be bound as a framebuffer attachment
an alternative might be to relax this and then try to do migration
to a different format/buffer later if necessary, but that's hard and
probably not actually as useful
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28055>
(cherry picked from commit 0f66589c2a)
This fixes an issue hit by one of darktable's kernels, where the sampler
argument got assigned the location of a dead kernel parameter turning it
into a zombie and leading us to trash the kernel input buffer's layout.
Fixes: 25b8a34b48 ("rusticl/kernel: inline samplers")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28121>
(cherry picked from commit 2df640c4f6)
If you look at the sampler message header on Gfx9+, you'll see that we
mostly only use 2 dwords (dw2 & dw3). DW2 has a bunch of sampler
parameters, DW3 is the sampler handle.
On Gfx9 we can micro optimize by copying r0 into the header because
the HW mostly doesn't care about other DWs. We just have to clear dw2
on non VS/FS stages.
On Gfx11+, we always have to do a careful copy of the r0.3 bits to
mask out the bottom unrelated bits. So there, just clearing the entire
header makes more sense.
On Xe2+, the dw4 of the header references the sampler feedback surface
handle and bit0 is a boolean to know whether to use that surface or
not. So it *REALLY* matters to have that as 0. If we copy r0, we'll
get random bits in dw4, leading to enable that surface.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28082>
(cherry picked from commit 75c6ad9907)
With RADV, when VS/TES and FS are compiled separately, the PrimitiveId
is exported unconditionally because it's not possible to know if the
FS reads it or not. This happens with fast-link GPL and shader object.
Though, the PrimitiveID should be ignored when it's implicitly exported
because otherwise the stream output LDS offset is incorrect.
This fixes a bunch of failures with transform feedback and Zink/RADV
when shader object is enabled on RDNA3.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27981>
(cherry picked from commit d12984edb8)
The algorithm used to rendering smooth lines worked under the assumption
that line coords were in the [0, 1] range. This was correct when using
an orthogonal projection, but not when using a perspective projection.
With a perspective projection (where the value for 1/Wc set in the VPM
is not 1.0), line coords values are also affected by this projection, so
the values are not in this range.
To deal with this, we normalize the line coords using the Wc value so
the range becomes [0, 1], and the smooth line rendering works as
expected.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10496
Fixes: ee4d51f8b2 ("v3d: Add a lowering pass for line smoothing")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28072>
(cherry picked from commit 69fbd5cb90)
The previous version had an optimization where, instead of actually
waiting on the FALCON to return, it would just do a bunch of nops in
some cases. This seems broken at least on Turing+ and results in
registers not ending up with the right values. It only really shows up
when you set two registers back-to-back in which case the second
SET_PRIV_REG may mess up the first.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27927>
(cherry picked from commit 0ed7bce8e5)
The driver is written that we should support ETNA_NUM_VARYINGS and reporting
a bigger number will cause some troubles. I had a quick look at galcore's
hw database and there are entries that report a higher value.
So I think what we want is to the minimum value of what kernel driver reports
and what the gallium driver should be able to handle.
Fixes: 84816c22e4 ("etnaviv: ask kernel for max number of supported varyings")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27923>
(cherry picked from commit 93255abe30)
As per spec, any colors, or color components, associated with a fragment
that are not written by the fragment shader are undefined.
So we might as well just write vec4(1.0) to output, since HW doesn't allow
us to have an empty FS.
Backport-to: 23.3
Backport-to: 24.0
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24855>
(cherry picked from commit 6998c48f77)
the ES spec imposes additional requirements for copy commands,
specifically that the formats have matching component sizes
the existing check used the driver's internal formats to check
for a match, which is broken since the spec requires the match be
between the passed internalFormat and the buffer's effective internal
format (i.e., this has no relation to what the driver supports)
fixes KHR-GLES3.copy_tex_image_conversions.forbidden* on a bunch of drivers
cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28030>
(cherry picked from commit 2cd192f879)
Batches must be ignored if batch count is zero, so all batch inspections
have to be gated behind batch count. For memcpy, it's UB if either src
or dst is NULL even when size is zero.
Side note:
- For original commit, this fixes just the memcpy UB
- For current codes, this fixes to not skip ffb batch prepare
Fixes: 493a3b5cda ("venus: refactor batch submission fixup")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28071>
(cherry picked from commit 8af267eb00)
This fixes the validation error
VUID-VkShaderModuleCreateInfo-pCode-08737
triggered by piglit:
spec@arb_gpu_shader5@execution@built-in-functions@fs-interpolateatsample-block-array:
GLSL.std.450 InterpolateAtSample: expected Sample to be 32-bit integer
%47 = OpExtInst %float %1 InterpolateAtSample %45 %float_0
Fixes: 9f6be8effb
zink: store and use alu types for ntv defs
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28043>
(cherry picked from commit b7d6d90dab)
We had a "Don't read out-of-bounds" sanity check for creating an alpha
when ATEST was needed, but that check happened only after we already
did a bi_extract(), which meant that the bi_extract could get into
trouble and assert() when there weren't enough components. Fixed by
re-arranging the calculation.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund>@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28045>
(cherry picked from commit 0e1862a2ab)
The code path for emitting tessellation commands when the TES needed
scratch space was failing to emit 3DSTATE_TE, and instead only emitting
3DSTATE_DS. This meant that you could get HS and DS enabled with
tessellation itself turned off, which is utter nonsense and would
cause a GPU hang.
Alchemist and later takes a different path and don't take this bug,
but all earlier hardware would hit it. Discovered while working on
compiler changes that caused a single piglit test to spill minorly,
and thus break entirely.
Fixes: 4256f7ed58 ("iris: Fill out scratch base address dynamically")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28032>
(cherry picked from commit 9e5fd49cbe)
If the app requests a swizzle on the shadow sampler which doesn't just
return the red channel or literal 0s/1s, we'll crash attempting to build
the result vector. Use something that's probably valid.
Cc: mesa-stable
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28001>
(cherry picked from commit cda6877cb6)
We precompile static state and count it as dynamic, so we have to
manually clear bitset that tells which dynamic state is set, in order to
make sure that future dynamic state will be emitted. The issue is that
framework remembers only a past REAL dynamic state and compares a new
dynamic state against it, and not against our static state masquaraded
as dynamic.
Example:
- Set dynamic state S with value A
- Bind pipeline with dynamic state S
- Draw
- Bind pipeline with static state S with value B
- Draw
- Set dynamic state S with value A
- Bind pipeline with dynamic state S
- Draw
Previously, at the last draw the dynamic state S was not dirty and
current dynamic state was equal to the past dynamic state, so
it was not emitted, while GPU used value B from static pipeline.
This fix, at the point of static pipeline binding, clears the
bitset which tells that dynamic state S was previously set.
This forces the next dynamic state to be re-emitted.
Fixes broken rendering in Arma 3, and probably some other
games running through DXVK.
Fixes: 97da0a7734
("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27961>
(cherry picked from commit a76fcebfc0)
There are multiple problems currently :
- blorp blitter commands overwrite the protection value coming from
the driver
- anv & iris are using render target MOCS for compute commands
Driver already have the ability to pass the MOCS values so we choose
to stick to that in this change. But now the driver need to select the
right MOCS depending on the engine the commands are going to run onto.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27956>
(cherry picked from commit 194afe8416)
ZINK_BIND_RESOURCE_DESCRIPTOR and ZINK_BIND_SAMPLER_DESCRIPTOR are
always used together, so that we can replace these two values with
ZINK_BIND_DESCRIPTOR and use only one bit to represent the value.
With that we can also remove the aliasing of ZINK_BIND_DESCRIPTOR with
PIPE_BIND_CONST_BW.
Fixes: 13c6ad0038
zink: use a single descriptor buffer for all non-bindless types
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28016>
(cherry picked from commit 8e239dda41)
technically this needs to be MUCH higher since there's no limitation
on the number of semaphore waits that can be submitted, but this is
enough to handle zink usage
fixes KHR-GL46.sparse_buffer_tests.BufferStorageTest
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27992>
(cherry picked from commit 9a53e3b1fd)
I started seeing
ACO ERROR:
In file ../src/amd/compiler/aco_validate.cpp:98
Operand and Definition types do not match: s1: %44 = p_parallelcopy %158
test_basic: ../src/amd/compiler/aco_interface.cpp:85: void validate(aco::Program*):
Assertion `is_valid' failed.
since commit 52ee4cf229 ("nir/builder: Teach nir_pack_bits and
nir_unpack_bits about 32_4x8").
Fixes: e0d232c2fc ("aco: implement nir_op_pack_32_4x8"). I
Suggested-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27972>
(cherry picked from commit 3d4dfae7eb)
Indeed, main_shader_part_ngg_es was not freed.
For instance, this issue is triggered on a radeonsi/gfx10 gpu with
"piglit/bin/arb_gpu_shader5-tf-wrong-stream-value -auto -fbo":
Direct leak of 1464 byte(s) in 1 object(s) allocated from:
#0 0x7f17904b99a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7f1785d65ac2 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3132
#2 0x7f1783af67d8 in util_queue_thread_func ../src/util/u_queue.c:309
#3 0x7f1783b51dfa in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#4 0x7f178f69d38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 2024 byte(s) in 1 object(s) allocated from:
#0 0x7f17904b97ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f1785d5443a in read_chunk ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:221
#2 0x7f1785d62cf5 in si_load_shader_binary ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:293
#3 0x7f1785d65255 in si_shader_cache_load_shader ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:423
#4 0x7f1785d65ef9 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3169
#5 0x7f1783af67d8 in util_queue_thread_func ../src/util/u_queue.c:309
#6 0x7f1783b51dfa in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#7 0x7f178f69d38a (/lib64/libc.so.6+0x8438a)
Fixes: 8f72f137ad ("radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27958>
(cherry picked from commit f93f215898)
Commit 90e364edb0 contained a typo in the glsl_dvec4_type() helper,
instead returning a glsl_ivec4_type. As an ivec4 is 2x smaller than
a dvec4, this also broke piglit sanity on crocus/hsw.
This also fixes the dvec2 helper, though it has not been specifically
tested anywhere.
Fixes: 90e364edb0 ("compiler/types: Add a few more helpers to get builtin types")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27917>
(cherry picked from commit f9acfeeb59)
For instance, this issue is triggered with
"piglit/bin/object-namespace-pollution glBitmap program -auto -fbo":
Direct leak of 112 byte(s) in 7 object(s) allocated from:
#0 0x7f472540e7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f471a9ce18f in rc_remove_unused_constants ../src/gallium/drivers/r300/compiler/radeon_remove_constants.c:101
#2 0x7f471a9b0836 in rc_run_compiler_passes ../src/gallium/drivers/r300/compiler/radeon_compiler.c:476
#3 0x7f471a9b0ad5 in rc_run_compiler ../src/gallium/drivers/r300/compiler/radeon_compiler.c:498
#4 0x7f471a9ec862 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:172
#5 0x7f471a9e1ab2 in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#6 0x7f471a9e6303 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:591
#7 0x7f471a9544fe in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1073
#8 0x7f4718f2ebe5 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1070
#9 0x7f4718f374b5 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1116
#10 0x7f4718f38273 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1281
#11 0x7f4718f38273 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1345
#12 0x7f4718f389e9 in st_program_string_notify ../src/mesa/state_tracker/st_program.c:1378
#13 0x7f47199d9f99 in set_program_string ../src/mesa/main/arbprogram.c:413
Fixes: 1c2c4ddbd1 ("r300g: copy the compiler from r300c")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27957>
(cherry picked from commit 29df85788a)
Do not continue and call drmIoctl on an invalid file descriptor.
Fix defect reported by Coverity Scan.
Argument cannot be negative
The negative argument will be interpreted as a very large unsigned value.
CID: 1544377
Cc: mesa-stable
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27788>
(cherry picked from commit b6962bbfc8)
When lower_simd_width() encounters an instruction that needs a larger
SIMD, for example SHADER_OPCODE_TXS_LOGICAL in Gfx4 needs at least
SIMD16. In this case the builder needs to be at least as large as
max_width, otherwise the group() setup will assert.
Turns out this did not assert before "by accident", since it was
relying on the default fs_visitor builder that had a dispatch width of 64,
a bogus placeholder value, expected not to be used.
However, when we changed the code to remove that builder (and the bogus
value), we created a new builder in the pass shader dispatch_width --
which work fine except in the case where we want to "lower" the SIMD above
the shader dispatch width. The fix is to also consider the already
calculated max_width when creating the builder.
Fixes: 5b8ec015f2 ("intel/compiler: Don't use fs_visitor::bld in remaining places")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10338
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27782>
(cherry picked from commit 337641cfcc)
Xe KMD also checks if cpu_caching caching set during bo creationg
matches with caching of the PAT index set in the VM unbind.
This was being unnoticed until now by luck and lack of testing in MTL.
So here always setting PAT index for all VM operations that has a bo
associated.
Fixes: eb18a92ef9 ("iris: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27893>
(cherry picked from commit 963c08b623)
Let's say you have an image in R32_UINT format, a view is created in
R32_SFLOAT and used as color attachment.
When resolving the attachment, our current code uses the image format
(R32_UINT in this case). But resolve mode might apply only to SFLOAT,
so we currently run into an assert in blorp.
We should instead use the view format. There is an exception for
depth/stencil view because the format we want to resolve is actually
the depth/stencil format, not just the depth or stencil aspect.
This fixes vkd3d-proton's test_multisample_resolve_formats.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27875>
(cherry picked from commit 5a7e58a430)
If monolithic shaders were inlined, there might not be a radv_shader
associated with some stages. Zero out the shader allocation info in that
case, the shader will get identified by hash instead.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27890>
(cherry picked from commit b588cb29a3)
The stw_device and its screen are set up independently. It's possible
to have a device without a screen if the DLL is loaded but never
called into, since DllMain for PROCESS_ATTACH sets up the stw_device,
but the screen is initialized later on the first call to get pixel
formats. If the DLL is loaded and then unloaded, don't crash.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27892>
(cherry picked from commit f96d31bc8a)
Found by inspection. Original code was returning the size instead of the
number of levels. This was probably an over zealous search-and-replace
when PIPE_CAP_MAX_TEXTURE_2D_LEVELS was changed to _SIZE.
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Fixes: 0c31fe9ee7 ("gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27800>
(cherry picked from commit 1b890825f6)
this is a bandaid fix that allows users (zink) to actually call the
functions intended to be called. the real fix would be to figure out
which extensions are enabled on the device and then only GPA the
functions associated with those extensions
that's too hard though so I'm slapping some flex tape on it
cc: mesa-stable
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27834>
(cherry picked from commit 5d91db9666)
This extension has been broken ever since the initial commit. It created
an XRGB DRIImage for the driver to render to, so whilst the presentation
was opaque, the buffer also completely lacked an alpha channel.
Fix it by making sure we only modify the FourCC we send to the Wayland
server when creating a buffer.
Closes: mesa/mesa#5886
Fixes: 9aee7855d2 ("egl: implement EGL_EXT_present_opaque on wayland")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27709>
(cherry picked from commit 9ea9a963aa)
When we bind a descriptor set with dynamic descriptors, we can't ignore
dynamic descriptors in previously-bound higher descriptor sets. For
example, assume we have descriptor sets A and B, each of which has one
dynamic storage buffer, and we do:
CmdBindDescriptorSets(firstSet=1, descriptorSetCount=1, A)
CmdBindDescriptorSets(firstSet=0, descriptorSetCount=1, B)
and in the first CmdBindDescriptorSets the pipeline layout includes a
descriptor set layout compatible with B in set 0. Then, following
"Pipeline Layout Compatibility," set 0 is disturbed:
When binding a descriptor set to set number N, a previously bound
descriptor set bound with lower index M than N is disturbed if the
pipeline layouts for set M and N are not compatible for set M.
Otherwise, the bound descriptor set in M is not disturbed
When it's disturbed, it's effectively turned into a set with 1 undefined
dynamic storage buffer:
When a descriptor set is disturbed by binding descriptor sets, the
disturbed set is considered to contain undefined descriptors bound
with the same pipeline layout as the disturbing descriptor set.
This disturbed set is compatible with B, so in the second
CmdBindDescriptorSets this clause doesn't apply:
If, additionally, the previously bound descriptor set for set N was
bound using a pipeline layout not compatible for set N, then all
bindings in sets numbered greater than N are disturbed.
and A remains valid to access. The code before 88db7364 worked only if
the pipeline layout when binding B contained a descriptor layout
compatible with A in set 1, because it used the pipeline layout's total
size when allocating the internal dynamic descriptors array, but that
isn't actually a requirement, so the previous code was already broken.
After 88db7364 we only allocate as much space as required by the current
descriptors being bound, because I misread the rules here, which made it
more broken and broke 3DMark Wildlife Extreme that does something like
this.
In order to properly fix this we need to keep track of the maximum ever
seen dynamic descriptor size, similar to what we already do for
descriptor sets, and use that. We have no idea what needs to be
preserved when binding a descriptor set with dynamic descriptors, so we
have to be conservative.
Fixes: 88db7364 ("tu: Rework dynamic offset handling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27750>
(cherry picked from commit db0291c235)
Previously we were optimistic and tied this to certain format but wa
description lists other formats and bspec clearly disallows the usage.
Issue can be seen with different 16bpp tests, effect looks a bit like
dithering pattern but it is not, it is just rep16 failing.
Fixes:
GTF-GL46.gtf42.GL3Tests.texture_storage.texture_storage_texture_as_framebuffer_attachment
on DG2 and MTL, some 565 EGL tests on Android and internal issue on game
that displays a dither like pattern on the background while it's not
supposed to do that.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10646
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27794>
(cherry picked from commit 1a4f220c29)
For instance, this issue is triggered with
"piglit/bin/ext_framebuffer_multisample-accuracy all_samples color depthstencil -auto -fbo":
Direct leak of 1160 byte(s) in 1 object(s) allocated from:
#0 0x7fbe8897d7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7fbe7e7abfcc in rc_constants_copy ../src/gallium/drivers/r300/compiler/radeon_code.c:47
#2 0x7fbe7e7ec902 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:174
#3 0x7fbe7e7e1b22 in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#4 0x7fbe7e7e6373 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:591
#5 0x7fbe7e75456e in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1073
#6 0x7fbe7cd2ebe5 in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1070
#7 0x7fbe7cd374b5 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1116
#8 0x7fbe7cd38273 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1281
#9 0x7fbe7cd38273 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1345
#10 0x7fbe7d798ca8 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:724
#11 0x7fbe7d798ca8 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:952
#12 0x7fbe7d6790d5 in link_program ../src/mesa/main/shaderapi.c:1336
#13 0x7fbe7d6790d5 in link_program_error ../src/mesa/main/shaderapi.c:1447
...
SUMMARY: AddressSanitizer: 2528456 byte(s) leaked in 1057 allocation(s).
Fixes: 54f6e72b27 ("r300: better register allocator for vertex shaders")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27792>
(cherry picked from commit 4d00edda00)
readback should trigger on the current backbuffer, not the most recently
presented buffer. if e.g., a clear is only triggered through glFlush,
this clear should be read back rather than the contents of the last-presented
buffer
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27553>
(cherry picked from commit d2ed77072c)
the previous code could recycle a currently-submitting state by hitting
a race condition where zink_screen_check_last_finished(batch_id) returned
true because batch_id was 0
this can no longer recycle the current batch, but the race should still be
eliminated for consistency: check 'submitted' since this guarantees batch_id
is valid
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27729>
(cherry picked from commit 3283415bbd)
The EXT_texture_format_BGRA8888 spec clearly defines GL_BGRA as a
color-renderable format, so we need to support it here as well.
This has been broken since the day support for the extension was added.
Oh well, let's fix it up!
Fixes: 1d595c7cd4 ("gles2: Add GL_EXT_texture_format_BGRA8888 support")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27720>
(cherry picked from commit 3b23e9d89d)
With TES, the primitive ID is an input variable but it's considered a
sysval by SPIRV->NIR. Though, its value is greater than
VARYING_SLOT_VAR0 which means its location was adjusted by mistake.
This fixes compiling a tessellation evaluation shader in debug build
with Enshrouded.
Fixes: dfbc03fa88 ("spirv: Fix locations for per-patch varyings")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27413>
(cherry picked from commit 78ea304a06)
We need to allocate "shared size" bytes for each workgroup but
we were incorrectly multiplying by the number of workgroups in
each supergroup instead, which would typically cause us to allocate
less memory than actually required.
The reason this issue was not visible until now is that the kernel
driver is using a large page alignment on all BO allocations and
this causes us to "waste" a lot of memory after each allocation.
Incidentally, this wasted memory ensured that out of bounds
accesses would not cause issues since they would typically land
in unused memory regions in between aligned allocations, however,
experimenting with reduced memory aligments raised the issue,
which manifested with the UE4 Shooter demo as a GPU hang caused
by corrupted state from out of bounds memory writes to CS
shared memory.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27675>
(cherry picked from commit 1880e7cfed)
Indeed, vertex_buffer was not properly freed.
For instance, this issue is triggered with:
"piglit/bin/fcc-read-after-clear blit rb -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 8a963d122d ("r300g/swtcl: don't do stuff which is only for HWTCL")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27678>
(cherry picked from commit 3b90c46bdf)
There are a couple mistakes here :
- using a bitfield as an index to generate a bitfield...
- in anv_nir_push_desc_ubo_fully_promoted(), confusing binding
table access of the descriptor buffer with actual descriptors
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ff91c5ca42 ("anv: add analysis for push descriptor uses and store it in shader cache")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27504>
(cherry picked from commit cf193af762)
according to spec, these should return NONE if the format is
not supported for a given texture target, but mesa was incorrectly
returning a hardcoded value for all cases without checking the driver
instead, check whether the driver can create a texture for a given
format to correctly handle this non-support case
cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27621>
(cherry picked from commit 893780b362)
Left-shifting by 11*8 or 14*8 is undefined. This fixes many
dEQP-VK.query_pool.statistics_query.* failures (but not pre-existing
flakes) for release builds using clang.
Fixes: 48aabaf225 ("radv: do not harcode the pipeline stats mask for query resolves")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27651>
(cherry picked from commit ec5d0ffb04)
this is the case where:
* a batch A is submitted
* a no-op flush occurs
* the frontend gets the fence from already-flushed batch A
* zink recycles batch A
* the frontend waits on fence A
fixes#10598
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27623>
(cherry picked from commit fb2ae7736f)
now that zink_gfx_lib_cache::stages_present exists (and is correct),
this value can be used directly to effect cache eviction instead of depending
on the prog->stages_present value, which may not even be the same prog that
owns a given zink_gfx_lib_cache instance
this fixes the case where a shader used in multiple progs with differing shader
masks would never have all its gpl pipelines freed
fixes leaks with caselist:
KHR-Single-GL46.arrays_of_arrays_gl.InteractionUniformBuffers1
KHR-Single-GL46.subgroups.quad.framebuffer.subgroupquadbroadcast_3_float_vertex
Fixes: d786f52f1f ("zink: prevent crash when freeing")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27358>
(cherry picked from commit e8ce53a33d)
some passes (e.g., opt_shrink_vector) operate on the assumption that
sparse tex ops have a certain number of components and then remove components
and unset the sparse flag if they can optimize out the sparse usage
zink's sparse ops do not have the standard number of components, which
causes such passes to make incorrect assumptions and tag them as
not being sparse, which breaks everything
fix#10540
Fixes: 0d652c0c8d ("zink: shrink vectors during optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27414>
(cherry picked from commit 2085d60438)
This works around some Unity engine behaivor with ANGLE-on-Venus, when
cmd pools are created on main thread once while the render thread only
does descriptor pool creation for set allocations during recording time.
This change also explicitly forces async pipeline create for threads
creating the device instead of implicitly via feedback cmd pool create.
This ensures intended behavior when feedback is disabled.
Fixes: d17ddcc847 ("venus: dispatch background shader tasks to secondary ring")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27347>
(cherry picked from commit 1718980e85)
The runtime is turning GENERAL layouts into FEEDBACK_LOOP ones when it
detects feedback loops in a render pass. This is breaking drivers that
would like to use a different HW layout for those 2 layouts because if
the application inserts barrier in the render pass, the barriers the
driver sees are inconsistent.
This could lead to barrier of this type :
- GENERAL -> FEEDBACK_LOOP (runtime)
- GENERAL -> GENERAL (app)
- FEEDBACK_LOOP -> GENERAL (runtime)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23523>
(cherry picked from commit 76cf391255)
this logic relies on constant indexing for compact arrays, but this is
frequently not the case for compact array builtins (e.g., gl_TessLevelOuter).
the usual strategy of lowering to temps isn't viable in TCS, which means
io lowering has to be able to handle indirect access to these builtins
without crashing
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27534>
(cherry picked from commit 9e2c7314f2)
The round up in 'next_address_8kb = DIV_ROUND_UP(push_constant_kb, 8)'
was not decreasing the amount of URB available for Mesh and Task, what
could cause an over allocation of URB.
There was also no minimum entries enforcement for Mesh and Task, what
could cause 0 r.mesh_entries to be set in a case where tue_size_dw is
90% > than mue_size_dw. Same for r.task_entries when Task is enabled.
Also adding a few more asserts to help debug.
This fixes at least dEQP-VK.mesh_shader.ext.properties.mesh_payload_size
in LNL but it has potential to fixes other Mesh tests as well.
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27555>
(cherry picked from commit d0fba810b3)
When doing query result copies in 3D mode, we're flushing the render
target cache, but the shader writes go through the dataport.
Fixes flakes/fails in piglit with shader query copies forced with Zink :
$ query_copy_with_shader_threshold=0 ./bin/arb_query_buffer_object-coherency -auto -fbo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b3b12c2c27 ("anv: enable CmdCopyQueryPoolResults to use shader for copies")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
(cherry picked from commit c53a4711cb)
When bitrate or fps change is detected, only update rate control
parameters instead of completely reinitializing encode session.
This fixes an issue where if application changed bitrate or fps often,
the output bitrate would significantly overshoot the target bitrate in some
cases. In other cases, the output bitrate would be extremely low instead.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27548>
(cherry picked from commit 8d44a11508)
It is possible to free memory backing images before images are
destroyed :
VkFreeMemory:
"Memory can be freed whilst still bound to resources, but those
resources must not be used afterwards."
The spec leaves us the option to keep a reference on the associated
memory and free it only when all the bound resources have been
destroyed. Here we choose to free memory immediately.
One particular test in the CTS
(dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics)
does the following :
imgA = vkCreateImage()
imgB = vkCreateImage()
memA = vkAllocateMemory()
vkBindImageMemory(imgA, memA) # Aux mapping with ref count = 1
vkFreeMemory(memA) # Aux mapping removed, ref count = 0
memB = vkAllocateMemory() # Same address as memA
vkBindImageMemory(imgB, memB)
vkDestroyImage(imgA) # Removes the mapping of imgB-memB
vkQueueSubmit() # hang with pagefault in AUX-TT
The solution implemented in this change is to not do anything AUX-TT
related in vkFreeMemory(). This soluation has some consequences,
because a virtual memory address range freed and reallocated cannot be
rebound in the AUX-TT until all the associated resources have released
their AUX-TT mapping (to bring back the AUX-TT refcount of the range
to 0). This should still be better than keeping the memory allocated
through refcounting of the anv_bo.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7b87e1afbc ("anv: track & unbind image aux-tt binding")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10528
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27566>
(cherry picked from commit e0b4dfbbda)
If a program does two blits in a row, we internally do a sequence of
operations that involves binding vb0.
Previously, the vb0 state after each operation would look something like:
| operation | cmd->state.gfx.vb0 | hardware | save->vb0 |
| ---------------------------- | ------------------ | --------- | --------- |
| | user | user | |
| nvk_meta_begin() | user | user | user |
| BindVertexBuffers(internal0) | internal0 | internal0 | user |
| nvk_meta_end() | internal0 | user | |
| nvk_meta_begin() | internal0 | user | internal0 |
| BindVertexBuffers(internal1) | internal1 | internal1 | internal0 |
| nvk_meta_end() | internal1 | internal0 | |
That is, CmdBindVertexBuffers() would update cmd->state.gfx.vb0, but
nvk_meta_end() would not. This meant that the last operation would bind a
driver-internal buffer instead of the original value that the user set.
This change fixes the issue by tracking cmd->state.gfx.vb0 in
nvk_cmd_bind_vertex_buffer(), which both CmdBindVertexBuffers() and
nvk_meta_end() call into.
After this commit, the state looks like:
| operation | cmd->state.gfx.vb0 | hardware | save->vb0 |
| ---------------------------- | ------------------ | --------- | --------- |
| | user | user | |
| nvk_meta_begin() | user | user | user |
| BindVertexBuffers(internal0) | internal0 | internal0 | user |
| nvk_meta_end() | user | user | |
| nvk_meta_begin() | user | user | user |
| BindVertexBuffers(internal1) | internal1 | internal1 | user |
| nvk_meta_end() | user | user | |
To test this commit, build gtk4 commit 87b66de1, run:
GSK_RENDERER=vulkan gtk4-demo --run=image_scaling
then select trilinear filtering in the dropdown and check for rendering
artifacts.
Fixes: e1c66501 ("nvk: Use vk_meta for CmdClearAttachments")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27559>
(cherry picked from commit d98ff2cc4a)
SPECviewperf creo-03 needs GL_EXT_shader_image_load_store in order for
its shaders to compile but we don't support a few corner cases that
didn't make it into the ARB variant. It seems to run fine with an
override, so just do that for now.
Cc: mesa-stable
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27429>
(cherry picked from commit 24d3c83212)
It can be the case that a collect and one of its sources are assigned
to non-overlapping parts of the same merge set, for example:
ssa_1 = ...
ssa_2 = ...
ssa_3 = ...
ssa_4 = collect ssa_1, ssa_2 (kill), ssa_3
... = ssa_4 (kill)
ssa_5 = collect ssa_1, ssa_3
... = ssa_1 (kill)
... = ssa_3 (kill)
... = ssa_5 (kill)
If we merge the first collect first, we get a merge set:
ssa_1 (offset 0)
ssa_2 (offset 2)
ssa_3 (offset 4)
ssa_4 (offset 0)
Now, we decide to merge ssa_1 and ssa_5:
ssa_1 (offset 0)
ssa_2 (offset 2)
ssa_3 (offset 4)
ssa_4 (offset 0)
ssa_5 (offset 0)
ssa_3 cannot become a child of ssa_5 in the interval tree, just like a
source not in the same merge set, so we should not remove it and then
reinsert it assuming that RA will make it a child of ssa_5.
This fixes an RA validation error in Farming Simulater.
Fixes: 0ffcb19 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27497>
(cherry picked from commit aeed5fd98d)
If dual blending is enabled, only 1 output is supported. Multiple
outputs confuse the write combining pass in this case, leading to
incorrect output and/or an assert failure in emit_fragment_store.
The fix is straightforward, just skip the speculative emitting of
multiple outputs in the case where dual source blending is enabled.
This also adds an extra sanity check in `pan_nir_lower_zs_store` to
check for only one blend store being present.
Fixes: c65a9be421 ("panfrost: Preprocess shaders at CSO create time")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9487
Co-Authored-By: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26474>
(cherry picked from commit 49c1b404e5)
The CTS image allocation sometimes doesn't try to allocate a complete
DPB, but the amdgpu kernel module checks for this, so always make
the DPB max sized on uvd instances.
Fixes part of video decode on Fiji/Polaris
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27186>
(cherry picked from commit df9bc11589)
As per this optimisations description:
"Takes assignments to variables that are dereferenced only
once and pastes the RHS expression into where the variables
dereferenced."
However the optimisation is run at compile time before multiple
shaders from the same stage could have been pasted together.
So this optimisation can incorrectly assume a global is only
referenced once since it cannot see the other pieces of the
shader stage until link time.
Here we skip the optimisation if the variable is a global. We
could change it to only run at link time however this
optimisation is only run at link time if we are being forced
to use GLSL IR to inline a function that glsl to nir cannot
handle and this will also be removed in a future patchset.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10482
Fixes: d75a36a9ee ("glsl: remove do_copy_propagation_elements() optimisation pass")
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27351>
(cherry picked from commit bc0178af57)
Similar to what was done for Wayland in 58f90fd03f:
the glthread unmarhsal thread needs to be idle to avoid
concurrent calls to get_back_bo.
Also the existing code flushed after setting dri2_surf->back
to NULL so a new back buffer was always allocated by the
glthread flush:
|---------------> dri2_drm_swap_buffers
| get_back_bo (back=0x55eb93c6c488) > # First get_back_bo call
| get_back_bo (back=0x55eb93c6c488 age: 0)<
| # dri2_surf->back = NULL
|-----> FLUSH
| get_back_bo (back=nil) > # Another get_back_bo call
| get_back_bo (back=0x55eb93c6c4c8 age: 3)<
|-----< FLUSH
|---------------< dri2_drm_swap_buffers
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10437
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27143>
(cherry picked from commit 6f47e87a60)
GL ARB_sparse_buffer allows unbound regions in buffers.
VK sparseBinding insists all regions must be bound before first use.
This means we need to use sparseResidencyBuffer to back GL
sparse buffers to get the same semantics.
Fixes GL and piglit sparse buffer tests on zink/nvk.
Fixes: c90246b682 ("zink: implement sparse buffer creation/mapping")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27404>
(cherry picked from commit ff50e80574)
LLVM_LIB_DIR is a variable used for runtime compilations.
When cross compiling, LLVM_LIB_DIR must be set to the
libclang path on the target. So, this path should not
be retrieved during compilation but at runtime.
dladdr uses an address to search for a loaded library.
If a library is found, it returns information about it.
The path to the libclang library can therefore be
retrieved using one of its functions. This is useful
because we don't know the name of the libclang library
(libclang.so.X or libclang-cpp.so.X)
v2 (Karol): use clang::CompilerInvocation::CreateFromArgs for dladdr
v3 (Karol): follow symlinks to fix errors on debian
Fixes: e22491c832 ("clc: fetch clang resource dir at runtime")
Signed-off-by: Antoine Coutant <antoine.coutant@smile.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (v1): Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568>
(cherry picked from commit 445aacb421)
If the lane from which the hardware writes the unifa address
is disabled, then we may end up with a bogus address and invalid
memory accesses from follow-up ldunifa.
Instead of always disabling unifa loads in non-uniform control
flow we can try to see if the address is prouced from a nir
register (which is the only case where we do conditional writes
under non-uniform control flow in ntq_store_def), and only
disable it in that case.
When enabling subgroups for graphics pipelines, this fixes a
GMP violation in the simulator with the following test
(which has non-uniform control flow writing unifa with lane 0
disabled, which is the lane from which the unifa takes the
address):
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcastfirst_int
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
(cherry picked from commit 5b269814fc)
c->execute is 0 (not the block index) for lanes currently active
under non-uniform control flow.
Also this simplifies a bit the instructions we emit for flag
generation, both for uniform and non-uniform control flow.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
(cherry picked from commit 7bdc8898b1)
If the ELSE block is cheap then we don't emit the branch instruction
but we still want to generate the flags, since these are setting
the flags for the THEN block too.
Fixes: e401add741 ("broadcom/compiler: skip jumps in non-uniform if/then when block cost is small")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
(cherry picked from commit 29d4924e5e)
descriptor buffer uses mapped buffers. mapping/unmapping buffers
uses a ctx in the function params, but at this time there is no ctx.
since the ctx is not actually used for unmapping descriptor buffers,
this can instead use a special buffer unmap function to avoid invalid access
Fixes: b06f6e00fb ("zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27344>
(cherry picked from commit 0a97d1ebfa)
this is already implied since the buffers must be BAR-allocated,
but it ensures the context isn't accessed during unmap
Fixes: b06f6e00fb ("zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27344>
(cherry picked from commit c900cca96c)
../src/compiler/nir/nir_builder.h: In function ‘nir_build_deref_follower’:
../src/compiler/nir/nir_builder.h:1607:1: error: control reaches end of non-void function [-Werror=return-type]
1607 | }
Fixes: 4a4e175738
nir: Support deref instructions in lower_var_copies
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
(cherry picked from commit 0ab3b3c641)
../src/compiler/nir/nir_lower_int64.c: In function ‘lower_int64_intrinsic’:
../src/compiler/nir/nir_lower_int64.c:1347:1: error: control reaches end of non-void function [-Werror=return-type]
1347 | }
Fixes: bf7a114246
nir/lower_int64: Add lowering for some 64-bit subgroup ops
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
(cherry picked from commit 80a1b91601)
../src/amd/vulkan/radv_sampler.c: In function ‘radv_tex_wrap’:
../src/amd/vulkan/radv_sampler.c:50:1: error: control reaches end of non-void function [-Werror=return-type]
50 | }
| ^
../src/amd/vulkan/radv_sampler.c: In function ‘radv_tex_compare’:
../src/amd/vulkan/radv_sampler.c:76:1: error: control reaches end of non-void function [-Werror=return-type]
76 | }
| ^
Fixes: 4de305cb8a
radv: move sampler related code to radv_sampler.c
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
(cherry picked from commit ca47138fb1)
Memory can be free before images it is bound to. When unmapping the
CCS range in the AUX-TT, we cannot rely on the anv_bo::offset field
because the anv_bo might have been freed.
Just save the mapping address/size and use those values at unmapping
time.
Fixes an assert on CI with :
dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e519e06f4b ("anv: add missing alignment for AUX-TT mapping")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27304>
(cherry picked from commit 9d31680e79)
- no need to flush anything before as we're working on a clean
buffer (SI_OP_SKIP_CACHE_INV_BEFORE)
- L2 must be flushed after the job to avoid rendering artifacts.
Instead of setting it manually, use SI_OP_SYNC_AFTER +
SI_COHERENCY_NONE.
Fixes: 1a99f50c7f ("radeonsi: use a compute shader to convert unsupported indices format")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27095>
(cherry picked from commit cce5920025)
This fixes#9807 but I don't understand why.
Emitting cache flushes before VGT_PRIMITIVE_TYPE is what makes
the problem go away but changing the order in si_draw() is clearer.
The only cases where sctx->flags is modified in si_emit_draw_registers
is handled using si_emit_cache_flush_direct so we can move cache
flushing up without any addtional conditionals.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9807
Fixes: 1e4b539042 ("radeonsi: handle deferred cache flushes as a state (si_atom)")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27095>
(cherry picked from commit 0e16da89fe)
Buffers that are not dedicated can also be used for CCS mapped images,
so they need to be aligned to the AUX-TT requirements.
GTK+ is running into such case where it creates an image with a CCS
modifier. When requesting the alignment through
vkGetImageMemoryRequirements() the 64KB/1MB alignment is returned, but
the binding fails with an assert because the VkDeviceMemory has not
been aligned to the AUX-TT requirement and we cannot disable CCS since
the modifier requires it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4cdd3178fb ("anv: Meet CCS alignment reqs with dedicated allocs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10433
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27258>
(cherry picked from commit e519e06f4b)
When updating an AFBC-packed resource, we need to make sure it is
legalized before blitting the staging resource to it. We can't rely
on the blit to properly convert the resource as it will result in
blit recursion and a crash.
If the whole texture is updated however, there is no need to unpack
as the content can be discarded. Just create a new BO with the right
format.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 33b48a5585 ("panfrost: Add debug flag to force packing of AFBC textures on upload")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27208>
(cherry picked from commit 1aa832e5f5)
There might be a more efficient path when legalizing a resource if
we don't need to worry about its content. For example, it doesn't
make sense to copy the resource content when converting the modifier
if the resource content is discarded anyway.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 33b48a5585 ("panfrost: Add debug flag to force packing of AFBC textures on upload")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27208>
(cherry picked from commit ee77168d57)
The hardware uses the lane index for per-vertex TCS output reads rather
than the vertex index. Fortunately, it's a pretty easy calculation to
go from one to the other.
Fixes: abe9c1fea2 ("nak: Add NIR lowering for attribute I/O")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27284>
(cherry picked from commit 99ef70d8aa)
VK_ACCESS_2_SHADER_STORAGE_READ_BIT specifies read access to a
storage buffer, physical storage buffer, storage texel buffer, or
storage image in any shader pipeline stage.
Any storage buffers or images written to must be invalidated and
flushed before the shader can access them.
This fixes the following tests on LNL:
- dEQP-VK.synchronization2.op.single_queue.barrier.write\*_specialized_access_flag
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27212>
(cherry picked from commit 3e93ccbc1b)
No driver supports urol/uror on all bit sizes. Intel gen11+ only for 16
and 32 bit, Nvidia GV100+ only for 32 bit. Etnaviv can support it on 8,
16 and 32 bit.
Also turn the `lower` into a `has` option as only two drivers actually
support `uror` and `urol` at this momemt.
Fixes crashes with CL integer_rotate on iris and nouveau since we emit
urol for `rotate`.
v2: always lower 64 bit
Fixes: fe0965afa6 ("spirv: Don't use libclc for rotate")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (Intel and nir): Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27090>
(cherry picked from commit f2b7c4ce29)
When building the CFG the instructions are taken of the list in
fs_visitor and added to the lists inside each block. The single
"exec_node" in the instruction is used for those memberships.
In the case the pass rebuilt the CFG, it had no instructions, so
calculate_cfg() had nothing to work with. For now fix the bug by
pulling all the instructions back to the original list.
We can do better here, but punting until upcoming work on
CFG itself.
Issue found in an unpublished CTS test. Small reproduction in our
unit tests now enabled.
Fixes: 65237f8bbc ("intel/fs: Don't add MOV instructions to DO blocks in combine constants")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27131>
(cherry picked from commit 4dbf9181cd)
We're missing the ISA code in renderdoc. You can reproduce with the
Sascha Willems graphics pipeline demo.
The change is large here because we have to fix a confusion between
anv_shader_bin & anv_pipeline_executable. anv_pipeline_executable is
there as a representation for the user and multiple
anv_pipeline_executable can point to a single anv_shader_bin.
In this change we split the anv_shader_bin related logic that was
added in anv_pipeline_add_executable*() and move it to a new
anv_pipeline_account_shader() function.
When importing RT libraries, we add all the anv_pipeline_executable
from the libraries.
When importing Gfx libraries, we add the anv_pipeline_executable only
if not doing link time optimization.
anv_shader_bin related properties are added whenever we're importing a
shader from a library, compiling or finding in the cache.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3d49cdb71e ("anv: implement VK_EXT_graphics_pipeline_library")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26594>
(cherry picked from commit 58c9f817cb)
This fixes VUID-vkCmdDraw-None-08600 violation when running gpl cts:
dEQP-VK...graphics_library.misc.bind_null_descriptor_set.*, where the
final pipeline layout is falsely dropped, leading to incompatible with
the pipeline layout of the bound descriptor set.
Fixes: a65ac274ac ("venus: Do pipeline fixes for VK_EXT_graphics_pipeline_library")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27054>
(cherry picked from commit 80a5df16fe)
The panfrost driver now makes an ioctl to retrieve some new memory
parameters, and DRM_PANFROST_PARAM_MEM_FEATURES is required (does not
default in the caller). This caused drm-shim to stop working. This
patch adds some defaults to get drm-shim working again.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 91fe8a0d28 ("panfrost: Back panfrost_device with pan_kmod_dev object")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27162>
(cherry picked from commit a50b2f8f25)
GFX10 has a hw bug and it can't handle 0-sized index buffer. The
non-indirect draw path was fine but not the indirect path where RADV
emits the index buffer.
This fixes flakes with dEQP-VK.*maintenance6* on NAVI14, and possibly
GPU hangs if there is an indirect draw with a valid index buffer right
before because it would re-use the same index buffer.
Fixes: db9816fd66 ("radv: add support for NULL index buffer")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27142>
(cherry picked from commit 783e3c096f)
On chordal graphs, a greedy coloring can be done in a way that never uses
more colors than are required for the largest clique. However, since we
have vector values and force phi resources into the same spill slots, the
interference graphs are not chordal, and thus, this assumption doesn't hold.
Use twice as many spill slots as upper bound.
Totals from 10 (0.01% of 79242) affected shaders: (GFX11)
MaxWaves: 52 -> 54 (+3.85%)
Instrs: 271386 -> 271779 (+0.14%)
CodeSize: 1362544 -> 1365432 (+0.21%)
VGPRs: 2536 -> 2532 (-0.16%)
SpillVGPRs: 778 -> 818 (+5.14%)
Scratch: 73472 -> 76800 (+4.53%)
Latency: 3331718 -> 3328798 (-0.09%); split: -0.14%, +0.05%
InvThroughput: 1665860 -> 1643350 (-1.35%); split: -1.40%, +0.05%
VClause: 3292 -> 3329 (+1.12%); split: -0.06%, +1.18%
Copies: 46082 -> 46257 (+0.38%)
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27011>
(cherry picked from commit e3098bb232)
If someone were to remove the libraries that are needed for these,
`default` would simply not enable these tests, and the only thing we
could notice is that test jobs would suddenly take less time to run.
Instead, let's have a check to make sure dEQP's cmake has detected
everything and enabled these platforms.
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27041>
(cherry picked from commit 27a1b4e4f3)
DOOM Eternal builds acceleration structures with inactive primitives and
tries to make them active in later AS updates. This is disallowed by the
spec and triggers a GPU hang. Fix the hang by working around the bug.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27034>
(cherry picked from commit a9831caa14)
Commit 73eecffabd ("panvk: Use the vk_pipeline_layout base struct")
reworked the panvk logic to use vk_pipeline_layout, which contains the
number of descriptor set layout referenced by a pipeline layout, thus
deprecating panvk_pipeline_layout::num_sets.
Make panvk_fill_non_vs_attribs() use vk_pipeline_layout::set_count
instead of panvk_pipeline_layout::num_sets and kill the latter so we
can't introduce new users.
Fixes: 73eecffabd ("panvk: Use the vk_pipeline_layout base struct")
Cc: mesa-stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Constantine Shablya <constantine.shablya@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27107>
(cherry picked from commit b18bfed2c5)
'debian/x86_64_build' job needs 'debian/x86_64_build-base' job, but 'debian/x86_64_build-base' is not in any previous stage
Fixes: f298a0e709 ("ci: make sure we evaluate the python-test rules first")
Fixes: 2c9fdaa830 ("ci: fix python-test dependency error on merge requests")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27042>
(cherry picked from commit 2ce0b5ab0a)
For instance, this issue is triggered with
vs-to-fs-overlap.shader_test -auto -fbo:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x7fe64f58e9a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7fe642ca2839 in _mesa_symbol_table_ctor ../src/mesa/program/symbol_table.c:286
#2 0x7fe642ff003d in gl_nir_cross_validate_outputs_to_inputs ../src/compiler/glsl/gl_nir_link_varyings.c:728
#3 0x7fe642d7c7d8 in gl_nir_link_glsl ../src/compiler/glsl/gl_nir_linker.c:1357
#4 0x7fe642be6931 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:562
#5 0x7fe642be6931 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:944
#6 0x7fe642acab55 in link_program ../src/mesa/main/shaderapi.c:1336
#7 0x7fe642acab55 in link_program_error ../src/mesa/main/shaderapi.c:1447
#8 0x7fe6424aa389 in _mesa_unmarshal_LinkProgram src/mapi/glapi/gen/marshal_generated2.c:1911
#9 0x7fe641fd912b in glthread_unmarshal_batch ../src/mesa/main/glthread.c:139
#10 0x7fe641f48d48 in util_queue_thread_func ../src/util/u_queue.c:309
#11 0x7fe641fa442a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
Fixes: 7d1948e9b5 ("glsl: implement cross_validate_outputs_to_inputs() in nir linker")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27071>
(cherry picked from commit bacace8634)
Vivante hardware handles 64bpp render targets and samplers in a odd way
by splitting the buffer and using a pair of texture samplers or a pair
of MRT outputs to access those resources. This isn't implemented in the
driver right now, so we should not advertise support for those formats.
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26982>
(cherry picked from commit e481c1269c)
Prior to 06b526de, the mesa format was used for these completeness checks.
That was to address the case where a *different* internal format selected
the *same* mesa format, and the texture shouldn't be considered compatible.
But this didn't address the case where the *same* internal format selected
a *different* mesa format, e.g. because the type passed to the TexImage
API was different.
An old WGL demo app called TexFilter.exe tries to redefine a mipped RGBA16
texture as RGBA8. This incorrect logic caused Mesa to try to copy the RGBA16
data from the smaller mips into the newly created RGBA8 data, because it
thought that the texture was still mip-complete, despite the format changing.
Cc: mesa-stable
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27023>
(cherry picked from commit 4cb9c77e8e)
ring_seqno_valid indicates a successful ring cmd submission, and can be
used to avoid invalid reply decoding due to failed submit alloc.
Otherwise, the garbled VkResult will mislead into initialization failure
instead of oom.
Below cts failure is fixed:
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic
Fixes: ec131c6e55 ("venus: use instance allocator for ring allocs")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27026>
(cherry picked from commit ecd50e70d4)
The max waves for RT prolog need to be recalculated after merging the
resource usage of all shaders invoked from it.
Note that there is no need to panic, as the info was only used to
calculate maximum scratch size and with the RT prolog being low
footprint, this likely only caused overestimation rather than
underestimation.
Fixes: 533ec9843e ("radv: Precompute shader max_waves.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26998>
(cherry picked from commit 63827751e1)
This was broken when I added texcoord support, the problem is that we
failed to properly count the number of used fs inputs and thus we failed
to make the proper decision when to reuse the color varying slot
Also fix the error messages, they were incorrect after the rewrite as
well. This fixes a bunch of piglits.
Fixes: d4b8e8a481
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27003>
(cherry picked from commit 53c17d85ab)
# Bring artifacts back from the NFS dir to the build dir where gitlab-runner
# will look for them.
cp -Rp /nfs/results/. results/
if[ -f "${STRUCTURED_LOG_FILE}"];then
cp -p ${STRUCTURED_LOG_FILE} results/
echo"Structured log file is available at https://${CI_PROJECT_ROOT_NAMESPACE}.pages.freedesktop.org/-/${CI_PROJECT_NAME}/-/jobs/${CI_JOB_ID}/artifacts/results/${STRUCTURED_LOG_FILE}"
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.