This has been confirmed to fix sporadic graphics corruption on Gen12
platforms for a number of workloads (including Heaven, Valley and
CS:GO among others). Corruption seems to occur during context switch
fairly consistently, but unfortunately this problem doesn't seem to be
documented. Until the hardware team comes up with a better
workaround, fix the problem by reemitting constants at the beginning
of each batch.
No corruption has been observed so far in GL due to preemption,
however this is a possibility to keep in mind, it may be necessary to
disable preemption in addition to this patch in order to fully address
this problem (see also 81201e4617).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4412
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4454
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 20e2c7308f)
.pick_status.json: Mark 6a29632dd2 as denominated
.pick_status.json: Mark bd1705a480 as denominated
.pick_status.json: Mark 366fb28dac as denominated
.pick_status.json: Mark 0f7379e308 as denominated
.pick_status.json: Mark cff5c40fc3 as denominated
.pick_status.json: Mark c0c03f29e0 as denominated
.pick_status.json: Mark b6b3b38434 as denominated
.pick_status.json: Mark 5a340c0929 as denominated
.pick_status.json: Mark 226c7ae2a8 as denominated
.pick_status.json: Mark 8b44e45347 as denominated
.pick_status.json: Mark 3d3f21f0be as denominated
.pick_status.json: Mark 3436e5295b as denominated
.pick_status.json: Mark 2c02740a8c as denominated
.pick_status.json: Mark d4f21b53f2 as denominated
.pick_status.json: Mark aa5d38decd as denominated
.pick_status.json: Mark f4a7dbc58f as denominated
.pick_status.json: Mark 799a931d12 as denominated
.pick_status.json: Mark 90632ae7b3 as denominated
.pick_status.json: Mark f7acdb1d1d as denominated
.pick_status.json: Mark a5d5cbdf08 as denominated
.pick_status.json: Mark 9413c6aec3 as denominated
.pick_status.json: Mark fe5349f70c as denominated
.pick_status.json: Mark 961361cdc9 as denominated
.pick_status.json: Mark 0845cabc72 as denominated
.pick_status.json: Mark 0a7a61b2d7 as denominated
.pick_status.json: Mark 3720c6a6f6 as denominated
.pick_status.json: Mark 15e24850b7 as denominated
.pick_status.json: Mark b60bc59180 as denominated
.pick_status.json: Mark 08fdaec473 as denominated
.pick_status.json: Mark 30bc562bda as denominated
.pick_status.json: Mark 505d176a8e as denominated
The parameters GL_TEXTURE_MIN_LOD, GL_TEXTURE_MAX_LOD,
GL_TEXTURE_MAX_ANISOTROPY_EXT, GL_TEXTURE_LOD_BIAS are stored as floats but
returned as integers. Setting their values outside of the integer range results
has undefined behaviour when the c-runtime method lroundf converts the value
back to an integer.
Fixes: 53c36dfc('replace IROUND with util functions')
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10244>
(cherry picked from commit 55fb9417a6)
We were never setting set->size, so we were always copying 0 bytes. But
as we only copy the contents when the layout and therefore the size is
the same, we don't have to take the old size into account anyway.
This fixes some VK_EXT_robustness2 tests that use push descriptors.
Fixes: 6d4f33e ("turnip: initial implementation of VK_KHR_push_descriptor")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
(cherry picked from commit cb02a48f83)
We forgot to remove the instruction under consideration from instr_list
before inserting it into the block's list, which caused instr_list to
become corrupted. This happened to work but caused further corruption in
some rare scenarios.
Fixes: adf1659 ("freedreno/ir3: use standard list implementation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
(cherry picked from commit 8e11f0560e)
For these caps, we need to check all stages to be sure we've got things
right.
Again, this is probably benign, because LLVMpipe should support the same
value for all stages.
Fixes: b38879f8c5 ("vallium: initial import of the vulkan frontend")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10189>
(cherry picked from commit ffe534f27b)
Conflicts:
src/gallium/frontends/lavapipe/lvp_device.c
We should really check for the minimum of all supported vertex-stages
here, not just the vertex-shader.
This shouldn't make any real-world difference, because we really only
support LLVMpipe here, and that driver has the same limits for all
stages. But it seems better to actually check all stages instead of just
assuming.
Fixes: b38879f8c5 ("vallium: initial import of the vulkan frontend")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10189>
(cherry picked from commit d91a549b67)
Conflicts:
src/gallium/frontends/lavapipe/lvp_device.c
This seems arbitrary, and makes us check for PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS
instead of PIPE_SHADER_CAP_MAX_SHADER_IMAGES, which isn't what we want.
The end result is that we accidentally exposed 128 shader images,
instead of 16. This can lead to us writing outside of the array of
shader images in llvmpipe_set_shader_images, among other bad things.
Fixes: b38879f8c5 ("vallium: initial import of the vulkan frontend")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10189>
(cherry picked from commit 83de54f6a6)
.pick_status.json: Update to 165a69d2f7
.pick_status.json: Mark 32eb74e1e1 as backported
.pick_status.json: Mark e4ef5f0433 as backported
.pick_status.json: Mark 363c1ef0c0 as backported
When formatting the error here, we're currently casting an
ast_type_qualifier as a string.
But we don't need to use a string here at all, because we know from
context exactly what qualifier we're talking about, because the
if-statements explicitly check for the uniform-qualifier.
So let's just hard-code the format-string to reference the right
qualifier instead of the string-shenanigans. The latter cannot do the
right thing.
Fixes: 2d03f48a65 ("glsl: Add parsing for GLSL uniform blocks.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9911>
(cherry picked from commit 437ed05708)
This breaks CLOn12's handling of CL CTS test_basic vector_creation for char3 (at least).
Removing this cast causes us to try to load from a deref with no alignment info.
Fixes: 99bb2a4d ("nir/opt_deref: Don't remove casts with alignment information")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
(cherry picked from commit 4b69ae8e1e)
bo_free is called on external BOs when there are no objects left which
reference them. The function unmaps the address range associated with
any maps which occured. However, if the BO is busy (not idle), it
doesn't mark the pointer to the start address as invalid. This can lead
to a segfault later on.
At the end of bo_free, these BOs are still present in the handle hash
table. If such a BO is reused (i.e., when a DMABUF with the same handle
is reimported) and the driver attempts to get another mapping, the
bufmgr will incorrectly assume that the map pointer is still valid and
reuse it. This leads to a segfault. Set the pointer to NULL to mark it
as invalid.
Enables iris to run and pass the piglit test,
ext_image_dma_buf_import-reimport-bug.
Cc: mesa-stable
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9230>
(cherry picked from commit 0092219cfe)
This was hidden originally to workaround a bug in the RK3399 display
driver. The patch resolving this issue has been merged in the upstream
kernel, and in fact...
1. The issue was visible on 21.0 even with this workaround under certain
configurations (sway with an external monitor).
2. Even on buggy kernels, due to other platform details this is an
obscure bug to hit (not aware of any ways to trigger it OOTB with
current userspaces other than sway with an external monitor)
So why bother? Let's just delete the hack and let AFBC be used freely.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10148>
(cherry picked from commit e00d94f14f)
.pick_status.json: Mark fe53c22294 as backported
.pick_status.json: Mark 61cf77583a as backported
.pick_status.json: Mark 30cf07cc8a as backported
.pick_status.json: Mark abc724e440 as backported
.pick_status.json: Mark 0a939e788f as backported
.pick_status.json: Mark 3257ab9f23 as backported
.pick_status.json: Mark 33d87eeb5a as backported
.pick_status.json: Mark 1df3a00dcd as backported
Enabling this fast path, while making CPU side simpler
(and thus reducing driver's CPU overhead), forces us to
generate multiple VERTEX_BUFFER_STATE packets pointing to
the same buffer, but with slightly shifted BufferStartingAddress.
This is fine on GFX version >= 11 and < 8, but on version
8 and 9, thanks to internal details of the VF cache, vertex
data from each VERTEX_BUFFER_STATE is cached separately
and this results in a lot of cache misses.
Disabling this fast path restores previous performance levels.
On my SKL GT2 machine this improves performance in:
- GLB27 Egypt offscreen by 37%
- GLB27 TRex offscreen by 22%
- gfxbench5 Manhattan offscreen by 1.75%
- gfxbench5 Manhattan31 offscreen by 1.9%
- gfxbench5 Aztec Ruins high by 2.3%
In Intel performance CI on GFX version 9 it improves performance in:
- gfxbench5 Manhattan offscreen by 1.65%
- gfxbench5 Aztec Ruins normal offscreen by 1.33%
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3277
Fixes: 42842306d3 ("mesa,st/mesa: add a fast path for non-static VAOs")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10175>
This is needed since we're about to reinstate the fencing mechanism on
screen destruction. Until we figure out another way to handle it, this
will cause hangs on exit with the shim.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable # 21.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8867>
(cherry picked from commit 0464117ad9)
Conflicts:
.gitlab-ci/run-shader-db.sh
Because of nir_to_tgsi, LLVM is not required here.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: d0f8fe5909 ("softpipe: Switch to using NIR as the shader format from mesa/st.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10072>
(cherry picked from commit 49cb1fac13)
Conflicts:
src/gallium/auxiliary/draw/draw_pipe_aaline.c
src/gallium/auxiliary/draw/draw_pipe_aapoint.c
src/gallium/auxiliary/draw/draw_pipe_pstipple.c
Fixes a build failure when building any Vulkan driver for the X11
platform with -Dxlib-lease=disabled. For example:
/usr/bin/ld: src/vulkan/wsi/libvulkan_wsi.a(wsi_common_x11.c.o): in function `wsi_x11_detect_xwayland':
src/vulkan/wsi/wsi_common_x11.c:123: undefined reference to `xcb_randr_query_version_unchecked'
Fixes: 1de2fd0cf2 "wsi/x11: Always link against xcb-xrandr"
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9972>
(cherry picked from commit 2470bcb946)
We don't check whether there's an intervening write in this pass, which
makes it incorrect. ir3_cp_postsched does check correctly, but we were
accidentally doing it here anyway for some sources.
While we're here, delete some code that was only used in the array case.
Fixes: f370e954 ("freedreno/ir3: handle const/immed/abs/neg in cp")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>
(cherry picked from commit 5a70c4d4a0)
The commit below was already meant to do this, but accidentally missed
this part.
Fixes stutter when the frame-rate drops below the refresh rate.
Fixes: e8f50bd600 "wsi/x11: Treat IMMEDIATE present mode the same as
MAILBOX for Xwayland"
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10026>
(cherry picked from commit 8ec530d982)
To fix synchronization issue between multimedia queue and gfx queue.
Adding flush call will let multimedia queue to wait for the content of gfx
command buffer to be executed, for the case where there is dependency
between these two queues.
Fixes: 2f50dea218 ("radeonsi: always use a staging texture for linear 1D textures in VRAM")
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 27209e63ea)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9995>
We don't ever want drisw path picking zink as the driver,
we can revisit this when the penny wrapper work gets further
along.
This selection causes systems with nvidia/intel dual-gpus
to try and pick the intel gpu for rendering in the nvidia
context if there is no nvidia GL driver or accel doesn't work.
This is a partial revert of the original commit.
Fixes: 4a3b42a717 ("drisw: Prefer hardware-layered sw-winsys drivers over pure sw")
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9816>
(cherry picked from commit 3e1698fe1b)
texture_index is meaningless when a tex_instr has deref src.
Use var->data.binding instead.
This fixes the incorrect lowering on radeonsi where the same
lowering steps was applied to all tex_instr based on the needs
of the first one (since texture_index is always 0).
CC: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9931>
(cherry picked from commit bc438c91d9)
Commits like the following changed the script names and distro tag
but didn't update the documentation. We do not explicitely mention
script names because they will likely change in the future but the
distro tag is less likely to change because it is shared with the
upstream ci-templates repo.
Fixes: af7dca3560 ("ci: Update the ci-templates commit.")
Fixes: 506e9d5fc7 ("gitlab-ci: Rename container install scripts to ...")
Fixes: c6c7652753 ("gitlab-ci: Organize images using new REPO_SUFFIX ...")
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9781>
(cherry picked from commit 8371b75241)
When we encounter a bindless image here, lower_deref returns a
NULL-pointer, and calling record_images_used will try to dereference
that NULL-pointer.
So let's dig out the var from the source instruction instead of the
result of the lowering.
Fixes: 5910c938a2 ("nir/glsl: gather bitmask of images used by program")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9895>
(cherry picked from commit 89a04a54c4)
In some cases we will change the type of the destination register of
an instruction. This is the type we should use to verify that we're
allow to do the replacement.
Otherwise we can hit restrictions on CHV and upcoming Xe-Hp for
instance where the copy propagation transforms this :
send(16) (mlen: 2) vgrf10:UD, 0u, 0u, vgrf35:D, null:UD
mov(16) vgrf11:UW, vgrf10<2>:UW
mov(16) vgrf12:UW, vgrf10+0.2<2>:UW
mov(16) vgrf15:HF, |vgrf11|:HF
mov(16) vgrf16:HF, |vgrf12|:HF
mov(8) vgrf41<2>:UW, vgrf15+0.0:UW group0
mov(8) vgrf42<2>:UW, vgrf15+0.16:UW group8
mov(8) vgrf45<2>:UW, vgrf16+0.0:UW group0
mov(8) vgrf46<2>:UW, vgrf16+0.16:UW group8
into this :
send(16) (mlen: 2) vgrf10:UD, 0u, 0u, vgrf35:D, null:UD
mov(8) vgrf41<2>:HF, |vgrf10+0.0|<2>:HF group0
mov(8) vgrf42<2>:HF, |vgrf10+1.0|<2>:HF group8
mov(8) vgrf45<2>:HF, |vgrf10+0.2|<2>:HF group0
mov(8) vgrf46<2>:HF, |vgrf10+1.2|<2>:HF group8
Because of the floating point use, stride and offets should be the
same.
v2: Fix final destination type selection (Curro)
v3: constify (Curro)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9832>
(cherry picked from commit aa53665fda)
The indirect draw call already encodes the index bias so that no
additional encoding in the hardware is needed in this case.
This fixes a regression with a number of tests from
dEQP-GLES31.functional.draw_indirect.random.*
Fixes: c6c532faa8
"gallium/u_vbuf: use updated pipe_draw_start_count while using draw_vbo"
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9877>
(cherry picked from commit acdf1a1234)
This changes how the L3 cache affinity code works out the affinity
masks. It works better with multi-CPU systems and should also be
capable of handling big/little type situations if they appear in
the future.
It now iterates over all CPU cores, gets the core count for each
CPU, and works out the L3_ID from the physical CPU ID, and
the current cores L3 cache. It then tracks how many L3 caches
it has seen and reallocate the affinity masks for each one.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4496
Fixes: d8ea509965 ("util: completely rewrite and do AMD Zen L3 cache pinning correctly")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9782>
(cherry picked from commit 11d2db17c5)
We don't want to expose an EGL device for a display-only DRM devices
(like VKMS). For these DRM devices we have a separate software-rendering
device (the first in the list, always present).
There is a similar check in _eglAddDRMDevice, however it will be
removed in a future commit to allow split render/display devices
to be properly added. We can't figure out whether we're on a split
render/display system before loading the driver.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
(cherry picked from commit e39d72aec2)
On the EGL DRM platform, call _eglAddDevice with the software flag
set if GBM has loaded a software driver. This allows _eglAddDevice
to make the difference between llvmpipe and kmsro.
This is important on split render/display SoCs: we don't want to
advertise EGL_MESA_device_software on these systems.
Completely drop disp->Options.ForceSoftware, because GBM is
responsible for choosing software rendering and doesn't take this
hint into account.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4178
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
(cherry picked from commit 08a51770bd)
"plane[0].i32" is the plane being lowered, it's not the sampler we're looking
for.
It worked when there's a single sampler because, eg for NV12, plane[0].i32 for
the UV plane would be 1 and the added ":uv" sampler would also land at binding
point 1.
Fixes: 079e5f73d7 ("mesa/st: rewrite src var when lowering tex_src_plane")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9812>
(cherry picked from commit 6298347ec7)
Windows doesn't actually distribute a full TLS CA certificate store, but
pulls them in over time with Windows Update. Try to prime it by manually
pulling the certificates and installing them.
This bumps the Windows tag to force a rebuild.
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9618>
(cherry picked from commit e6aacec9e1)
Conflicts:
.gitlab-ci.yml
LLVM is moving to the 13 release, but LLVM-SPIRV is still so in the past.
Given that LLVM 12.0.0 is still not out (we are at 12.0.0-rc1 today),
use the `release/12.x` branch for LLVM.
We should also tag LLVM-SPIRV, but... it seems that they haven't caught up
yet, so keep using the master branch, but add a note for a future
committer.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8740>
(cherry picked from commit 8deca5a72e)
The new registry caching in place for registry.fd.o can not handle layers
bigger than 5 GB. The last layer we used to build on windows was 5.2 GB,
meaning that the upload would fail.
Split the layers by calling multiple `RUN`, hoping that the size will be
roughly split between those steps if we have a special layer for VS2019.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8740>
(cherry picked from commit a69ab2ae36)
Since the child process doesn't call exec(), exit() attempted to run
atexit handlers registered by the parent process. This could result in
the child process hanging in exit() if there were still disk cache
threads alive when the parent process called fork(). (The CI runners
hit this multiple times when running tests in strace)
Fixes: 6a246f5c6d "aco/tests: Fix deadlock for too large test lists"
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9472>
(cherry picked from commit d411691965)
this was only implemented for textures (I assume because drivers which implement
the corresponding intrinsic don't support multisampled images), but it's also
used for shader images
Fixes: 22fdb2f855 ("nir/spirv: Update to the latest revision")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9682>
(cherry picked from commit 50881d59e6)
gl_MaxVaryingFloats was not removed from core until 4.20 and is still
available in compat shaders. Found while writing some new CTS to test
the correct declarations of this constant.
Fixes: 0ebf4257a385i ("glsl: define some GLES3 constants in GLSL 4.1")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9514>
(cherry picked from commit 684f97de80)
Starting with d0d039a4d3, we emit writes to the push constant chunk
of the payload to stomp out-of-bounds data to zero for Vulkan. Then, in
369eab9420, we started emitting shader preamble code for emulated
push constants on Gen12.5 parts. In either of these cases, we can run
into issues if we don't have a proper live range for some of the payload
registers where they get used for something and then smashed by our push
handling code. We've not seen many issues with this yet because it only
happens when you have dead push constants.
Fixes: d0d039a4d3 "anv: Emit pushed UBO bounds checking code..."
Fixes: 369eab9420 "intel/fs: Emit code for Gen12-HP indirect..."
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9501>
(cherry picked from commit b9e9f92f73)
si_check_render_feedback only relied on si_images::enabled_mask and
si_samplers::enabled_mask to determine if a texture was being used
both as input and output.
Given that some samplers/images can be considered active (so accounted
for by enabled_mask) but not used by the current shader this could
lead to false-positive.
This commit fixes this by and-ing the above mask with the information
from shader_info for each active shader.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4227
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9525>
The container is moved from before and hence returns size 0. To get the
correct value, the new instruction container must be used instead.
This was flagged by clang-tidy. The fixed call still triggers the
corresponding diagnostic, hence this change silences it by adding a
redundant clear() after move.
Fixes: 7f1b537304 ("aco: add new NOP insertion pass for GFX6-9")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9432>
(cherry picked from commit 97c97781f6)
An earlier commit tried to make this shader compatible with GLSL 3.30,
but it requires, GL_ARB_gpu_shader_int64, which requires GLSL 4.00 and
GL 4.0 according to the extension spec. So we were failing to enable
the required extension, breaking compilation of this shader.
The original intention of that patch was to get this working on zink,
which at the time only supported GL 3.3. But now it supports later
OpenGL versions, so we don't need to do this any longer. Rather than
revert the patch and raise the version all the way back to 430, just
bump it to the require 400 at Ian Romanick's suggestion.
Fixes: 4d47b22bf0 ("glsl/float64: make this compatible with glsl 330")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3991
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9351>
(cherry picked from commit a48151ffad)
We need to iterate the whole row, we can't be clever and only look at
one side, the symmetry doesn't work like that. See the original paper.
total instructions in shared programs: 69392 -> 69322 (-0.10%)
instructions in affected programs: 9002 -> 8932 (-0.78%)
helped: 82
HURT: 28
Instructions are helped.
total bundles in shared programs: 32225 -> 32155 (-0.22%)
bundles in affected programs: 4286 -> 4216 (-1.63%)
helped: 82
HURT: 28
Bundles are helped.
total quadwords in shared programs: 56102 -> 56102 (0.00%)
quadwords in affected programs: 0 -> 0
helped: 0
HURT: 0
total registers in shared programs: 4552 -> 4572 (0.44%)
registers in affected programs: 298 -> 318 (6.71%)
helped: 18
HURT: 38
Registers are HURT.
total threads in shared programs: 3772 -> 3775 (0.08%)
threads in affected programs: 84 -> 87 (3.57%)
helped: 15
HURT: 14
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 0 -> 0
spills in affected programs: 0 -> 0
helped: 0
HURT: 0
total fills in shared programs: 0 -> 0
fills in affected programs: 0 -> 0
helped: 0
HURT: 0
Fixes: 66ad64d73d ("pan/midgard: Implement linearly-constrained register allocation")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9338>
(cherry picked from commit 4f969d796d)
I noticed that we were hitting this before st_create_context() called
util_cpu_detect() and so num_cpu_mask_bits was zero. But there is no
harm in calling util_cpu_detect(), so lets just call it here to be safe.
Fixes: d877451b48 ("util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>
(cherry picked from commit 9fb9019beb)
libpanfrost_lib depends on libpanfrost_bifrost for 'bifrost_compile_shader_nir' symbol
libpanfrost_lib depends on libpanfrost_bifrost_disasm for 'disassemble_bifrost' symbol
LOCAL_STATIC_LIBRARIES requires proper ordering to make the symbols available
Fixes the following building error happening with Android P:
FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/gallium_dri_intermediates/LINKED/gallium_dri.so
external/mesa/src/panfrost/lib/decode.c:534: error: undefined reference to 'disassemble_bifrost'
external/mesa/src/panfrost/lib/pan_shader.c:145: error: undefined reference to 'bifrost_compile_shader_nir'
Cc: 20.3 21.0 <mesa-stable@lists.freedesktop.org>
Fixes: 166630f ("android: pan/bi: Separate disasm/compiler targets")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9265>
(cherry picked from commit 97b7786e6b)
Before we introduced the submission thread in 829699ba63, once we
returned from vkQueueSubmit, all signaled syncobj would have a
i915_request/dma-fence waiting to be signaled by some work that would
submitted to HW by i915.
After this submission thread that is no longer the case. We added a
few checks in places like vkQueuePresentKHR() to wait for the binary
semaphores to materialize before we would hand things over to the WSI
code.
Unfortunately 829699ba63 forgot to reset the signaled binary
semaphore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 829699ba63 ("anv: implement shareable timeline semaphores")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4276
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9188>
(cherry picked from commit cb74cd816c)
The blob advertises 64 for this, so let's use the same value. Small
alignments are observed to raise an IMPRECISE_FAULT at least on Bifrost.
As a bonus this forces cache line alignment which will help perf. Fixes
dEQP-GLES31.functional.texture.texture_buffer.render.as_vertex_texture.offset_1_alignments
Fixes: 5f7bafa316 ("panfrost: Enable ARB_texture_buffer_object")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9164>
(cherry picked from commit 3f21b089f8)
fmin(-A, -B) is -fmax(A, B), and fmax(-A, -B) is -fmin(A, B). Therefore
the logic joining A and B should toggle between ior and iand for the
negated versions.
At the very least, a shader from Euro Truck Simulator 2 in shader-db is
affected by this. The KIL instruction in the (ARB assembly) shader ends
up with the wrong logic. This is _probably_ the source of
https://gitlab.freedesktop.org/mesa/mesa/-/issues/1346.
That said, the issue mentions that Mesa 18.0.5 works, but commit
68420d8322 ("nir: Simplify min and max of b2f") was added in 17.3.
Moreover, I was not able to reproduce the error in the ETS2 shader from
shader-db from any Mesa commit near the time the original fd.o bugzilla
was submitted (December 2018). 🤷
In fact, the current error in that shader starts with 9167324a86
("nir/algebraic: Mark some logic-joined comparison reductions as
exact"). That's a bit of a red herring as 9167324a86 just sets off a
chain of replacements that eventually leads to the incorrect min/max of
b2f patterns fixed by this commit.
The other affected shaders in the shader-db results are from Cargo
Commander. These are also ARB assembly shaders.
I think any ARB assembly shader that uses the pattern
SLT r0, ...;
...
KIL -r0;
will suffer from issues related to this.
This change fixes the piglit
tests/spec/arb_fragment_program/kil-of-slt.shader_test test added in
https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/454.
shader-db results:
All Gen6+ platforms had similar result. (Ice Lake shown)
total instructions in shared programs: 20034604 -> 20034486 (<.01%)
instructions in affected programs: 3885 -> 3767 (-3.04%)
helped: 47
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 2.64 x̃: 2
helped stats (rel) min: 2.33% max: 8.33% x̄: 3.48% x̃: 3.39%
HURT stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3
HURT stats (rel) min: 13.64% max: 16.67% x̄: 15.15% x̃: 15.15%
95% mean confidence interval for instructions value: -2.83 -1.99
95% mean confidence interval for instructions %-change: -3.84% -1.60%
Instructions are helped.
total cycles in shared programs: 979881379 -> 979879406 (<.01%)
cycles in affected programs: 119873 -> 117900 (-1.65%)
helped: 46
HURT: 3
helped stats (abs) min: 10 max: 756 x̄: 45.41 x̃: 26
helped stats (rel) min: 0.53% max: 19.72% x̄: 1.67% x̃: 1.26%
HURT stats (abs) min: 28 max: 56 x̄: 38.67 x̃: 32
HURT stats (rel) min: 1.44% max: 3.54% x̄: 2.75% x̃: 3.27%
95% mean confidence interval for cycles value: -70.83 -9.70
95% mean confidence interval for cycles %-change: -2.23% -0.57%
Cycles are helped.
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8115098 -> 8115076 (<.01%)
instructions in affected programs: 2592 -> 2570 (-0.85%)
helped: 32
HURT: 2
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.88% max: 2.70% x̄: 1.35% x̃: 1.31%
HURT stats (abs) min: 5 max: 5 x̄: 5.00 x̃: 5
HURT stats (rel) min: 17.24% max: 18.52% x̄: 17.88% x̃: 17.88%
95% mean confidence interval for instructions value: -1.15 -0.15
95% mean confidence interval for instructions %-change: -1.83% 1.39%
Inconclusive result (%-change mean confidence interval includes 0).
total cycles in shared programs: 238189718 -> 238189802 (<.01%)
cycles in affected programs: 75076 -> 75160 (0.11%)
helped: 3
HURT: 31
helped stats (abs) min: 2 max: 130 x̄: 44.67 x̃: 2
helped stats (rel) min: 0.18% max: 5.70% x̄: 2.02% x̃: 0.19%
HURT stats (abs) min: 2 max: 70 x̄: 7.03 x̃: 4
HURT stats (rel) min: 0.07% max: 6.41% x̄: 0.53% x̃: 0.15%
95% mean confidence interval for cycles value: -7.27 12.21
95% mean confidence interval for cycles %-change: -0.33% 0.94%
Inconclusive result (value mean confidence interval includes 0).
No fossil-db changes on any Intel platform.
Fixes: 68420d8322 ("nir: Simplify min and max of b2f")
Closes: #1346
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9122>
(cherry picked from commit 7e127c1fca)
Fixes 'nir_tex_src_coord' param was provided to NIR 'txf' operation as a
vec3 for TEXTURE_1D_ARRAY target, causing an assert.
Only following targets require vec3: TEXTURE_2D_ARRAY, TEXTURE_3D,
TEXTURE_CUBE, TEXTURE_CUBE_ARRAY. The rest must use vec2.
Packing layer value into Y-coordinate the same way it was done in
'create_fs' in commit 2bf6dfac.
Fixes: a01ad311 ("st/mesa: Add NIR versions of the PBO upload/download
shaders. ")
Signed-off-by: Yevhenii Kharchenko <yevhenii.kharchenko@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9014>
(cherry picked from commit 1516b6bd9a)
Since inline samplers are uniforms, just like kernel args, and
nir_lower_vars_to_explicit_types will assign driver_location based
on order in the variable list, move the inline samplers to the end
of the list to prevent them from creating gaps in the kernel arg
offsets.
Fixes: ff05da7f ("microsoft: Add CLC frontend and kernel/compute support to DXIL converter")
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9138>
(cherry picked from commit 3ee8f2ccba)
Importantly, also run that before mucking with the variable list via image lowering,
which removes and inserts variables, making the driver_location no longer line up
with metadata.
Fixes: ff05da7f ("microsoft: Add CLC frontend and kernel/compute support to DXIL converter")
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9138>
(cherry picked from commit 9da8179a1e)
We have to choose between:
1) Stop handling two identical GPUs
2) Stop having crashes with other layers active.
3) Fix the Vulkan Loader.
Since nobody seems to want to spend enough effort to do 3 the
effective choice is between 1 and 2. This is choosing 2, as
two identical GPUs is pretty uncommon since crossfire doesn't
work on Linux anyway.
(And it would only work sporadically as the game needs to enable the
extension)
CC: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3801
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 38ce8d4d00)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9217>
The calculate_tess_lds_size function already returns the size in blocks
of the encoding granule, but we forgot to adjust config->lds_size.
This variable is not used to actually set the LDS size used for TCS,
but by ACO to make scheduling decisions.
Fossil DB stats on Sienna Cichlid:
Please note that the +3729.43% is NOT a regression.
The real LDS size used didn't change, it was just reported incorrectly.
Totals from 1342 (0.96% of 139391) affected shaders:
VGPRs: 60880 -> 80240 (+31.80%); split: -0.05%, +31.85%
CodeSize: 3378456 -> 3381224 (+0.08%); split: -0.23%, +0.31%
LDS: 687104 -> 26312192 (+3729.43%)
MaxWaves: 29794 -> 23962 (-19.57%)
Instrs: 644194 -> 644610 (+0.06%); split: -0.32%, +0.39%
Cycles: 2675068 -> 2676804 (+0.06%); split: -0.31%, +0.38%
VMEM: 428840 -> 517418 (+20.66%); split: +22.53%, -1.88%
SMEM: 91831 -> 88587 (-3.53%); split: +5.70%, -9.23%
VClause: 22740 -> 19384 (-14.76%); split: -16.18%, +1.42%
SClause: 19116 -> 18373 (-3.89%); split: -4.34%, +0.46%
Copies: 66662 -> 63448 (-4.82%); split: -5.55%, +0.73%
Fixes: cf89bdb9ba "radv: align the LDS size in calculate_tess_lds_size()"
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9098>
(cherry picked from commit 48f349971f)
They've all supported it since either forever or Iron Lake which is
equivalent to forever for Vulkan.
From Kenneth Graunke's GitLab review:
"Linear blending of depth buffer data is usually fairly nonsense
(something's 2 meters away? another thing's 6 meters away? let's
just report 4 meters?)...but it's definitely a thing we can do, so
we may as well let apps do it, and trust them not when it doesn't
make sense."
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9110>
(cherry picked from commit 56d005c21c)
Test the sampler->conversion for NULL pointer before dereferencing it.
Fixes: Regressions in VulkanCTS.
Fixes: 226316116c "intel/anv: Fix condition to set MipModeFilter for YUV surface"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 69e94e8939)
On Intel platforms before Gen6, there is no min or max instruction.
Instead, a comparison instruction (*more on this below) and a SEL
instruction are used. Per other IEEE rules, the regular comparison
instruction, CMP, will always return false if either source is NaN. A
sequence like
cmp.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F
(+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F
will generate the wrong result for min if g22 is NaN. The CMP will
return false, and the SEL will pick g22.
To account for this, the hardware has a special comparison instruction
CMPN. This instruction behaves just like CMP, except if the second
source is NaN, it will return true. The intention is to use it for min
and max. This sequence will always generate the correct result:
cmpn.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F
(+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F
The problem is... for whatever reason, we don't emit CMPN. There was
even a comment in lower_minmax that calls out this very issue! The bug
is actually older than the "Fixes" below even implies. That's just when
the comment was added. That we know of, we never observed a failure
until #4254.
If src1 is known to be a number, either because it's not float or it's
an immediate number, use CMP. This allows cmod propagation to still do
its thing. Without this slight optimization, about 8,300 shaders from
shader-db are hurt on Iron Lake.
Fixes the following piglit tests (from piglit!475):
tests/spec/glsl-1.20/execution/fs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/fs-nan-builtin-min.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-min.shader_test
Closes: #4254
Fixes: 2f2c00c727 ("i965: Lower min/max after optimization on Gen4/5.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8115134 -> 8115135 (<.01%)
instructions in affected programs: 229 -> 230 (0.44%)
helped: 0
HURT: 1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>
(cherry picked from commit 3c31364f5e)
Mip Mode Filter must be set to MIPFILTER_NONE for Planar YUV surfaces.
Add the missing condition to check for planar format.
Fixes: b24b93d584 "anv: enable VK_KHR_sampler_ycbcr_conversion"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 226316116c)
On Gen7, we have to split shuffles into two MOVs for 64-bit types so we
can't handle source modifiers. On Gen12.5, we have to use integer types
all the time so we can't use them there either. Fixing that will be a
different commit but it interacts with this one.
Fixes: 90c9f29518 "i965/fs: Add support for nir_intrinsic_shuffle"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>
(cherry picked from commit 3ce6ca7214)
We can't move the shuffle to a new block so this only works if the
shuffle and the bcsel are in the same block. Fortunately, in the
motivating case, this is true.
Also, we have to be careful around discard. We could try really hard to
just avoid moving them past discard but we choose to simply bail if we
see a discard instead.
Fixes: 4ff4d4e569 "nir/opt_intrinsic: Optimize bcsel(b, shuffle..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9068>
(cherry picked from commit ceb6986d34)
We were only get guaranteed that libfreedreno (and thus the tracepoints
generation) was ready when we linked, not when we compiled the gmemtool.c
that also used it.
Fixes: a02dcb970f ("freedreno: Add GPU tracepoints")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9056>
(cherry picked from commit eabee821e9)
asan on llvmpipe with piglit tests/spec/arb_gl_spirv/execution/ssbo/array-indirect.shader_test
reported.
=================================================================
==3288325==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f5b2d6513cf in __interceptor_malloc (/lib64/libasan.so.6+0xab3cf)
#1 0x7f5b2a1ae810 in ralloc_size ../src/util/ralloc.c:133
#2 0x7f5b2a1ae7e1 in ralloc_context ../src/util/ralloc.c:120
#3 0x7f5b2b210177 in gl_nir_link_uniform_blocks ../src/compiler/glsl/gl_nir_link_uniform_blocks.c:585
#4 0x7f5b2af7f52d in gl_nir_link_spirv ../src/compiler/glsl/gl_nir_linker.c:614
#5 0x7f5b2a3b76fa in st_link_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:765
#6 0x7f5b2a3ace7b in st_link_shader ../src/mesa/state_tracker/st_glsl_to_ir.cpp:65
#7 0x7f5b2a471165 in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3122
#8 0x7f5b2a97a6d8 in link_program ../src/mesa/main/shaderapi.c:1311
#9 0x7f5b2a97a6d8 in link_program_error ../src/mesa/main/shaderapi.c:1419
#10 0x7f5b2a97df45 in _mesa_LinkProgram ../src/mesa/main/shaderapi.c:1911
#11 0x7f5b299b59e5 in stub_glLinkProgram /mnt/devel/gl/piglit/tests/util/piglit-dispatch-gen.c:33956
#12 0x40a71a in link_and_use_shaders /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:1604
#13 0x415722 in init_test /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:5225
#14 0x4164ce in piglit_init /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:5597
#15 0x7f5b29a214e9 in run_test /mnt/devel/gl/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:73
#16 0x7f5b29a103fe in piglit_gl_test_run /mnt/devel/gl/piglit/tests/util/piglit-framework-gl.c:229
#17 0x407847 in main /mnt/devel/gl/piglit/tests/shaders/shader_runner.c:72
#18 0x7f5b2928f1e1 in __libc_start_main (/lib64/libc.so.6+0x281e1)
SUMMARY: AddressSanitizer: 48 byte(s) leaked in 1 allocation(s).
Fixes: 57239192 ("nir/linker: add gl_nir_link_uniform_blocks.c")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8974>
(cherry picked from commit 14b2dc0013)
This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76.
No changes in fossil-db because there are no shaders in fossil-db that
use shaderInt8. :(
A couple alternatives were considered.
1. Lower 8-bit integers to 16-bit on all platforms. I looked at the
output of a few shaders from the Vulkan CTS, and it was a mess.
There were so many extra type converting MOVs. I think all of that
could be cleaned up, but it would be more work. It would also not be
great for cherry-picking to a stable branch.
This *is* the approach that will be taken Mesa 21.1. See also
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8730.
2. Disable the optimization that prunes the `& 7`. This would be more
optimal in shaders that don't have the explicit mask, but it's not
very future proof. It would potentially require auditing future
optimizations to make sure they don't run afoul of this problem.
In the end, the easiest solution seems to be adding the extra mask to
implement the specified semantics of the NIR shift instructions...
especially since the only shaders we have that use shaderInt8 are from
the CTS.
v2: Use braces in the else part because they were used in the then part.
Suggested by Jason.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 26fc5e1f4a ("nir/algebraic: expand existing 32-bit patterns to all bit sizes using loops")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9001>
We stopped reporting the alpha test screen cap, and stopped using the
value in the key, so now shrink the key. This gets another switch case
out of the hot uniforms upload path.
Fixes: 1404b8b1e5 ("vc4: do not report alpha-test as supported")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8601>
(cherry picked from commit cc0841c82a)
This should no longer be necessary since the mark_block_wqm() we use to
flag break conditions as WQM now adds block to the worklist. With them
added to the worklist, get_block_needs() will add WQM to block_needs.
Adding WQM to block_needs here without adding the block to the worklist
(like we do here) can cause issues because it does not ensure that the
predecessors' branches are in WQM (needed for it to be possible to
transition to WQM in the block). This happened in an Overwatch shader.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 661922f6ac ("aco: add block to worklist in mark_block_wqm()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4066
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8446>
(cherry picked from commit f0074a6f05)
The base mask previously used was 0xffffffff. This is not correct (but
should still work) for 16-bit and 8-bit values, but it means the high
32-bits of 64-bit values will get chopped off.
Instead of just restricting the pattern to 32-bits (as was done before
00b28a50b2), this extends the optimization in two ways:
1. Make it correct for other bit sizes.
2. Make it work for arbitrary shift counts.
This has the added benefit of reducing the number of patterns actually
added (7 previously, 4 now).
The "Reassociate for improved CSE" part is just reverted to its
pre-00b28a50b2c behavior. I doubt that pattern is likely to have much
impact outside 32-bits.
This change fixes the piglit tests
tests/spec/arb_gpu_shader_int64/fs-shl-of-shr-int64.shader_test and
tests/spec/arb_gpu_shader_int64/fs-iand-of-iadd-int64.shader_test.
All of the shaders helped in shader-db are vertex shaders on platforms
with vector-oriented vertex processing. The shaders contain ((x >> 16)
<< 16). These platforms set lower_extract_word, so the optimization
that transforms (x >> 16) to extract_u16 doesn't trigger. With only ~60
shaders involved, I didn't bother trying to add extract_XYZ versions of
these patterns to try to get those cases.
Fixes: 00b28a50b2 ("nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Haswell and earlier Intel GPUs had simlar results. (Haswell shown)
total instructions in shared programs: 16397554 -> 16397496 (<.01%)
instructions in affected programs: 7961 -> 7903 (-0.73%)
helped: 58
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.36% max: 1.89% x̄: 0.99% x̃: 0.78%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.13% -0.85%
Instructions are helped.
total cycles in shared programs: 1035483770 -> 1035483504 (<.01%)
cycles in affected programs: 75922 -> 75656 (-0.35%)
helped: 44
HURT: 2
helped stats (abs) min: 2 max: 12 x̄: 6.14 x̃: 2
helped stats (rel) min: 0.05% max: 1.67% x̄: 0.87% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06%
95% mean confidence interval for cycles value: -7.28 -4.29
95% mean confidence interval for cycles %-change: -1.03% -0.63%
Cycles are helped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852>
(cherry picked from commit 6b0443a900)
The polygon list is written by tiler jobs and read by fragment ones,
and nothing should re-use the heap until the fragment job is done.
4fec6c9448 ("panfrost: Add the tiler heap to fragment jobs") fixed
this for the !multi-context case by adding the heap BO to fragment job.
But the tiler heap is shared accross contexts, and vertex/tiler+fragment
job submission is done through 2 separate ioctls, meaning that
vertex/tiler and fragment jobs from 2 different context might be
interleaved.
Add a lock at the device level to ensure tiler/vertex+fragment jobs are
submitted sequentially, with no other jobs using the same tiler heap
in-between.
Cc: mesa-stable
Fixes: d8deb1eb6a ("panfrost: Share tiler_heap across batches/contexts")
Reported-by: Icecream95 <ixn@disroot.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8822>
(cherry picked from commit 66125c429f)
Vertex attribute bounds checking is supposed to be done per-attribute:
is_oob = index * stride + attrib_offset + attrib_size > buffer_size
but we were obtaining num_records by dividing the buffer size by the
stride, making it per-vertex:
is_oob = index * stride + (stride - 1) >= buffer_size
An example from Dead Cells (Wine) is:
attribute bindings: 0, 1, 2
attribute formats: r32g32, r32g32, r32g32b32a32
attribute offsets: 0, 0, 0
binding buffers: all the same buffer
binding offsets: 0, 8, 16
binding sizes: 128, 120, 112
binding strides: 32, 32, 32
Workaround this issue without switching to per-attribute descriptors by
rounding up the division. This is still incorrect, but it should now no
longer consider in-bounds attributes out-of-bounds.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3796
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4199
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8835>
(cherry picked from commit 56cd79b63d)
The _ModelProjectMatrix matrix embedded inside has members inside of it
marked as 16-byte aligned, and so the context also has to be 16-byte
aligned or access to those members would be invalid. I believe the
compiler used this to use better 16-byte-aligned load/stores to other
members of the context, breaking when the context's alignment was only 8
(as normal mallocs guarantee).
Fixes: 3175b63a0d ("mesa: don't allocate matrices with malloc")
Tested-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8783>
(cherry picked from commit 55e853d823)
Single-line version of MSVC warning suppression does not extend beyond
the #endif directive. Use push/disable/pop instead.
Also suppress 26452, which is a similar analysis warning.
This could also be fixed with constexpr if, but C++17 would be required.
Fixes: 790516db0b ("gallium/swr: fix gcc warnings")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8093>
(cherry picked from commit 3c7062417b)
Disable compression in iris_flush_resource if the resource lacks a
modifier. When a caller wants to prepare such a resource for sharing
(via eglCreateImage for example), this change enables all reference
holders to access the resource in a common manner - without compression.
This fixes misrendering with 3D-accelerated qemu. A piglit test which
reproduces qemu's behavior, ext_image_dma_buf_import-export-tex, is also
enabled to pass.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2678
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8663>
(cherry picked from commit 40d6b92de9)
Many entries in the dri2_format_table have _DRI_IMAGE_FORMAT_NONE as the
dri_format. Make the result of dri2_get_mapping_by_format a tad more
well-defined by returning NULL when this format is passed into it.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8663>
(cherry picked from commit 0a8cc88202)
In Vulkan, for some variable modes, the generated NIR will have derefs
pointing to resource index intrinsics instead of the variable. This
was letting nir_remove_dead_variables pass remove those variables,
which would lose information relevant for later passes after
spirv2nir.
Add a set to keep track of such variables and prevent them to be
removed when producing the NIR output.
Issue reported by Rhys.
Fixes: c4c9c780b1 ("spirv: Remove more dead variables")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8706>
(cherry picked from commit 10b3eecd36)
We can skip CCS ambiguate if followed by a fast clear within render
pass.
v2: (Jason)
- Check array layer as well since we only fast clear first layer and
first LOD.
- Don't drop fast clear check while doing resolve operation.
Fixes: d5849bc840 "anv: Skip HiZ and CCS ambiguates which preceed fast-clears"
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6988>
(cherry picked from commit 001722b3a3)
According to the BSpec page for MEDIA_VFE_STATE, on Gen12 platforms
"if a fused configuration has fewer threads than the native POR
configuration, the scratch space allocation is based on the number of
threads in the base native POR configuration". However we currently
use the subslice count from devinfo->num_subslices[0], which only
includes the subslices currently enabled by the platform fusing. This
leads to scratch space underallocation and occasional hangs.
The problem is likely to affect most Gen12 GPUs with less than 96 EUs.
GFXBench5 Aztec Ruins is able to reproduce the issue fairly reliably.
Fixes: 9e5ce30da7 "intel: fix the gen 12 compute shader scratch IDs"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8636>
(cherry picked from commit e2c5ef6cd6)
While invalidating the AUX-TT entries, we have to consider the surface
offset as well otherwise, we will end up invalidating another surface's
CCS portion.
For eg. when we have HiZ+CCS and STC_CCS enabled, both will use the CCS
portion allocated at the end of BO. While invalidating the CCS portion
of stencil buffer, we will end up invalidating the CCS portion that
belongs to the depth main surface and vice-versa, if the surface offset
is not considered.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4123
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8677>
(cherry picked from commit dab229ef69)
D and linear are both DISPLAY micro tiling according to ac_surface
but don't work together. This fixes an issue with GFX9+.
This fixes the SkQP WritePixelsNonTexture_Gpu test.
Fixes: 69ea473eeb ("amd/addrlib: update to the latest version")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8665>
(cherry picked from commit 12ce72fcfc)
Conflicts:
src/amd/vulkan/radv_meta_resolve.c
If PAN_MESA_DEBUG=deqp is set to enable testing, then we advertise shader
images to get GLES3.1, even though we don't have any of the shader image
funcs hooked up. This caused breakage when cso started unbinding shader
images at context destruction.
Just stub out the function for now, you'll still segfault when creating an
image.
Cc: mesa-stable (for the next commit)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8530>
(cherry picked from commit f259fcae83)
We could probably support some strides if we tried hard enough but the
whole point of this opcode is to accelerate things with crazy Align16 or
crazy regions. It's ok if we have to emit an extra MOV to get a packed
source.
Fixes: 8b4a5e641b "intel/fs: Add support for subgroup quad operations"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
(cherry picked from commit 58bcb5401d)
Previously, we were returning 2 whenever the source was a Q type. As
far as I can tell, the only reason why this hasn't blown up before is
that it was only ever used for VGRFs until the SWSB pass landed which
uses it for everything. This wasn't a problem because Q types generally
aren't a thing on TGL. However, they are for a small handful of
instructions.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
(cherry picked from commit 4c8cbe9b13)
The intention of IRIS_DIRTY_{RENDER,COMPUTE}_RESOLVES_AND_FLUSHES
is to avoid considering resolves/flushes on back to back draw calls
where nothing of significance has changed with the resources. When
anything changes that could require a resolve, we must flag those.
Those situations are:
1. Texture/image/framebuffer bindings change
(as the set of images we need to look at is now different)
2. Depth writes are enabled/disabled (the resolve code uses this)
3. The aux state for a currently bound resource changes.
We were missing this last case. In particular, one example where
we missed this was:
1. Bind a texture.
2. Clear that texture (likely blits/copies/teximage would work too)
3. Draw and sample from that texture
Clear-then-Bind would work, as binding would flag resolves as dirty.
But Bind-then-Clear doesn't work, as clear can change the aux state
of the bound texture, but wasn't flagging that anything had changed.
Technically, we could consider whether the resource whose aux state
is changing is bound for compute (and only flag COMPUTE_RESOLVES),
or bound for a 3D stage (and only flag RENDER_RESOLVES), and flag
nothing at all if it isn't bound. But we don't track that well,
and it probably isn't worth bothering. So, flag unconditionally
for now.
This does not appear to impact Piglit's drawoverhead scores.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3994
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4019
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8603>
(cherry picked from commit e2500c02cc)
The spec calls to always use sample 0 in this case, whereas we can do
undefined things for invalid sample id's in the MSAA case.
Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.sample_n_*
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8574>
(cherry picked from commit 245a696741)
Because Gallium and Vulkan disagree on what kind of state strides is, we
need to wrangle this state a bit, and up until now, we've been simply
fixing this up while binding the vertex-buffers.
But this isn't robust, because the vertex element state might be bound
after the vertex-buffer state was bound. We also need to take
binding-map into account, which we're currently missing as well.
Instead, w need to deal with this at a place where we know what's being
used for both of these. So let's do this during draw instead.
Ideally, we'd also do some dirty-tracking to know if this is needed or
not, but I believe Mike has some patches in this areas lined up, so it
might be easier to wait for those.
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3661
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4125
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8588>
(cherry picked from commit d74b012260)
When NGG is used, the hw can't know the number of geometry shader
primitives. To fix that, the NGG geometry shader accumulates itself
the number of primitives by using an atomic operation directly to GDS.
Then, begin/query copy the start/stop values from GDS to the
query pool buffer using a PS_DONE event. This was actually wrong
because PS_DONE is completely asynchronous to everything and executed
when the preceding draws finish pixel shaders.
Fix this by using a COPY_DATA packet which is synced with CP. This
fixes random failures on Sienna Cichlid with
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8590>
(cherry picked from commit 085e2ce3d4)
We incorrectly utilize the stencil layouts structures even if we
should stick to the depth_stencil ones if the layout includes stencil.
v2: Don't forget stencil only layout (Nanley)
Simplify callers of new helper functions (Nanley)
v3: Store VK_IMAGE_LAYOUT_UNDEFINED when no stencil is available (Nanley)
Use a switch statement (Nanley)
v4: Consider all layouts but depth only to be potential stencil layouts (Lionel)
v5: Refactor helper in vk_image_layout_depth_only() and discard
VkAttachmentDescriptionStencilLayoutKHR in
VkAttachmentDescription2KHR if format is not depth/stencil.
v5: s/LAYOUT_COLOR_ATTACHMENT_OPTIMAL/LAYOUT_DEPTH_ATTACHMENT_OPTIMAL/ (Nanley)
v6: Fix overly harsh assert()
Fixes: c1c346f166 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4084
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8475>
(cherry picked from commit 28207669d0)
Fixes several dEQP-VK.robustness.robustness2.* tests on GFX8. Generations
other than GFX8 don't fail the tests because bounds-checking is done using
the index (making it per-vertex).
fossil-db (Polaris):
Totals from 1387 (0.99% of 140385) affected shaders:
(no statistics affected)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7834>
(cherry picked from commit 914c61d6c0)
This reverts commit aca67a555c, which
regressed the following Piglit test on i915 (and presumably r200):
piglit/spec/!opengl 1.1/sized-texture-format-channels
Specifically, it begins testing glTexImage2D with format GL_RGBA,
type GL_FLOAT, and internalFormat GL_RGB16F, which leads to the
following error:
Mesa 21.0.0-devel implementation error: unexpected format GL_RGB16F in _mesa_choose_tex_format()
Please report at https://gitlab.freedesktop.org/mesa/mesa/-/issues
sized-texture-format-channels: ../../src/mesa/main/teximage.c:2836: _mesa_choose_texture_format: Assertion `f != MESA_FORMAT_NONE' failed.
i915 and r200 unconditionally support ARB_half_float_pixel, but neither
support RGB16F as an internal format. According to Ian's rationale
in the commit message for 1edca151a0
(which enabled that extension for all drivers):
"This extension only adds data types that can be passed to, for
example, glTexImage2D. It does not add internal formats. Since
you can already pass GL_FLOAT to glTexImage2D this shouldn't pose
any additional issues with those drivers. Note that r200 and i915
already supported this extension, and they don't support
floating-point textures either."
So, commit aca67a55c011 enabled half-float internal formats on hardware
that cannot support them. We should revert the change.
v2: Don't reintroduce the _mesa_is_gles3() condition, as that shouldn't
be necessary (feedback from Erik Faye-Lund).
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8458>
(cherry picked from commit 07473321a2)
_tnl_draw_prims loops over the prims, and for each one, maps the VBOs,
draws, and unmaps them. But it failed to reset nr_bos = 0 between each
loop iteration, which meant that when processing prim[n], the BO list
had all BOs for prior primitives too. Assuming each primitive used the
same VBOs, that means the same VBO would appear in the list multiple
times, and it would try to unmap the same BO multiple times. This
triggered asserts on the second unmap, as it had already been unmapped.
Fixes Piglit's oes_draw_elements_base_vertex-multidrawelements on i915.
Fixes: e99e7aa4c1 ("mesa: switch Draw(Range)Elements(BaseVertex) calls to DrawGallium")
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
(cherry picked from commit 9fb5d7acbb)
Prior to commit e99e7aa4c1 (mesa: switch
Draw(Range)Elements(BaseVertex) calls to DrawGallium), the indices
parameter of glDrawElements (an offset into the VBO) was handled by
setting ib->ptr. With that commit, it instead began binding the
index buffer with offset 0, and adding the offset to draw->start,
which eventually becomes prim->start.
t_draw.c's bind_indices() was trying to convert the relevant section of
the index buffer to GLuints, but was failing to account for start, so
it nabbed the wrong portion of the index buffer.
Fixes: e99e7aa4c1 ("mesa: switch Draw(Range)Elements(BaseVertex) calls to DrawGallium")
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
(cherry picked from commit 376c8f750b)
Commit 4c751ad67a (vbo/dlist: use a shared
index buffer) caused multiple draws to use the same index buffer, and
began setting the primitive's `start` field to the offset needed to
access the right portion of the index buffer.
Unfortunately, t_rebase_prims completely botches handling this case.
Say for example we had start = 40, min_index = 6, max_index = 11.
The actual indexes in the buffer are ib[40..45]. t_rebase_prims,
however, would allocate an index buffer containing only 6 elements,
and populate them with <ib[0..5] - min_index>. For one, this reads
the wrong source data, leading to garbage index values. For another,
it stores the new index buffer in the wrong spot, so drawing will try
and read elements [40..45] of an array of length 6, and crash.
This patch makes t_rebase_prims allocate a larger index buffer, with
the blank space at the beginning, and try to copy the correct section
of index buffer data over. This only works if `start` is the same for
all primitives, however, so if we detect different ones, we recurse
to rebase and call draw() separately for each different start value.
Fixes: 4c751ad67a ("vbo/dlist: use a shared index buffer")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4082
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
(cherry picked from commit bd6120f562)
We only convert line strips to lines in certain cases, but were flagging
node->merged.prim as GL_LINES even if we simply copied a GL_LINE_STRIP
prim[0] over without modifying it.
Fixes Piglit's lineloop test (which triggers loop -> strip conversion
earlier in this path, then was incorrectly triggering strip -> list
mode modification with no changes to the underlying data).
Fixes: 310991415e ("vbo/dlist: implement primitive merging")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
(cherry picked from commit 14ae5069da)
I'm can't see why this is necessary. There are already new fields
(node->merged.{min,max}_index) for the new values in the merged case.
But in vbo_save_draw.c, in the !draw_using_merged_prim case, we would
try and use the original node...with the now destroyed min/max index.
Fixes some assert failures when running with swtnl and forcing the
non-merged path (though it takes the merged path by default).
Fixes: 4c751ad67a ("vbo/dlist: use a shared index buffer")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
(cherry picked from commit 44bdd5225c)
In some rare cases, L2 needs to be flushed if an image is affected
by the pipe misaligned issue. This is roughly based on AMDVLK.
I confirmed that disabling TC-compat HTILE, and respectively DCC,
for the relevant images also fixes the regressions below.
This fixes some regressions introduced with L2 coherency for
dEQP-VK.renderpass2.depth_stencil_resolve.image_2d_* and for
dEQP-VK.renderpass2.suballocation.multisample_resolve.*.
Fixes: 4a783a3c78 ("radv: Use L2 coherency on GFX9+.")
Co-Authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8557>
(cherry picked from commit 4c99d6ff54)
The driver used to invalidate the vector cache for meta operations
but this has been removed and I think it should be restored to fix
a bunch of regressions on GFX8.
This probably needs to be cleaned up but this is a hotfix.
This fixes a bunch of regressions and flakes on GFX8 like
dEQP-VK.pipeline.multisample.sample_locations_ext.draw.color.samples_4.*.
Fixes: 8f8d72af55 ("radv: Use access helpers for flushing with meta operations.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8573>
(cherry picked from commit 8882abe47e)
This limits the exposure of these functions to when the extension is
available. Prevents crashes otherwise, as the rest of the infrastructure
doesn't necessarily expect these functions when the extension is not
available.
Fixes: 40c1f9883e ("mesa,glsl: add support for GL_NV_shader_atomic_int64")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8533>
(cherry picked from commit a0f4affcf6)
The vulkan loader doesn't load layers for apps that require a newer
version of vulkan, so this layer didn't get loaded for vulkan 1.2 apps.
I would like to just stick 1.09 in there but it might be worth
validating it works at new version of vulkan I suppose and the major
doesn't revise that often
Fixes: 9bc5b2d169 ("vulkan: add initial device selection layer. (v6)")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8508>
(cherry picked from commit ca834d0b2d)
So far we only write a maximum of 4 dwords further into the batch and
it seems just going over the CS prefetch was enough.
Turns out writing more dwords can delay the writes and we start
prefetching stuff that hasn't landed in memory yet.
This fixes the issue by stalling the CS to ensure the writes have
landed before we go over the prefetch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 796fccce63 ("intel/mi-builder: add framework for self modifying batches")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8525>
(cherry picked from commit d8154c4006)
The flush VA space was only allocated for command buffers on the
graphics queue. Also, the ZPASS_DONE event should never be emitted
on compute queues because it hangs.
Invalidating the L2 metadata cache is only required for coherency
between the RBs and L2, so only on the graphics queue.
The L2 cache is invalidated at beginning of any IBs and that should
also invalidate the L2 metadata cache for compute anyways.
Fixes: 4a783a3c ("radv: Use L2 coherency on GFX9+.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8494>
(cherry picked from commit c6849f9687)
We sometimes use anv_layout_to_aux_state() to compute the aux state of
an image during the resolve operations at the end of a render
(sub)pass.
If we're dealing with a multisampled image that is created without a
transfer usage, our internal code might trigger a resolve using the
transfer layout (see genX_cmd_buffer.c:cmd_buffer_end_subpass), for
which the image doesn't the usage bit. The current code tries to AND
the 2 usages which won't have any bit in common, thus skipping all
checks below.
v2: Add the transfer usages depending on attachment usage (Lionel)
v3: Limit to samples > 1 (Jason) && DEPTH_STENCIL_ATTACHMENT_BIT (Lionel)
v4: Add transfer usage at image creation (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 54b525caf0 ("anv: Rework anv_layout_to_aux_state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4037
Reviewed-by: Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8307>
(cherry picked from commit d4b4d69d4d)
# Ephemeral packages (installed for this script and removed again at the end)
STABLE_EPHEMERAL="libssl-dev"
apt-get -y install "$STABLE_EPHEMERAL"
. .gitlab-ci/container/build-mold.sh
apt-get purge -y "$STABLE_EPHEMERAL"
. .gitlab-ci/container/cross_build.sh
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.