With vkCmdPipelineBarrier, it's possible to specify a barrier with
pipeline stages but without any memory barriers. These might not be
practical, but are legal Vulkan code.
Barriers like this are currently ignored in mesa, as we only convert
barriers with passed memory barriers into vkCmdPipelineBarrier2.
This commit adds handling of execution only barriers by converting them
into a memory barrier without access masks.
Fixes: 97f0a4494b ("vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34187>
Fix the nginx cache snippets - I'd missed the file nesting somehow.
Tested on a debian:bookworm image with nginx-full installed, checked
that we could pull an arbitrary external site, as well as S3, as well
as GitLab artifacts.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34341>
Fixes some upcoming CTS tests for texture clears.
* some drivers will attempt to issue clears with zero range
and hit asserts/crashes (spec clarification for negative
values)
* fix error thrown with negative values to match spec
* fix cases for clearing generic compressed formats
* fix negative case of using color format while having
depth/stencil internalformat and vice versa
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>
On Valhall we can optimize lower waits, which waits for both readers and
writers, into resource_waits which only wait for writers, allowing
threads accessing read-only resources to execute concurrently.
Let's use that on LD_TILE instructions so we can optmize the read-only
case.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
In order to dynamically load the content of the tile buffer, we need
to know the target (color, depth or stencil) and the conversion to
apply. Let's define the load_input_attachment_{target,conv}_pan
intrinsics so we can dissociate the logic lowering input attachment
loads into load_converted_output_pan, and the part optimizing the shader
when input attachment map is passed at compile time.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
We take the color attachment remapping into account when emitting
blend descriptors, and we make sure we re-emit those when this color
attachment map is dirty.
We also need to take the remapping into account when checking the
render targets written by the fragment shader, hence the addition of
a color_attachment_written_mask() helper.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
Re-order things in panvk_deserialize_shader() to avoid declaring local
variables for stuff we feed the panvk_shader with. The only exception
is pan_shader_info, because we need to know the shader stage to call
vk_shader_zalloc(), which if part of pan_shader_info.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
Some upcoming changes in the runtime will make it impossible to rely
on the pipeline or runtime information to know whether a fragment
shader has input attachments.
Instead we gather that information at compile time and store it in our
shader bind_map.
At runtime we check whether the fragment shader has input attachments
and whether those map to the runtime depth/stencil input attachments
to set the 3DSTATE_PS_EXTRA::PixelShaderKillsPixel.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5a7 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
For drivers using the render pass emulation provided by the
runtime, it's important to express the mapping between
depth/stencil/color attachments and input attachments using
VkRenderingInputAttachmentIndexInfoKHR, otherwise those drivers
have to special-case emulated render passes in their
CmdBeginRendering() implementation.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
Panfrost supports pushing uniforms to hardware uniform registers (RMU/FAU for
Midgard/Bifrost respectively). Since OpenGL uniforms are lowered to UBO #0, it
does this with a pass that pushes UBOs. That's good!
The pass also pushes 'true' OpenGL UBOs, since they look the same in the backend
at this point. This is where the trouble comes in:
- True UBOs are allocated in GPU BOs, not CPU allocated buffers. That means it's
write-combine memory, which we cannot read from efficiently (at least
depending on coherency details that were never plumbed through panfrost.ko and
unlikely to be replumbed now that panthor is the new hot stuff). So, pushing
true UBOs reduces GPU overhead at the cost of tremendous CPU overhead. This is
dubious... When I benchmarked this on MT8192 in early 2023, this pushing
improved FPS in SuperTuxKart but hurt FPS in Dolphin.
- True UBOs can be written on the GPU. In OpenGL, we have batch tracking
infrastructure to sort this mess out in theory. What this means is that
pushing UBOs requires us to flush writers AND STALL at draw-time. If this is
ever hit, our performance is utterly trashed. But it gets worse.
- True UBOs can be written in the same batch that reads them. For example, we
could bind a buffer as a transform feedback buffer, do a draw with XFB, then
rebind as a UBO and do a draw reading. This is where we collapse -- our logic
will flush the writer, which is the same batch we were in the middle of
enqueueing a draw to. When we try to push words, we'll crash with theatrics.
This could be solved by smartening the batch tracking logic but it's not
trivial by any means.
So, pushing true UBOs on the CPU is broken and can hurt performance. Stop doing
it!
Long term, the solution will be to push on the GPU instead. This avoids all of
these issues. This can be done with a compute kernel or with CSF instructions.
The Vulkan driver will likely have to do this for performance, since pushing
UBOs from the CPU is utterly broken in Vulkan for the above reasons.
I have a branch somewhere doing this on v9 but I'm doing this on NIR time to
unblock a core change that was crashing piglit due to this pile of unsoundness.
Let's fix the correctness issues first, then someone can look at recovering
performance later when we're not blocking unrelated work.
Fixes corruption in Piglit test
gles-3.0-transform-feedback-uniform-buffer-object, which writes a UBO with
transform feedback. (I suspect the test still doesn't pass for the same reason
it's broken on other tilers. But that's a better place to be than oodles of
memory corruption.)
According to CI, fixes spec@arb_uniform_buffer_object@rendering{-dsa}-offset.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
ss->info.ubo_mask includes the push+sysval UBO so if there's a user
UBO bound at the same index as the push+sysval UBO, without this
change we end up writing a descriptor for the user UBO at that index.
Fixes: 3b3cd59f ("panfrost: Launch transform feedback shaders")
Cc: mesa-stable
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
For instance, this issue is triggered with "piglit/bin/fcc-blit-between-clears -auto -fbo":
Direct leak of 16400 byte(s) in 5 object(s) allocated from:
#0 0xb720689a in __interceptor_calloc (/usr/lib/libasan.so.6+0xb289a)
#1 0xaf10f896 in draw_create_fragment_shader ../src/gallium/auxiliary/draw/draw_fs.c:47
#2 0xaef64619 in i915_create_fs_state ../src/gallium/drivers/i915/i915_state.c:550
#3 0xae16a955 in ureg_create_shader ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2194
#4 0xae17f45f in ureg_create_shader_with_so_and_destroy ../src/gallium/auxiliary/tgsi/tgsi_ureg.h:150
#5 0xae17f45f in ureg_create_shader_and_destroy ../src/gallium/auxiliary/tgsi/tgsi_ureg.h:159
#6 0xae17f45f in util_make_fs_blit_zs ../src/gallium/auxiliary/util/u_simple_shaders.c:365
#7 0xaf13300e in blitter_get_fs_texfetch_depth ../src/gallium/auxiliary/util/u_blitter.c:1157
#8 0xaf13300e in util_blitter_cache_all_shaders ../src/gallium/auxiliary/util/u_blitter.c:1322
#9 0xaef6b738 in i915_create_context ../src/gallium/drivers/i915/i915_context.c:233
#10 0xacb33c49 in st_api_create_context ../src/mesa/state_tracker/st_manager.c:986
#11 0xac845740 in dri_create_context ../src/gallium/frontends/dri/dri_context.c:178
#12 0xac854d97 in driCreateContextAttribs ../src/gallium/frontends/dri/dri_util.c:631
#13 0xb6ce79a3 in dri2_create_context_attribs ../src/glx/dri2_glx.c:240
#14 0xb6c9606f in dri_common_create_context ../src/glx/dri_common.c:665
#15 0xb6ca4f00 in CreateContext ../src/glx/glxcmds.c:322
#16 0xb6ca5c0b in glXCreateNewContext ../src/glx/glxcmds.c:1449
Fixes: 1a69b50b3b ("i915g: Fix point sprites.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
For instance, this issue is triggered with "piglit/bin/glx-multithread-texture -auto -fbo":
Direct leak of 256 byte(s) in 1 object(s) allocated from:
#0 0xb71eda62 in __interceptor_realloc (/usr/lib/libasan.so.6+0xb2a62)
#1 0xadd5a32f in tokens_expand ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:239
#2 0xadd5a32f in get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:262
#3 0xadd62519 in copy_instructions ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2079
#4 0xadd62519 in ureg_finalize ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2129
#5 0xadd64bde in ureg_get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2206
#6 0xade377d0 in nir_to_tgsi_options ../src/gallium/auxiliary/nir/nir_to_tgsi.c:4043
#7 0xade3da63 in nir_to_tgsi ../src/gallium/auxiliary/nir/nir_to_tgsi.c:3831
#8 0xaeb606c9 in i915_create_vs_state ../src/gallium/drivers/i915/i915_state.c:662
#9 0xac781a2c in st_create_common_variant ../src/mesa/state_tracker/st_program.c:720
#10 0xac78e8a4 in st_get_common_variant ../src/mesa/state_tracker/st_program.c:773
#11 0xac78fc10 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1259
#12 0xac78fc10 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1345
#13 0xac790b1a in st_program_string_notify ../src/mesa/state_tracker/st_program.c:1378
#14 0xace457a9 in _mesa_get_fixed_func_vertex_program ../src/mesa/main/ffvertex_prog.c:1397
#15 0xac5ef8db in update_program ../src/mesa/main/state.c:281
#16 0xac5f0ece in _mesa_update_state_locked ../src/mesa/main/state.c:560
#17 0xac5f1653 in _mesa_update_state ../src/mesa/main/state.c:593
#18 0xacdf9fe2 in _mesa_DrawArrays ../src/mesa/main/draw.c:1403
Fixes: 487a493325 ("i915g: Add support for per-vertex point size.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
For instance, this issue is triggered with "piglit/bin/fcc-blit-between-clears -auto -fbo":
Direct leak of 836 byte(s) in 1 object(s) allocated from:
#0 0xb71eb6f2 in malloc (/usr/lib/libasan.so.6+0xb26f2)
#1 0xaefadc78 in slab_add_new_page ../src/util/slab.c:179
#2 0xaefadc78 in slab_alloc ../src/util/slab.c:221
#3 0xaef7d461 in i915_texture_transfer_map ../src/gallium/drivers/i915/i915_resource_texture.c:789
#4 0xac9e931e in pipe_texture_map ../src/gallium/auxiliary/util/u_inlines.h:555
#5 0xac9e931e in _mesa_map_renderbuffer ../src/mesa/main/renderbuffer.c:494
#6 0xad49c5e4 in readpixels_memcpy ../src/mesa/main/readpix.c:260
#7 0xad49c5e4 in _mesa_readpixels ../src/mesa/main/readpix.c:898
#8 0xad5d8cfe in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:568
#9 0xad4a0caf in read_pixels ../src/mesa/main/readpix.c:1199
#10 0xad4a0caf in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1216
#11 0xad4a155b in _mesa_ReadPixels ../src/mesa/main/readpix.c:1231
or "piglit/bin/fcc-read-to-pbo-after-clear -auto":
Direct leak of 772 byte(s) in 1 object(s) allocated from:
#0 0xb726b6f2 in malloc (/usr/lib/libasan.so.6+0xb26f2)
#1 0xaf0adc88 in slab_add_new_page ../src/util/slab.c:179
#2 0xaf0adc88 in slab_alloc ../src/util/slab.c:221
#3 0xaf07aad7 in i915_buffer_transfer_map ../src/gallium/drivers/i915/i915_resource_buffer.c:75
#4 0xad10de74 in pipe_buffer_map_range ../src/gallium/auxiliary/util/u_inlines.h:398
#5 0xad10de74 in _mesa_bufferobj_map_range ../src/mesa/main/bufferobj.c:499
#6 0xad5677ce in _mesa_map_pbo_dest ../src/mesa/main/pbo.c:308
#7 0xad59be3b in _mesa_readpixels ../src/mesa/main/readpix.c:894
#8 0xad6d8cfe in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:568
#9 0xad5a0caf in read_pixels ../src/mesa/main/readpix.c:1199
#10 0xad5a0caf in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1216
#11 0xad5a155b in _mesa_ReadPixels ../src/mesa/main/readpix.c:1231
Fixes: e7a73b75a0 ("gallium: switch drivers to the slab allocator in src/util")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>