Compare commits

48 Commits

Author SHA1 Message Date
Eric Engestrom
fef2cdb0a5 VERSION: bump for 21.3.6 2022-02-09 20:10:39 +00:00
Eric Engestrom
405ce01d78 docs: add release notes for 21.3.6 2022-02-09 20:10:36 +00:00
Lionel Landwerlin
b680bd0656 intel/nir: fix shader call lowering
We're replacing a generic instruction with an Intel-specific one, so we
need to remove the previous instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c5a42e4010 ("intel/fs: fix shader call lowering pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
(cherry picked from commit 39f6cd5d79)
2022-02-09 20:07:50 +00:00
Lionel Landwerlin
caf19bcf7b intel/fs: don't set allow_sample_mask for CS intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 77486db867 ("intel/fs: Disable sample mask predication for scratch stores")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
(cherry picked from commit c89024e446)
2022-02-09 20:07:50 +00:00
Mike Blumenkrantz
f0281839c3 zink: min/max blit region in coverage functions
these regions might not have the coords in the correct order, which will
cause them to fail intersection tests, resulting in clears that are never
applied

cc: mesa-stable

fixes:
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_all_buffer_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_depth_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_stencil_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_linear_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_magnifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_minifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_missing_buffers_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_nearest_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_dimensions_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_height_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_width_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_scissor_blit

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14867>
(cherry picked from commit 388f23eabe)
2022-02-09 20:07:50 +00:00
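
The idea behind the zink blit-region commit above can be sketched as follows. This is a minimal illustration, not zink's code: it assumes a hypothetical one-dimensional `struct extent` (the driver works on full boxes), and every name in it is invented for the example. It shows why an overlap test fails for a region whose coordinates arrive flipped, and how taking the min and max first repairs it.

```c
#include <stdbool.h>

/* Hypothetical 1D extent, used only for illustration. */
struct extent {
   int start;
   int end;
};

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Return the extent with start <= end, whatever order it arrived in. */
static struct extent
extent_normalize(struct extent e)
{
   struct extent n = { min_int(e.start, e.end), max_int(e.start, e.end) };
   return n;
}

/* Overlap test without normalization, shown only for comparison. */
static bool
extents_intersect_naive(struct extent a, struct extent b)
{
   return a.start < b.end && b.start < a.end;
}

/* Overlap test that tolerates flipped extents by normalizing both first. */
static bool
extents_intersect(struct extent a, struct extent b)
{
   return extents_intersect_naive(extent_normalize(a), extent_normalize(b));
}
```

With a flipped blit extent `{10, 0}` and a pending clear covering `{2, 8}`, the naive test reports no overlap while the normalized test reports one, which matches the "clears that are never applied" symptom described in the commit message.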
Mike Blumenkrantz
ed869d3eb7 zink: reject invalid draws
cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14859>
(cherry picked from commit b656ab75a6)
2022-02-09 20:07:50 +00:00
Mike Blumenkrantz
fa3d049548 zink: fix PIPE_CAP_TGSI_BALLOT export conditional
this requires VK_EXT_shader_subgroup_ballot

cc: mesa-stable

fixes (lavapipe):
KHR-GL46.shader_ballot_tests.ShaderBallotAvailability
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionRead

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14858>
(cherry picked from commit e38c13830f)
2022-02-09 20:07:50 +00:00
Rhys Perry
c185f61e1b radv: fix R_02881C_PA_CL_VS_OUT_CNTL with mixed cull/clip distances
Matches radeonsi.

Seems Vulkan CTS doesn't really test cull distances. Removing
VARYING_SLOT_CULL_DIST0/VARYING_SLOT_CULL_DIST1 variables doesn't break
any of dEQP-VK.clipping.*, except for tests which read the variables in
the fragment shader.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5984
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14882>
(cherry picked from commit 7ddad1b93a)
2022-02-09 20:07:50 +00:00
Dave Airlie
3cbc17121b crocus: find correct relocation target for the bo.
If we have batches a and b, and writing to batch b causes batch a
to flush, all the bo->index values get reset, and we try to submit a -1
to the kernel.

Look the bo index up when creating relocations.

Fixes crash seen in KHR-GL46.compute_shader.pipeline-post-fs
and a trace from Wasteland 3

Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14905>
(cherry picked from commit 37c3be6947)
2022-02-09 20:07:50 +00:00
Mike Blumenkrantz
808eec704d zink: add VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT for query binds
required by spec

cc: mesa-stable

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14853>
(cherry picked from commit 1e96542390)
2022-02-09 20:07:50 +00:00
Mike Blumenkrantz
58382828c9 llvmpipe: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835>
(cherry picked from commit f8a9010410)
2022-02-09 20:07:49 +00:00
Mike Blumenkrantz
444e37340d llvmpipe: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
this cap is broken

cc: mesa-stable

fixes:
GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUnifor

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835>
(cherry picked from commit 9a75392cd8)
2022-02-09 20:07:49 +00:00
Mike Blumenkrantz
2ef1287ef7 zink: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
this cap is broken

cc: mesa-stable

fixes:
GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835>
(cherry picked from commit 9a38dab2d1)
2022-02-09 20:07:49 +00:00
Rhys Perry
b340c47b69 aco: don't encode src2 for v_writelane_b32_e64
Encoding src2 doesn't cause issues for print_asm() because we have a
workaround there, but it does for RGP and it seems the developers are not
interested in fixing it.

https://github.com/GPUOpen-Tools/radeon_gpu_profiler/issues/61

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14832>
(cherry picked from commit 0447a2303f)
2022-02-09 20:07:49 +00:00
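
As a rough illustration of the v_writelane_b32_e64 commit above, the sketch below packs operand registers into 9-bit fields the way the generic loop in the diff does (`physReg() << (i * 9)`) and compares that with packing only the first two operands. The function names and the plain `uint32_t` register values are assumptions made for the example; only the shifts and the `0xF803FFFF` mask come from the diff in this comparison.

```c
#include <stdint.h>

/* Generic packing: operand i lands in bits [9*i, 9*i + 8]. */
static uint32_t
encode_all_operands(const uint32_t *regs, unsigned count)
{
   uint32_t encoding = 0;
   for (unsigned i = 0; i < count; i++)
      encoding |= regs[i] << (i * 9);
   return encoding;
}

/* Packing for the writelane case: src0 and src1 only, src2 left at zero. */
static uint32_t
encode_without_src2(const uint32_t *regs)
{
   uint32_t encoding = 0;
   encoding |= regs[0] << 0;
   encoding |= regs[1] << 9;
   return encoding;
}
```

Bits 18 to 26 are exactly the bits cleared by the `0xF803FFFF` mask that the removed disassembler workaround applied, so skipping src2 at encode time produces the same dword that the workaround used to produce after the fact.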
Kenneth Graunke
6bc710d769 i965: Avoid NULL drawbuffer in brw_flush_front
Commit 17e62a3c23 made _mesa_make_current
begin calling ctx->Driver.Flush() in more cases, including when called
during context destruction, after _mesa_free_context_data has set
ctx->DrawBuffer to NULL.  i965's flush hook wasn't prepared for this,
and assumed that ctx->DrawBuffer was non-NULL.  This led to a crash
with the following backtrace:

 #0 0x00007ffff5bf97b5 in _mesa_is_winsys_fbo (fb=0x0)
    at ../../src/mesa/main/fbobject.h:52
 #1 0x00007ffff5bfa359 in brw_flush_front (ctx=0x5555555a4110)
    at ../../src/mesa/drivers/dri/i965/brw_context.c:242
 #2 0x00007ffff5bfa587 in brw_glFlush (ctx=0x5555555a4110,
    gallium_flush_flags=0) at ../../src/mesa/drivers/dri/i965/brw_context.c:301
 #3 0x00007ffff5d46b2b in _mesa_make_current (newCtx=0x0, drawBuffer=0x0,
    readBuffer=0x0) at ../../src/mesa/main/context.c:1616
 #4 0x00007ffff5d46484 in _mesa_free_context_data (ctx=0x5555555a4110,
    destroy_debug_output=true) at ../../src/mesa/main/context.c:1309
 #5 0x00007ffff5bfcb59 in brw_destroy_context (driContextPriv=0x555555590260)
    at ../../src/mesa/drivers/dri/i965/brw_context.c:1301

There is really no point in worrying about front buffer flushing during
the context's destruction when we've already discarded the drawbuffer,
so just add a NULL check in brw_flush_front and skip that work.

Fixes: 17e62a3c23 ("mesa: (correctly) flush more in _mesa_make_current")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5957
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14828>
2022-02-09 20:07:49 +00:00
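
The shape of the i965 fix described above is an early-out guard. The sketch below is a simplified model, assuming hypothetical `struct context` and `struct framebuffer` types that stand in for the real Mesa structures; it is not the body of brw_flush_front.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real Mesa structures. */
struct framebuffer {
   bool is_winsys;
   bool front_dirty;
};

struct context {
   struct framebuffer *draw_buffer; /* NULL once context data has been freed */
};

/* Returns true if a front-buffer flush was performed. */
static bool
flush_front(struct context *ctx)
{
   /* During context destruction the draw buffer is already gone,
    * so there is nothing to flush and nothing safe to dereference. */
   if (ctx->draw_buffer == NULL)
      return false;

   if (!ctx->draw_buffer->is_winsys || !ctx->draw_buffer->front_dirty)
      return false;

   ctx->draw_buffer->front_dirty = false;
   return true;
}
```

Placing the check first means every later access to the draw buffer in the function is covered, including the winsys test that appears at frame #0 of the backtrace.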
Samuel Pitoiset
fe18c96c86 radv/winsys: fix missing buffer_make_resident() for the null winsys
With latest Fossilize everything should now be captured correctly
but without this, all Fossilize databases that need
VK_EXT_custom_border_color would just crash.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14821>
(cherry picked from commit 1cadd19197)
2022-02-09 20:07:49 +00:00
Danylo Piliaiev
2027acff50 ir3: opt_deref in opt loop to remove unnecessary tex casts
Otherwise we may be left with such casts:

 vec1 32 ssa_72 = deref_var &shadow_map (uniform sampler2D)
 vec1 32 ssa_73 = deref_cast (texture2D *)ssa_72 (uniform texture2D)
 vec1 32 ssa_74 = deref_cast (sampler *)ssa_72 (uniform sampler)
 vec1 32 ssa_76 = (float32)tex ssa_73 (texture_deref), ssa_74 (sampler_deref), ssa_75 (coord), ssa_64 (comparator)

And crash in ycbcr lowering since we aren't able to follow deref chain.

Fixes crash in GFXBench Aztec Ruins Vulkan tests.
See issue: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5945

Cc: mesa-stable

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14819>
(cherry picked from commit f917c73528)
2022-02-09 20:07:49 +00:00
Connor Abbott
77f9a3ada4 ir3/cp: ir3: Prevent propagating shared regs out of loops harder
We need to check the source of the copy, not the destination. That means
we need to move this check inside the ifs, where we know that the
source is a copy.

Fixes: 590efd180b ("Prevent propagating shared regs out of loops")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14412>
(cherry picked from commit 09731fc79a)
2022-02-09 20:07:49 +00:00
Connor Abbott
5be8602e4d ir3: Fix copy-paste mistakes in ir3_block_remove_physical_predecessor()
Fixes: 2768a35e41 ("ir3: Add pass to remove unreachable blocks")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14412>
(cherry picked from commit 53e54898e0)
2022-02-09 20:07:49 +00:00
Mike Blumenkrantz
e5d09f30b9 zink: use SpvScopeDevice over SpvScopeWorkgroup for atomic shader ops
Workgroup is only allowed in compute shaders, and Device should be more
in line with the intended use here

the alternative would be to keep using Workgroup for compute and use Device
otherwise, but this would effectively make atomic ops non-atomic, which seems
like it isn't desirable

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14690>
(cherry picked from commit 6f38ea4ac7)
2022-02-09 20:07:49 +00:00
Mike Blumenkrantz
39e4072ef4 zink: cast image atomic op params/results based on image type
according to spec, these must match the texel pointer type

cc: mesa-stable

fixes (nvidia):
dEQP-GLES31.functional.image_load_store.2d.atomic.exchange_r32f_return_value
dEQP-GLES31.functional.image_load_store.2d_array.atomic.exchange_r32f_return_value
dEQP-GLES31.functional.image_load_store.cube.atomic.exchange_r32f_return_value

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14690>
(cherry picked from commit 2361c52b5e)
2022-02-09 20:07:49 +00:00
Jason Ekstrand
675c2dabd3 vulkan/wsi: Set MUTABLE_FORMAT_BIT in the prime path
Fixes: 4bdf8547f4 "vulkan/wsi: Implement VK_KHR_swapchain_mutable_format"

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12031>
(cherry picked from commit 8299d5f37f)
2022-02-09 20:07:49 +00:00
Georg Lehmann
d594032257 vulkan/wsi/wayland: Fix add_drm_format_modifier alpha/opaqueness.
This had the opposite problem of the shm path. R8G8B8A8 was always supported if
either DRM_FORMAT_XBGR8888 or DRM_FORMAT_ABGR8888 was supported, but we need
both.

Fixes: d944136f36 ("vulkan/wsi/wayland: don't expose surface formats not fully supported")

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14588>
(cherry picked from commit cbe4943ae9)
2022-02-09 20:07:49 +00:00
Georg Lehmann
b366fdbe53 vulkan/wsi/wayland: Add modifiers for RGB formats.
These formats get overwritten after the FALLTHROUGH, so no modifiers got added
to them at all.

Fixes: 151b65b211 ("vulkan/wsi/wayland: generalize modifier handling")

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14588>
(cherry picked from commit 9843fddfff)
2022-02-09 20:07:49 +00:00
Georg Lehmann
09499c4202 vulkan/wsi/wayland: Convert missing vulkan formats to shm formats.
Fixes: 6b36f35734 ("vulkan/wsi/wl: add wl_shm support for lavapipe.")

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14588>
(cherry picked from commit a881b6ac1f)
2022-02-09 20:07:48 +00:00
Georg Lehmann
6f916ec0c0 vulkan/wsi/wayland: Fix add_wl_shm_format alpha/opaqueness.
We need both the SHM format with alpha and the opaque format to fully support
a vulkan format with alpha. Previously no surface format was reported because
the vulkan formats with alpha were never added as opaque.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5879
Fixes: d944136f36 ("vulkan/wsi/wayland: don't expose surface formats not fully supported")

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14588>
(cherry picked from commit 4ae4e04e18)
2022-02-09 20:07:48 +00:00
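
The "need both" rule from the two alpha/opaqueness commits above can be modelled with a pair of flags per Vulkan format. This is only a sketch: the enum, struct and function names are hypothetical and do not correspond to the WSI code, which tracks more state than this.

```c
#include <stdbool.h>

/* Hypothetical flags recording which compositor formats have been seen. */
enum {
   FORMAT_HAS_ALPHA  = 1 << 0, /* e.g. an ABGR-style format was advertised */
   FORMAT_HAS_OPAQUE = 1 << 1  /* e.g. the matching XBGR-style format was advertised */
};

struct surface_format {
   unsigned seen;
};

/* Record one advertised compositor format for this Vulkan format. */
static void
format_add(struct surface_format *f, unsigned flag)
{
   f->seen |= flag;
}

/* A Vulkan format with alpha is usable only if both variants were seen. */
static bool
format_fully_supported(const struct surface_format *f)
{
   const unsigned both = FORMAT_HAS_ALPHA | FORMAT_HAS_OPAQUE;
   return (f->seen & both) == both;
}
```

Under this model the shm bug corresponds to never setting the opaque flag (so nothing was reported), and the modifier bug corresponds to testing with "either" instead of "both".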
Rhys Perry
154ce77176 aco: fix neg(abs(mul(a, b))) if the mul is not VOP3
Previously, is_abs was just ignored if mul_instr->isVOP3()==false.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773>
(cherry picked from commit 452975f257)
2022-02-09 20:07:48 +00:00
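
The modifier logic of the aco fix above can be checked with a small numeric model. The sketch assumes a simplified view in which each operand has an `abs` flag applied before a `neg` flag; `struct mods`, `apply_mod` and `fold_neg` are names invented for this example and are not aco's types. `fold_neg` mirrors the post-fix ordering from the optimizer diff: the abs override is applied whether or not the original multiply carried modifiers.

```c
#include <stdbool.h>

/* Per-operand input modifiers of a two-operand multiply (simplified). */
struct mods {
   bool neg[2];
   bool abs[2];
};

/* Apply abs first, then neg, to one operand. */
static float
apply_mod(float v, bool neg, bool take_abs)
{
   if (take_abs && v < 0.0f)
      v = -v;
   return neg ? -v : v;
}

static float
mul_with_mods(float a, float b, struct mods m)
{
   return apply_mod(a, m.neg[0], m.abs[0]) * apply_mod(b, m.neg[1], m.abs[1]);
}

/* Fold neg(mul) or neg(abs(mul)) into the multiply's own modifiers. */
static struct mods
fold_neg(struct mods mul, bool is_abs)
{
   struct mods out = mul;
   if (is_abs) {
      /* |a * b| == |a| * |b|, so any earlier negation becomes irrelevant. */
      out.neg[0] = out.neg[1] = false;
      out.abs[0] = out.abs[1] = true;
   }
   out.neg[0] ^= true; /* the outer neg flips the sign of one operand */
   return out;
}
```

For a multiply with no modifiers and inputs -3 and 2, folding with `is_abs` set yields -6, the value of neg(abs(-3 * 2)); without the unconditional override the abs would be lost, which is the case the commit message describes for a multiply that is not VOP3.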
Mike Blumenkrantz
c975c588b1 zink: fix waiting on current batch id
- the current batch id is always 0
- there is always a current batch
- a batch id can only be set at the time of submit

thus when passing 0 to wait on the current batch, the submit must complete
so that there is a batch id, and this must occur before the timeline wait
path or else the timeline wait does nothing

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14693>
(cherry picked from commit 3a0888c62f)
2022-02-09 20:07:48 +00:00
Mike Blumenkrantz
494e809469 zink: add vertex shader pipeline bit for generated barrier construction
if the vertex buffer resource has writes, it needs this bit too

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14691>
(cherry picked from commit 95bfb75688)
2022-02-09 20:07:48 +00:00
Mike Blumenkrantz
e2085aeaff zink: clamp tbo creation to maxTexelBufferElements
for sparse buffers, the total buffer size will be huge, so the size needs
to be clamped to the limit that the driver can support to avoid
crashing

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14692>
(cherry picked from commit 27d405dc2f)
2022-02-09 20:07:48 +00:00
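
A minimal sketch of the clamp described in the commit above, assuming a hypothetical helper that works in texels; the actual zink change may clamp a byte range instead, and none of these names come from the driver. The limit parameter stands for the device's maxTexelBufferElements value.

```c
#include <stdint.h>

/* Number of texels to expose for a buffer view, limited to what the
 * device reports as its maximum texel buffer element count. */
static uint32_t
clamp_texel_count(uint64_t buffer_size, uint32_t texel_size,
                  uint32_t max_texel_buffer_elements)
{
   uint64_t texels = buffer_size / texel_size;
   if (texels > max_texel_buffer_elements)
      return max_texel_buffer_elements;
   return (uint32_t)texels;
}
```

The division is done in 64 bits because, as the commit notes, a sparse buffer can be far larger than anything a 32-bit element count can describe.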
Paulo Zanoni
9b50b320d1 iris: implement inter-context busy-tracking
Previously, no buffers were ever marked as EXEC_OBJECT_ASYNC so the
Kernel would ensure dependency tracking for us. After we implemented
explicit busy tracking in commit 89a34cb845, only the external
objects kept relying on the Kernel's implicit tracking and Iris did
inter-batch busy tracking, meaning we lost inter-screen and
inter-context synchronization. This seemed fine to me since, as far as
I understood, it is the duty of the application to synchronize itself
against multiple screens and contexts.

The problem here is that applications were actually relying on the old
behavior where the Kernel guarantees synchronization, so 89a34cb845
can be seen as a regression. This commit addresses the inter-context
synchronization case.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5731
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5812
Fixes: 89a34cb845 ("iris: switch to explicit busy tracking")
Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14783>
2022-02-09 20:07:48 +00:00
Mike Blumenkrantz
adab2e0b7a zink: fix vertex buffer mask computation for null buffers
off by N

affects:
KHR-GL46.texture_cube_map_array.sampling

Fixes: 53aade0ef0 ("zink: fix enabled vertex buffer mask calculation")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14721>
(cherry picked from commit 42ae116ac7)
2022-02-09 20:07:48 +00:00
Caio Oliveira
e3956c3d5c anv: Fix subgroupSupportedStages physical property
Use the proper Vulkan values that can be combined into a bitmask.

Fixes: f40a08d25c ("anv: Don't advertise unsupported shader stages")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14761>
(cherry picked from commit d6c31f05a2)
2022-02-09 20:07:48 +00:00
Emma Anholt
d08330e731 vulkan: Fix leak of error messages
Fixes: 0cad3beb2a ("vulkan/log: Add common vk_error and vk_errorf helpers")
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14235>
(cherry picked from commit bdb8e615d1)
2022-02-09 20:07:48 +00:00
Manas Chaudhary
35444289c5 panvk: Fix pointer corruption in panvk_add_wait_event_syncobjs
nr_in_fences was being incremented to point to an
illegal address

Fixes: 1e23004600 ("panvk: Add vkEvents support")
Cc: mesa-stable
Signed-off-by: Manas Chaudhary <manas.chaudhary@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14744>
(cherry picked from commit cad053db61)
2022-02-09 20:07:48 +00:00
Nanley Chery
9c0388b7ff anv: Re-enable CCS_E on TGL+
Commit e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
accidentally disabled CCS_E on TGL+ because it checked for
image->vk.mip_levels > 0 instead of image->vk.mip_levels > 1.

Instead of reverting it, we remove the code which disables CCS_E for
mipmapped or arrayed images now that we've sufficiently handled the
clear color issue in other ways.

Fixes: e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14723>
(cherry picked from commit 57445adc89)
2022-02-09 20:07:48 +00:00
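
The off-by-one in the check quoted above is easy to see once the two predicates are written side by side. The function names below are hypothetical; only the comparisons `> 0` and `> 1` come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* The accidental check: true for every valid image, because Vulkan
 * requires an image to have at least one mip level. */
static bool
is_multi_lod_accidental(uint32_t mip_levels)
{
   return mip_levels > 0;
}

/* The check that was intended: true only for mipmapped images. */
static bool
is_multi_lod_intended(uint32_t mip_levels)
{
   return mip_levels > 1;
}
```

For a single-level image the accidental predicate is already true, which is why the commit describes CCS_E as having been disabled for every image on TGL+ rather than only for multi-LOD ones.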
Nanley Chery
cbb35dbbcc anv: Use ANV_FAST_CLEAR_DEFAULT_VALUE for CCS on TGL+
On TGL, if a block of fragment shader outputs match the surface's clear
color, the HW may convert them to fast-clears (see HSD 14010672564).
This can lead to rendering corruptions if not handled properly. We
restrict the clear color to zero to avoid issues that can occur with:

   - Texture view rendering (including blorp_copy calls)
   - Images with multiple levels or array layers

Fixes: e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14723>
(cherry picked from commit c48401404c)
2022-02-09 20:07:47 +00:00
Nanley Chery
d55fed66d9 anv: Disable CCS_E for some 8/16bpp copies on TGL+
CCS_E is currently disabled on TGL+, but we'll enable it soon. We choose
to explicitly disable it for certain copy operations to avoid CTS
failures in the following groups:

- dEQP-VK.drm_format_modifiers.export_import.*
- dEQP-VK.synchronization*

Fixes: e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14723>
(cherry picked from commit d68b2db89c)
2022-02-09 20:07:47 +00:00
Yiwei Zhang
a987828e6b tu: VkExternalImageFormatProperties is optional
..even if external image info has valid external handles.

Fixes: 26380b3a9f ("turnip: Add driver skeleton (v2)")

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14730>
(cherry picked from commit 96acd0933e)
2022-02-09 20:07:47 +00:00
Mike Blumenkrantz
343f9f52a1 zink: reorder fbfetch flag-setting to avoid null deref
this avoids dereferencing pg->dd which is allocated a few lines later

Fixes: 417477f60e ("zink: always use lazy (non-push) updating for fbfetch descriptors")

fixes (radv):
dEQP-GLES31.functional.blend_equation_advanced.basic.multiply

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14718>
(cherry picked from commit 8747715aec)
2022-02-09 20:07:47 +00:00
Jason Ekstrand
5a106c47d7 anv/pass: Don't set first_subpass_layout for stencil-only attachments
Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13980>
(cherry picked from commit 6612dcc425)
2022-02-09 20:07:47 +00:00
Mike Blumenkrantz
0051b41362 zink: never use SpvOpImageQuerySizeLod for texel buffers
this is illegal

cc: mesa-stable

affects KHR-GL46.texture_buffer.texture_buffer_texture_buffer_range

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14696>
(cherry picked from commit 5e748770b9)
2022-02-09 20:07:47 +00:00
Francisco Jerez
9f974e41b1 intel/fs: Take into account region strides during SIMD lowering decision of SHUFFLE.
This fixes a bug in the handcrafted SIMD lowering done by the SHUFFLE
code generation, which wasn't taking into account the source and
destination region strides while deciding whether it needs to split an
instruction.

v2: Use new element_sz() helper instead of left shift. (Lionel)

Fixes: 90c9f29518 ("i965/fs: Add support for nir_intrinsic_shuffle")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14273>
(cherry picked from commit d1038197f3)
2022-02-09 20:07:47 +00:00
Eric Engestrom
2e255372a6 .pick_status.json: Mark 960e72417f as denominated 2022-02-09 20:07:47 +00:00
Eric Engestrom
f040bf876b .pick_status.json: Mark 15e7750446 as denominated 2022-02-09 20:07:47 +00:00
Eric Engestrom
bf8ee1a24f .pick_status.json: Update to cb781fc350 2022-02-09 20:07:37 +00:00
Charles Baker
7583560305 Revert "zink: handle vertex buffer offset overflows"
This reverts commit 9823b970fb.

From VkPhysicalDeviceLimits [1]:

> maxVertexInputAttributeOffset is the maximum vertex input attribute
> offset that can be added to the vertex input binding stride. The offset
> member of the VkVertexInputAttributeDescription structure must be
> less than or equal to this limit.

The maxVertexInputAttributeOffset is a limit on the offset of a vertex
attribute within a vertex rather than a limit on offsets for vertex
buffer bindings.  The code to bind temporary buffers can be removed.

[1] https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VkPhysicalDeviceLimits.html

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14572>
(cherry picked from commit 1b88777e97)
2022-01-26 22:12:51 +00:00
Bas Nieuwenhuizen
cb21190b24 Revert "nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants"
This reverts commit a1af902531.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5423
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14532>
(cherry picked from commit d1530a3f3b)
2022-01-26 22:12:51 +00:00
42 changed files with 5893 additions and 186 deletions

@@ -1 +1 @@
-21.3.5
+21.3.6

149
docs/relnotes/21.3.6.rst Normal file

@@ -0,0 +1,149 @@
Mesa 21.3.6 Release Notes / 2022-02-09
======================================
Mesa 21.3.6 is a bug fix release which fixes bugs found since the 21.3.5 release.
Mesa 21.3.6 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 21.3.6 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA256 checksum
---------------
::
TBD.
New features
------------
- None
Bug fixes
---------
- radv: CullDistance fail
- i965: Segmentation fault during glinfo context destruction, regression in 21.3.x
- Vulkan Wayland WSI returns empty surface formats
- [REGRESSION][BISECTED] iris: Qutebrowser/QtWebEngine sporadically flashes the window in white
- Flickering Intel Uhd 620 Graphics
- Broken Terraria & Glitches in Forza Horizon 4
Changes
-------
Bas Nieuwenhuizen (1):
- Revert "nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants"
Caio Oliveira (1):
- anv: Fix subgroupSupportedStages physical property
Charles Baker (1):
- Revert "zink: handle vertex buffer offset overflows"
Connor Abbott (2):
- ir3: Fix copy-paste mistakes in ir3_block_remove_physical_predecessor()
- ir3/cp: ir3: Prevent propagating shared regs out of loops harder
Danylo Piliaiev (1):
- ir3: opt_deref in opt loop to remove unnecessary tex casts
Dave Airlie (1):
- crocus: find correct relocation target for the bo.
Emma Anholt (1):
- vulkan: Fix leak of error messages
Eric Engestrom (3):
- .pick_status.json: Update to cb781fc350108584116280fc597c695d2f476c68
- .pick_status.json: Mark 15e77504461a30038a054c87cc53a694171c9cf4 as denominated
- .pick_status.json: Mark 960e72417f3e8885699cf384f690853e14ba44da as denominated
Francisco Jerez (1):
- intel/fs: Take into account region strides during SIMD lowering decision of SHUFFLE.
Georg Lehmann (4):
- vulkan/wsi/wayland: Fix add_wl_shm_format alpha/opaqueness.
- vulkan/wsi/wayland: Convert missing vulkan formats to shm formats.
- vulkan/wsi/wayland: Add modifiers for RGB formats.
- vulkan/wsi/wayland: Fix add_drm_format_modifier alpha/opaqueness.
Jason Ekstrand (2):
- anv/pass: Don't set first_subpass_layout for stencil-only attachments
- vulkan/wsi: Set MUTABLE_FORMAT_BIT in the prime path
Kenneth Graunke (1):
- i965: Avoid NULL drawbuffer in brw_flush_front
Lionel Landwerlin (2):
- intel/fs: don't set allow_sample_mask for CS intrinsics
- intel/nir: fix shader call lowering
Manas Chaudhary (1):
- panvk: Fix pointer corruption in panvk_add_wait_event_syncobjs
Mike Blumenkrantz (15):
- zink: never use SpvOpImageQuerySizeLod for texel buffers
- zink: reorder fbfetch flag-setting to avoid null deref
- zink: fix vertex buffer mask computation for null buffers
- zink: clamp tbo creation to maxTexelBufferElements
- zink: add vertex shader pipeline bit for generated barrier construction
- zink: fix waiting on current batch id
- zink: cast image atomic op params/results based on image type
- zink: use SpvScopeDevice over SpvScopeWorkgroup for atomic shader ops
- zink: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
- llvmpipe: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
- llvmpipe: ci updates
- zink: add VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT for query binds
- zink: fix PIPE_CAP_TGSI_BALLOT export conditional
- zink: reject invalid draws
- zink: min/max blit region in coverage functions
Nanley Chery (3):
- anv: Disable CCS_E for some 8/16bpp copies on TGL+
- anv: Use ANV_FAST_CLEAR_DEFAULT_VALUE for CCS on TGL+
- anv: Re-enable CCS_E on TGL+
Paulo Zanoni (1):
- iris: implement inter-context busy-tracking
Rhys Perry (3):
- aco: fix neg(abs(mul(a, b))) if the mul is not VOP3
- aco: don't encode src2 for v_writelane_b32_e64
- radv: fix R_02881C_PA_CL_VS_OUT_CNTL with mixed cull/clip distances
Samuel Pitoiset (1):
- radv/winsys: fix missing buffer_make_resident() for the null winsys
Yiwei Zhang (1):
- tu: VkExternalImageFormatProperties is optional

@@ -625,6 +625,10 @@ emit_instruction(asm_context& ctx, std::vector<uint32_t>& out, Instruction* inst
 encoding = 0;
 if (instr->opcode == aco_opcode::v_interp_mov_f32) {
 encoding = 0x3 & instr->operands[0].constantValue();
+} else if (instr->opcode == aco_opcode::v_writelane_b32_e64) {
+encoding |= instr->operands[0].physReg() << 0;
+encoding |= instr->operands[1].physReg() << 9;
+/* Encoding src2 works fine with hardware but breaks some disassemblers. */
 } else {
 for (unsigned i = 0; i < instr->operands.size(); i++)
 encoding |= instr->operands[i].physReg() << (i * 9);

@@ -3326,12 +3326,16 @@ combine_instruction(opt_ctx& ctx, aco_ptr<Instruction>& instr)
 VOP3_instruction& new_mul = instr->vop3();
 if (mul_instr->isVOP3()) {
 VOP3_instruction& mul = mul_instr->vop3();
-new_mul.neg[0] = mul.neg[0] && !is_abs;
-new_mul.neg[1] = mul.neg[1] && !is_abs;
-new_mul.abs[0] = mul.abs[0] || is_abs;
-new_mul.abs[1] = mul.abs[1] || is_abs;
+new_mul.neg[0] = mul.neg[0];
+new_mul.neg[1] = mul.neg[1];
+new_mul.abs[0] = mul.abs[0];
+new_mul.abs[1] = mul.abs[1];
 new_mul.omod = mul.omod;
 }
+if (is_abs) {
+new_mul.neg[0] = new_mul.neg[1] = false;
+new_mul.abs[0] = new_mul.abs[1] = true;
+}
 new_mul.neg[0] ^= true;
 new_mul.clamp = false;

@@ -152,12 +152,6 @@ std::pair<bool, size_t>
 disasm_instr(chip_class chip, LLVMDisasmContextRef disasm, uint32_t* binary, unsigned exec_size,
 size_t pos, char* outline, unsigned outline_size)
 {
-/* mask out src2 on v_writelane_b32 */
-if (((chip == GFX8 || chip == GFX9) && (binary[pos] & 0xffff8000) == 0xd28a0000) ||
-(chip >= GFX10 && (binary[pos] & 0xffff8000) == 0xd7610000)) {
-binary[pos + 1] = binary[pos + 1] & 0xF803FFFF;
-}
 size_t l =
 LLVMDisasmInstruction(disasm, (uint8_t*)&binary[pos], (exec_size - pos) * sizeof(uint32_t),
 pos * 4, outline, outline_size);

@@ -4448,7 +4448,7 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf *ctx_cs, struct radeon_cmdbuf
 S_02881C_VS_OUT_MISC_SIDE_BUS_ENA(misc_vec_ena) |
 S_02881C_VS_OUT_CCDIST0_VEC_ENA((total_mask & 0x0f) != 0) |
 S_02881C_VS_OUT_CCDIST1_VEC_ENA((total_mask & 0xf0) != 0) |
-cull_dist_mask << 8 | clip_dist_mask);
+total_mask << 8 | clip_dist_mask);
 if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
 radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF, outinfo->writes_viewport_index);
@@ -4568,7 +4568,7 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf *ctx_cs, struct radeon_cmdbuf
 S_02881C_VS_OUT_MISC_SIDE_BUS_ENA(misc_vec_ena) |
 S_02881C_VS_OUT_CCDIST0_VEC_ENA((total_mask & 0x0f) != 0) |
 S_02881C_VS_OUT_CCDIST1_VEC_ENA((total_mask & 0xf0) != 0) |
-cull_dist_mask << 8 | clip_dist_mask);
+total_mask << 8 | clip_dist_mask);
 radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN,
 S_028A84_PRIMITIVEID_EN(es_enable_prim_id) |

@@ -65,6 +65,13 @@ radv_null_winsys_bo_unmap(struct radeon_winsys_bo *_bo)
{
}
static VkResult
radv_null_winsys_bo_make_resident(struct radeon_winsys *_ws, struct radeon_winsys_bo *_bo,
bool resident)
{
return VK_SUCCESS;
}
static void
radv_null_winsys_bo_destroy(struct radeon_winsys *_ws, struct radeon_winsys_bo *_bo)
{
@@ -80,4 +87,5 @@ radv_null_bo_init_functions(struct radv_null_winsys *ws)
ws->base.buffer_destroy = radv_null_winsys_bo_destroy;
ws->base.buffer_map = radv_null_winsys_bo_map;
ws->base.buffer_unmap = radv_null_winsys_bo_unmap;
ws->base.buffer_make_resident = radv_null_winsys_bo_make_resident;
}

@@ -445,8 +445,6 @@ optimizations.extend([
# (a + #b) * #c => (a * #c) + (#b * #c)
(('imul', ('iadd(is_used_once)', a, '#b'), '#c'), ('iadd', ('imul', a, c), ('imul', b, c))),
(('~fmul', ('fadd(is_used_once)', a, '#b'), '#c'), ('fadd', ('fmul', a, c), ('fmul', b, c)),
'!options->avoid_ternary_with_two_constants'),
# ((a + #b) + c) * #d => ((a + c) * #d) + (#b * #d)
(('imul', ('iadd(is_used_once)', ('iadd(is_used_once)', a, '#b'), c), '#d'),

@@ -45,9 +45,9 @@ traces:
# checksum: 4b707f385256b380c936186db8c251cb
# 1 minute
- device: freedreno-a530
checksum: 130dbeac42683b46fed4b268c5aad984
checksum: a71d62bb2c0fabeca41468628777b441
- device: freedreno-a630
checksum: 139861e52f9425b4adb7c0b90b885f91
checksum: 339dce29ae08569652438116829510c7
- path: xonotic/xonotic-keybench-high.trace
expectations:
# Skipped since it's long on a530.
@@ -327,9 +327,9 @@ traces:
#- device: freedreno-a306
# checksum: 0c57ccc3989b75a940b28ea1cc09cb0d
- device: freedreno-a530
checksum: 4715d72a7958f2fd5a387c16b3a01579
checksum: bc19f0f58935fdb348f401396e6845e1
- device: freedreno-a630
checksum: 1e397c5c34c9c50350a8db1a060a6bbb
checksum: f546f840e916ab0f11f8df0e4eee584d
- path: glmark2/shading:shading=blinn-phong-inf.trace
expectations:
- device: freedreno-a306
@@ -422,7 +422,7 @@ traces:
- path: gputest/gimark.trace
expectations:
- device: freedreno-a630
checksum: dd8fb768033d09f6edc98b4cfff02c6f
checksum: e58167bd8eeb8952facbc00ff0449135
- path: gputest/pixmark-julia-fp32.trace
expectations:
- device: freedreno-a630
@@ -452,11 +452,11 @@ traces:
- path: gputest/plot3d.trace
expectations:
- device: freedreno-a306
checksum: 302943895dbdd7730958fb0175f23b7f
checksum: f6ecd9b8afc692b0cdb459b9b30db8d4
- device: freedreno-a530
checksum: 755aa5b521237ddf9fea3181d2ba2b75
checksum: 4faafe5fab0d8ec6d7b549c94f663c92
- device: freedreno-a630
checksum: 302aec1ced68e22182460b617b0f2aef
checksum: 0a6a16c394a413f02ec2ebcc3251e366
# Note: Requires GL4 for tess.
- path: gputest/tessmark.trace
expectations:
@@ -473,9 +473,9 @@ traces:
- path: humus/AmbientAperture.trace
expectations:
- device: freedreno-a306
checksum: 3d9243cbd0659cb58b16cade2be3f2c2
checksum: 8d4c52f0af9c09710d358f24c73fae3c
- device: freedreno-a530
checksum: c55c1ba5683306980956b5f89563f343
checksum: aab5c853e383e1cda56663d65f6925ad
- device: freedreno-a630
checksum: 83fd7bce0fc1e1f30bd143b7d30ca890
- path: humus/CelShading.trace
@@ -536,7 +536,7 @@ traces:
expectations:
# a306/a630 would need higher GL version to run
- device: freedreno-a630
checksum: e93cf9682c9ca5ed6a6effe5b7fdd386
checksum: 0e32ca8fc815a7250f38a07faeafb21b
- path: pathfinder/canvas_text_v2.trace
expectations:
# a306/a630 would need higher GL version to run

@@ -402,9 +402,9 @@ ir3_block_remove_physical_predecessor(struct ir3_block *block, struct ir3_block
{
for (unsigned i = 0; i < block->physical_predecessors_count; i++) {
if (block->physical_predecessors[i] == pred) {
if (i < block->predecessors_count - 1) {
if (i < block->physical_predecessors_count - 1) {
block->physical_predecessors[i] =
block->physical_predecessors[block->predecessors_count - 1];
block->physical_predecessors[block->physical_predecessors_count - 1];
}
block->physical_predecessors_count--;
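
Note on the hunk above: the fix makes the bound check and the swap source use the count of the array actually being edited. A minimal sketch of the same swap-with-last removal on a plain array follows; `remove_unordered` is a hypothetical helper, not an ir3 function.

```c
#include <assert.h>
#include <stdbool.h>

/* Remove the first occurrence of `value` from an unordered array by moving
 * the last element into its slot. Both the bound check and the swap source
 * must use the count of this same array. */
static bool
remove_unordered(int *items, unsigned *count, int value)
{
   assert(items && count);

   for (unsigned i = 0; i < *count; i++) {
      if (items[i] == value) {
         if (i < *count - 1)
            items[i] = items[*count - 1]; /* fill the hole with the last item */
         (*count)--;
         return true;
      }
   }
   return false;
}
```

Using a different array's count here, as the old code did with `predecessors_count`, can read a stale or out-of-range slot whenever the two counts differ.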

@@ -303,6 +303,22 @@ try_swap_mad_two_srcs(struct ir3_instruction *instr, unsigned new_flags)
return valid_swap;
}
/* Values that are uniform inside a loop can become divergent outside
* it if the loop has a divergent trip count. This means that we can't
* propagate a copy of a shared to non-shared register if it would
* make the shared reg's live range extend outside of its loop. Users
* outside the loop would see the value for the thread(s) that last
* exited the loop, rather than for their own thread.
*/
static bool
is_valid_shared_copy(struct ir3_instruction *dst_instr,
struct ir3_instruction *src_instr,
struct ir3_register *src_reg)
{
return !(src_reg->flags & IR3_REG_SHARED) ||
dst_instr->block->loop_id == src_instr->block->loop_id;
}
/**
* Handle cp for a given src register. This additionally handles
* the cases of collapsing immediate/const (which replace the src
@@ -316,22 +332,14 @@ reg_cp(struct ir3_cp_ctx *ctx, struct ir3_instruction *instr,
{
struct ir3_instruction *src = ssa(reg);
/* Values that are uniform inside a loop can become divergent outside
* it if the loop has a divergent trip count. This means that we can't
* propagate a copy of a shared to non-shared register if it would
* make the shared reg's live range extend outside of its loop. Users
* outside the loop would see the value for the thread(s) that last
* exited the loop, rather than for their own thread.
*/
if ((src->dsts[0]->flags & IR3_REG_SHARED) &&
src->block->loop_id != instr->block->loop_id)
return false;
if (is_eligible_mov(src, instr, true)) {
/* simple case, no immed/const/relativ, only mov's w/ ssa src: */
struct ir3_register *src_reg = src->srcs[0];
unsigned new_flags = reg->flags;
if (!is_valid_shared_copy(instr, src, src_reg))
return false;
combine_flags(&new_flags, src);
if (ir3_valid_flags(instr, n, new_flags)) {
@@ -357,6 +365,9 @@ reg_cp(struct ir3_cp_ctx *ctx, struct ir3_instruction *instr,
struct ir3_register *src_reg = src->srcs[0];
unsigned new_flags = reg->flags;
if (!is_valid_shared_copy(instr, src, src_reg))
return false;
if (src_reg->flags & IR3_REG_ARRAY)
return false;

@@ -210,6 +210,7 @@ ir3_optimize_loop(struct ir3_compiler *compiler, nir_shader *s)
progress |= OPT(s, nir_lower_phis_to_scalar, false);
progress |= OPT(s, nir_copy_prop);
progress |= OPT(s, nir_opt_deref);
progress |= OPT(s, nir_opt_dce);
progress |= OPT(s, nir_opt_cse);
static int gcm = -1;

@@ -484,7 +484,7 @@ tu_get_external_image_format_properties(
const struct tu_physical_device *physical_device,
const VkPhysicalDeviceImageFormatInfo2 *pImageFormatInfo,
VkExternalMemoryHandleTypeFlagBits handleType,
VkExternalMemoryProperties *external_properties)
VkExternalImageFormatProperties *external_properties)
{
VkExternalMemoryFeatureFlagBits flags = 0;
VkExternalMemoryHandleTypeFlags export_flags = 0;
@@ -526,11 +526,14 @@ tu_get_external_image_format_properties(
handleType);
}
*external_properties = (VkExternalMemoryProperties) {
.externalMemoryFeatures = flags,
.exportFromImportedHandleTypes = export_flags,
.compatibleHandleTypes = compat_flags,
};
if (external_properties) {
external_properties->externalMemoryProperties =
(VkExternalMemoryProperties) {
.externalMemoryFeatures = flags,
.exportFromImportedHandleTypes = export_flags,
.compatibleHandleTypes = compat_flags,
};
}
return VK_SUCCESS;
}
@@ -597,7 +600,7 @@ tu_GetPhysicalDeviceImageFormatProperties2(
if (external_info && external_info->handleType != 0) {
result = tu_get_external_image_format_properties(
physical_device, base_info, external_info->handleType,
&external_props->externalMemoryProperties);
external_props);
if (result != VK_SUCCESS)
goto fail;
}

@@ -132,8 +132,10 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
return 1;
case PIPE_SHADER_CAP_FP16:
case PIPE_SHADER_CAP_FP16_DERIVATIVES:
case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
return lp_has_fp16();
//enabling this breaks GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform
case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
return 0;
case PIPE_SHADER_CAP_INT64_ATOMICS:
return 0;
case PIPE_SHADER_CAP_INT16:

@@ -264,21 +264,30 @@ crocus_init_batch(struct crocus_context *ice,
crocus_batch_reset(batch);
}
static struct drm_i915_gem_exec_object2 *
find_validation_entry(struct crocus_batch *batch, struct crocus_bo *bo)
static int
find_exec_index(struct crocus_batch *batch, struct crocus_bo *bo)
{
unsigned index = READ_ONCE(bo->index);
if (index < batch->exec_count && batch->exec_bos[index] == bo)
return &batch->validation_list[index];
return index;
/* May have been shared between multiple active batches */
for (index = 0; index < batch->exec_count; index++) {
if (batch->exec_bos[index] == bo)
return &batch->validation_list[index];
return index;
}
return -1;
}
return NULL;
static struct drm_i915_gem_exec_object2 *
find_validation_entry(struct crocus_batch *batch, struct crocus_bo *bo)
{
int index = find_exec_index(batch, bo);
if (index == -1)
return NULL;
return &batch->validation_list[index];
}
static void
@@ -410,7 +419,7 @@ emit_reloc(struct crocus_batch *batch,
(struct drm_i915_gem_relocation_entry) {
.offset = offset,
.delta = target_offset,
.target_handle = target->index,
.target_handle = find_exec_index(batch, target),
.presumed_offset = entry->offset,
};
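
Note on the hunks above: the relocation now uses the buffer's position in this batch's list rather than the index cached on the buffer, which may belong to another batch. A minimal sketch of that lookup follows, using hypothetical `demo_bo` and `demo_batch` types in place of the crocus structures.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the driver's buffer and batch. */
struct demo_bo {
   unsigned index; /* cached position; only a hint, may be stale */
};

struct demo_batch {
   struct demo_bo **exec_bos;
   unsigned exec_count;
};

/* Return the position of `bo` in this batch's list, or -1 if absent. */
static int
demo_find_exec_index(const struct demo_batch *batch, const struct demo_bo *bo)
{
   assert(batch && bo);

   /* Fast path: the cached index is valid for this batch. */
   unsigned index = bo->index;
   if (index < batch->exec_count && batch->exec_bos[index] == bo)
      return (int)index;

   /* Slow path: the buffer may be shared between several active batches,
    * so the cached index can refer to a different batch's list. */
   for (index = 0; index < batch->exec_count; index++) {
      if (batch->exec_bos[index] == bo)
         return (int)index;
   }
   return -1;
}
```

Callers have to handle the -1 case themselves, as `find_validation_entry` does by returning NULL.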

@@ -717,6 +717,12 @@ update_bo_syncobjs(struct iris_batch *batch, struct iris_bo *bo, bool write)
move_syncobj_to_batch(batch, &deps->write_syncobjs[other_batch_idx],
I915_EXEC_FENCE_WAIT);
/* If it's being written by our screen, wait on it too. This is relevant
* when there are multiple contexts on the same screen. */
if (deps->write_syncobjs[batch_idx])
move_syncobj_to_batch(batch, &deps->write_syncobjs[batch_idx],
I915_EXEC_FENCE_WAIT);
struct iris_syncobj *batch_syncobj = iris_batch_get_signal_syncobj(batch);
if (write) {
@@ -729,6 +735,8 @@ update_bo_syncobjs(struct iris_batch *batch, struct iris_bo *bo, bool write)
move_syncobj_to_batch(batch, &deps->read_syncobjs[other_batch_idx],
I915_EXEC_FENCE_WAIT);
move_syncobj_to_batch(batch, &deps->read_syncobjs[batch_idx],
I915_EXEC_FENCE_WAIT);
} else {
/* If we're reading, replace the other read from our batch index. */

@@ -188,7 +188,6 @@ spec/glsl-4.00/execution/conversion/vert-conversion-explicit-dvec2-vec2: fail
spec/glsl-4.00/execution/conversion/vert-conversion-explicit-dvec3-vec3: fail
spec/glsl-4.00/execution/conversion/vert-conversion-explicit-dvec4-vec4: fail
spec/glsl-4.50/execution/ssbo-atomiccompswap-int: fail
spec/glsl-es-1.00/linker/glsl-mismatched-uniform-precision-unused: fail
spec/intel_shader_atomic_float_minmax/execution/shared-atomiccompswap-float: skip
spec/intel_shader_atomic_float_minmax/execution/shared-atomicexchange-float: skip
spec/intel_shader_atomic_float_minmax/execution/shared-atomicmax-float: skip

@@ -37,7 +37,7 @@ traces:
- path: gputest/pixmark-piano.trace
expectations:
- device: gl-vmware-llvmpipe
checksum: 4262587e893cf98c61a8467a15677181
checksum: b580ae01560380461a103975cab77393
- path: gputest/triangle.trace
expectations:
- device: gl-vmware-llvmpipe
@@ -169,7 +169,7 @@ traces:
- path: bgfx/39-assao.rdc
expectations:
- device: gl-vmware-llvmpipe
checksum: bc6f44e63010db07e7ba588b216e38b1
checksum: 5d9c6dd6399db34ac81951cd7152ec1c
- path: bgfx/40-svt.rdc
expectations:
- device: gl-vmware-llvmpipe

@@ -21,11 +21,11 @@ traces:
- path: pathfinder/demo.trace
expectations:
- device: gl-radeonsi-stoney
checksum: 8ff636268dfa0d54b6f15d70d15e354d
checksum: c81c85f9b247dd1b06c3dd5b669cc283
- path: pathfinder/canvas_moire.trace
expectations:
- device: gl-radeonsi-stoney
checksum: 505b9cad6e65c13463a0786944f8b679
checksum: 78dd2357ad6e5ffc049a75bfb11c5497
- path: pathfinder/canvas_text_v2.trace
expectations:
- device: gl-radeonsi-stoney
@@ -37,7 +37,7 @@ traces:
- path: gputest/pixmark-piano.trace
expectations:
- device: gl-radeonsi-stoney
checksum: 58a86d233d03e2a174cb79c16028f916
checksum: 86ebe6ff8038975de8724fa9536edb7e
- path: gputest/triangle.trace
expectations:
- device: gl-radeonsi-stoney
@@ -133,7 +133,7 @@ traces:
- path: glmark2/refract.trace
expectations:
- device: gl-radeonsi-stoney
checksum: 41d105bdd10a354f6d161c67f715b7f9
checksum: 9d0a2d2fce0b80a265fbcee5107c9e82
- path: glmark2/shading:shading=blinn-phong-inf.trace
expectations:
- device: gl-radeonsi-stoney
@@ -173,11 +173,11 @@ traces:
- path: godot/Material Testers.x86_64_2020.04.08_13.38_frame799.rdc
expectations:
- device: gl-radeonsi-stoney
checksum: 4df1fbfc346851fe9e086a0708afde21
checksum: 02f654ad77c0c1106e1b31e1c86c93bb
- path: gputest/gimark.trace
expectations:
- device: gl-radeonsi-stoney
checksum: 52f76e6db877111845990ee128552082
checksum: 4442dbd44a9704c499da4817fffce306
- path: gputest/pixmark-julia-fp32.trace
expectations:
- device: gl-radeonsi-stoney
@@ -193,7 +193,7 @@ traces:
- path: gputest/plot3d.trace
expectations:
- device: gl-radeonsi-stoney
checksum: a62be186a3e0a33ecbd520edd3873eb1
checksum: 667078b0f51ac8e0469ef9a20326c616
- path: gputest/tessmark.trace
expectations:
- device: gl-radeonsi-stoney
@@ -201,7 +201,7 @@ traces:
- path: humus/AmbientAperture.trace
expectations:
- device: gl-radeonsi-stoney
checksum: 7ad498c94dcfbf22ef56f115648be86d
checksum: 664ea58a62b27737b7d0ae9e86ab85c0
- path: humus/CelShading.trace
expectations:
- device: gl-radeonsi-stoney

@@ -17,11 +17,11 @@ traces:
- path: gputest/furmark.trace
expectations:
- device: gl-virgl
checksum: d5682aaa762a4849f0cae1692623bdcb
checksum: a38d4c123d13c5ccd3a86f0663fe1aab
- path: gputest/pixmark-piano.trace
expectations:
- device: gl-virgl
checksum: 1bcded27a6ba04fe0f76ff997b98dbc3
checksum: b580ae01560380461a103975cab77393
- path: gputest/triangle.trace
expectations:
- device: gl-virgl
@@ -121,7 +121,7 @@ traces:
- path: glmark2/refract.trace
expectations:
- device: gl-virgl
checksum: b1332df324d0fc1db22b362231d3ed01
checksum: cdadfee0518b964433d80c01329ec191
- path: glmark2/shading:shading=blinn-phong-inf.trace
expectations:
- device: gl-virgl
@@ -178,7 +178,7 @@ traces:
- path: gputest/plot3d.trace
expectations:
- device: gl-virgl
checksum: a1af286874f7060171cb3ca2e765c448
checksum: 7e818a6070005056700e5ef8590a3f8e
# Times out
# - path: gputest/tessmark.trace
# expectations:

@@ -1408,16 +1408,16 @@ static SpvId
emit_atomic(struct ntv_context *ctx, SpvId op, SpvId type, SpvId src0, SpvId src1, SpvId src2)
{
if (op == SpvOpAtomicLoad)
return spirv_builder_emit_triop(&ctx->builder, op, type, src0, emit_uint_const(ctx, 32, SpvScopeWorkgroup),
return spirv_builder_emit_triop(&ctx->builder, op, type, src0, emit_uint_const(ctx, 32, SpvScopeDevice),
emit_uint_const(ctx, 32, 0));
if (op == SpvOpAtomicCompareExchange)
return spirv_builder_emit_hexop(&ctx->builder, op, type, src0, emit_uint_const(ctx, 32, SpvScopeWorkgroup),
return spirv_builder_emit_hexop(&ctx->builder, op, type, src0, emit_uint_const(ctx, 32, SpvScopeDevice),
emit_uint_const(ctx, 32, 0),
emit_uint_const(ctx, 32, 0),
/* these params are intentionally swapped */
src2, src1);
return spirv_builder_emit_quadop(&ctx->builder, op, type, src0, emit_uint_const(ctx, 32, SpvScopeWorkgroup),
return spirv_builder_emit_quadop(&ctx->builder, op, type, src0, emit_uint_const(ctx, 32, SpvScopeDevice),
emit_uint_const(ctx, 32, 0), src1);
}
@@ -2487,12 +2487,12 @@ emit_interpolate(struct ntv_context *ctx, nir_intrinsic_instr *intr)
}
static void
handle_atomic_op(struct ntv_context *ctx, nir_intrinsic_instr *intr, SpvId ptr, SpvId param, SpvId param2)
handle_atomic_op(struct ntv_context *ctx, nir_intrinsic_instr *intr, SpvId ptr, SpvId param, SpvId param2, nir_alu_type type)
{
SpvId dest_type = get_dest_type(ctx, &intr->dest, nir_type_uint32);
SpvId dest_type = get_dest_type(ctx, &intr->dest, type);
SpvId result = emit_atomic(ctx, get_atomic_op(intr->intrinsic), dest_type, ptr, param, param2);
assert(result);
store_dest(ctx, &intr->dest, result, nir_type_uint);
store_dest(ctx, &intr->dest, result, type);
}
static void
@@ -2531,7 +2531,7 @@ emit_ssbo_atomic_intrinsic(struct ntv_context *ctx, nir_intrinsic_instr *intr)
if (intr->intrinsic == nir_intrinsic_ssbo_atomic_comp_swap)
param2 = get_src(ctx, &intr->src[3]);
handle_atomic_op(ctx, intr, ptr, param, param2);
handle_atomic_op(ctx, intr, ptr, param, param2, nir_type_uint32);
}
static void
@@ -2552,7 +2552,7 @@ emit_shared_atomic_intrinsic(struct ntv_context *ctx, nir_intrinsic_instr *intr)
if (intr->intrinsic == nir_intrinsic_shared_atomic_comp_swap)
param2 = get_src(ctx, &intr->src[2]);
handle_atomic_op(ctx, intr, ptr, param, param2);
handle_atomic_op(ctx, intr, ptr, param, param2, nir_type_uint32);
}
static void
@@ -2687,13 +2687,24 @@ emit_image_intrinsic(struct ntv_context *ctx, nir_intrinsic_instr *intr)
type_to_dim(glsl_get_sampler_dim(type), &is_ms);
SpvId sample = is_ms ? get_src(ctx, &intr->src[2]) : emit_uint_const(ctx, 32, 0);
SpvId coord = get_image_coords(ctx, type, &intr->src[1]);
SpvId base_type = get_glsl_basetype(ctx, glsl_get_sampler_result_type(type));
enum glsl_base_type glsl_type = glsl_get_sampler_result_type(type);
SpvId base_type = get_glsl_basetype(ctx, glsl_type);
SpvId texel = spirv_builder_emit_image_texel_pointer(&ctx->builder, base_type, img_var, coord, sample);
SpvId param2 = 0;
if (intr->intrinsic == nir_intrinsic_image_deref_atomic_comp_swap)
/* The type of Value must be the same as Result Type.
* The type of the value pointed to by Pointer must be the same as Result Type.
*/
nir_alu_type ntype = nir_get_nir_type_for_glsl_base_type(glsl_type);
SpvId cast_type = get_dest_type(ctx, &intr->dest, ntype);
param = emit_bitcast(ctx, cast_type, param);
if (intr->intrinsic == nir_intrinsic_image_deref_atomic_comp_swap) {
param2 = get_src(ctx, &intr->src[4]);
handle_atomic_op(ctx, intr, texel, param, param2);
param2 = emit_bitcast(ctx, cast_type, param2);
}
handle_atomic_op(ctx, intr, texel, param, param2, ntype);
}
static void
@@ -3255,13 +3266,16 @@ emit_tex(struct ntv_context *ctx, nir_tex_instr *tex)
lod = emit_float_const(ctx, 32, 0.0);
if (tex->op == nir_texop_txs) {
SpvId image = spirv_builder_emit_image(&ctx->builder, image_type, load);
/* Additionally, if its Dim is 1D, 2D, 3D, or Cube,
/* Its Dim operand must be one of 1D, 2D, 3D, or Cube
* - OpImageQuerySizeLod specification
*
* Additionally, if its Dim is 1D, 2D, 3D, or Cube,
* it must also have either an MS of 1 or a Sampled of 0 or 2.
* - OpImageQuerySize specification
*
* all spirv samplers use these types
*/
if (tex->sampler_dim != GLSL_SAMPLER_DIM_MS && !lod)
if (!lod && tex_instr_is_lod_allowed(tex))
lod = emit_uint_const(ctx, 32, 0);
SpvId result = spirv_builder_emit_image_query_size(&ctx->builder,
dest_type, image,

@@ -357,12 +357,18 @@ bool
zink_blit_region_fills(struct u_rect region, unsigned width, unsigned height)
{
struct u_rect intersect = {0, width, 0, height};
struct u_rect r = {
MIN2(region.x0, region.x1),
MAX2(region.x0, region.x1),
MIN2(region.y0, region.y1),
MAX2(region.y0, region.y1),
};
if (!u_rect_test_intersection(&region, &intersect))
if (!u_rect_test_intersection(&r, &intersect))
/* is this even a thing? */
return false;
u_rect_find_intersection(&region, &intersect);
u_rect_find_intersection(&r, &intersect);
if (intersect.x0 != 0 || intersect.y0 != 0 ||
intersect.x1 != width || intersect.y1 != height)
return false;
@@ -373,11 +379,23 @@ zink_blit_region_fills(struct u_rect region, unsigned width, unsigned height)
bool
zink_blit_region_covers(struct u_rect region, struct u_rect covers)
{
struct u_rect r = {
MIN2(region.x0, region.x1),
MAX2(region.x0, region.x1),
MIN2(region.y0, region.y1),
MAX2(region.y0, region.y1),
};
struct u_rect c = {
MIN2(covers.x0, covers.x1),
MAX2(covers.x0, covers.x1),
MIN2(covers.y0, covers.y1),
MAX2(covers.y0, covers.y1),
};
struct u_rect intersect;
if (!u_rect_test_intersection(&region, &covers))
if (!u_rect_test_intersection(&r, &c))
return false;
u_rect_union(&intersect, &region, &covers);
return intersect.x0 == covers.x0 && intersect.y0 == covers.y0 &&
intersect.x1 == covers.x1 && intersect.y1 == covers.y1;
u_rect_union(&intersect, &r, &c);
return intersect.x0 == c.x0 && intersect.y0 == c.y0 &&
intersect.x1 == c.x1 && intersect.y1 == c.y1;
}
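
Note on the hunks above: blit regions can arrive with their coordinates flipped, and an intersection test that assumes x0 <= x1 and y0 <= y1 then fails. A minimal sketch of the normalization follows. `struct demo_rect` copies the field order the diff uses for `u_rect`, and `demo_intersects` is an assumed stand-in for the real intersection helper, not its actual implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle with the field order x0, x1, y0, y1. */
struct demo_rect {
   int x0, x1, y0, y1;
};

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Reorder the coordinates so that x0 <= x1 and y0 <= y1. */
static struct demo_rect
demo_normalize(struct demo_rect r)
{
   struct demo_rect n = {
      min_int(r.x0, r.x1),
      max_int(r.x0, r.x1),
      min_int(r.y0, r.y1),
      max_int(r.y0, r.y1),
   };
   return n;
}

/* Overlap test that is only meaningful for normalized rectangles. */
static bool
demo_intersects(const struct demo_rect *a, const struct demo_rect *b)
{
   assert(a && b);
   return !(a->x1 < b->x0 || b->x1 < a->x0 ||
            a->y1 < b->y0 || b->y1 < a->y0);
}
```

A horizontally flipped region such as {200, 50, 0, 100} fails this test against a 100x100 target until it is normalized to {50, 200, 0, 100}.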

@@ -643,6 +643,9 @@ create_bvci(struct zink_context *ctx, struct zink_resource *res, enum pipe_forma
assert(bvci.format);
bvci.offset = offset;
bvci.range = !offset && range == res->base.b.width0 ? VK_WHOLE_SIZE : range;
uint32_t clamp = util_format_get_blocksize(format) * screen->info.props.limits.maxTexelBufferElements;
if (bvci.range == VK_WHOLE_SIZE && res->base.b.width0 > clamp)
bvci.range = clamp;
bvci.flags = 0;
return bvci;
}
@@ -909,36 +912,9 @@ update_existing_vbo(struct zink_context *ctx, unsigned slot)
return;
struct zink_resource *res = zink_resource(ctx->vertex_buffers[slot].buffer.resource);
res->vbo_bind_mask &= ~BITFIELD_BIT(slot);
ctx->vbufs[slot] = VK_NULL_HANDLE;
ctx->vbuf_offsets[slot] = 0;
update_res_bind_count(ctx, res, false, true);
}
ALWAYS_INLINE static struct zink_resource *
set_vertex_buffer_clamped(struct zink_context *ctx, unsigned slot)
{
const struct pipe_vertex_buffer *ctx_vb = &ctx->vertex_buffers[slot];
struct zink_resource *res = zink_resource(ctx_vb->buffer.resource);
struct zink_screen *screen = zink_screen(ctx->base.screen);
if (ctx_vb->buffer_offset > screen->info.props.limits.maxVertexInputAttributeOffset) {
/* buffer offset exceeds maximum: make a tmp buffer at this offset */
ctx->vbufs[slot] = zink_resource_tmp_buffer(screen, res, ctx_vb->buffer_offset, 0, &ctx->vbuf_offsets[slot]);
util_dynarray_append(&res->obj->tmp, VkBuffer, ctx->vbufs[slot]);
/* the driver is broken and sets a min alignment that's larger than its max offset: rebind as staging buffer */
if (unlikely(ctx->vbuf_offsets[slot] > screen->info.props.limits.maxVertexInputAttributeOffset)) {
static bool warned = false;
if (!warned)
debug_printf("zink: this vulkan driver is BROKEN! maxVertexInputAttributeOffset < VkMemoryRequirements::alignment\n");
warned = true;
}
} else {
ctx->vbufs[slot] = res->obj->buffer;
ctx->vbuf_offsets[slot] = ctx_vb->buffer_offset;
}
assert(ctx->vbufs[slot]);
return res;
}
static void
zink_set_vertex_buffers(struct pipe_context *pctx,
unsigned start_slot,
@@ -976,9 +952,9 @@ zink_set_vertex_buffers(struct pipe_context *pctx,
/* always barrier before possible rebind */
zink_resource_buffer_barrier(ctx, res, VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT);
set_vertex_buffer_clamped(ctx, start_slot + i);
} else
enabled_buffers &= ~BITFIELD_BIT(i);
} else {
enabled_buffers &= ~BITFIELD_BIT(start_slot + i);
}
}
} else {
if (need_state_change)
@@ -3118,11 +3094,15 @@ zink_fence_wait(struct pipe_context *pctx)
void
zink_wait_on_batch(struct zink_context *ctx, uint32_t batch_id)
{
struct zink_batch_state *bs = ctx->batch.state;
assert(bs);
if (!batch_id || bs->fence.batch_id == batch_id)
struct zink_batch_state *bs;
if (!batch_id) {
/* not submitted yet */
flush_batch(ctx, true);
bs = zink_batch_state(ctx->last_fence);
assert(bs);
batch_id = bs->fence.batch_id;
}
assert(batch_id);
if (ctx->have_timelines) {
if (!zink_screen_timeline_wait(zink_screen(ctx->base.screen), batch_id, UINT64_MAX))
check_device_lost(ctx);
@@ -3131,8 +3111,8 @@ zink_wait_on_batch(struct zink_context *ctx, uint32_t batch_id)
simple_mtx_lock(&ctx->batch_mtx);
struct zink_fence *fence;
assert(batch_id || ctx->last_fence);
if (ctx->last_fence && (!batch_id || batch_id == zink_batch_state(ctx->last_fence)->fence.batch_id))
assert(ctx->last_fence);
if (batch_id == zink_batch_state(ctx->last_fence)->fence.batch_id)
fence = ctx->last_fence;
else {
for (bs = ctx->batch_states; bs; bs = bs->next) {
@@ -3788,7 +3768,6 @@ rebind_buffer(struct zink_context *ctx, struct zink_resource *res, uint32_t rebi
u_foreach_bit(slot, res->vbo_bind_mask) {
if (ctx->vertex_buffers[slot].buffer.resource != &res->base.b) //wrong context
goto end;
set_vertex_buffer_clamped(ctx, slot);
num_rebinds++;
}
rebind_mask &= ~BITFIELD_BIT(TC_BINDING_VERTEX_BUFFER);
@@ -3932,8 +3911,6 @@ void
zink_rebind_all_buffers(struct zink_context *ctx)
{
struct zink_batch *batch = &ctx->batch;
u_foreach_bit(slot, ctx->gfx_pipeline_state.vertex_buffers_enabled_mask)
set_vertex_buffer_clamped(ctx, slot);
ctx->vertex_buffers_dirty = ctx->gfx_pipeline_state.vertex_buffers_enabled_mask > 0;
ctx->dirty_so_targets = ctx->num_so_targets > 0;
if (ctx->num_so_targets)

@@ -261,8 +261,6 @@ struct zink_context {
uint16_t rp_clears_enabled;
uint16_t fbfetch_outputs;
VkBuffer vbufs[PIPE_MAX_ATTRIBS];
unsigned vbuf_offsets[PIPE_MAX_ATTRIBS];
struct pipe_vertex_buffer vertex_buffers[PIPE_MAX_ATTRIBS];
bool vertex_buffers_dirty;

@@ -140,20 +140,20 @@ zink_descriptor_program_init_lazy(struct zink_context *ctx, struct zink_program
struct zink_shader **stages;
if (pg->is_compute)
stages = &((struct zink_compute_program*)pg)->shader;
else {
else
stages = ((struct zink_gfx_program*)pg)->shaders;
if (stages[PIPE_SHADER_FRAGMENT]->nir->info.fs.uses_fbfetch_output) {
zink_descriptor_util_init_fbfetch(ctx);
push_count = 1;
pg->dd->fbfetch = true;
}
}
if (!pg->dd)
pg->dd = (void*)rzalloc(pg, struct zink_program_descriptor_data);
if (!pg->dd)
return false;
if (!pg->is_compute && stages[PIPE_SHADER_FRAGMENT]->nir->info.fs.uses_fbfetch_output) {
zink_descriptor_util_init_fbfetch(ctx);
push_count = 1;
pg->dd->fbfetch = true;
}
unsigned entry_idx[ZINK_DESCRIPTOR_TYPES] = {0};
unsigned num_shaders = pg->is_compute ? 1 : ZINK_SHADER_COUNT;

@@ -134,16 +134,16 @@ zink_bind_vertex_buffers(struct zink_batch *batch, struct zink_context *ctx)
return;
for (unsigned i = 0; i < elems->hw_state.num_bindings; i++) {
const unsigned buffer_id = ctx->element_state->binding_map[i];
struct pipe_vertex_buffer *vb = ctx->vertex_buffers + buffer_id;
struct pipe_vertex_buffer *vb = ctx->vertex_buffers + ctx->element_state->binding_map[i];
assert(vb);
if (vb->buffer.resource) {
buffers[i] = ctx->vbufs[buffer_id];
assert(buffers[i]);
struct zink_resource *res = zink_resource(vb->buffer.resource);
assert(res->obj->buffer);
buffers[i] = res->obj->buffer;
buffer_offsets[i] = vb->buffer_offset;
buffer_strides[i] = vb->stride;
if (HAS_VERTEX_INPUT)
elems->hw_state.dynbindings[i].stride = vb->stride;
buffer_offsets[i] = ctx->vbuf_offsets[buffer_id];
buffer_strides[i] = vb->stride;
zink_batch_resource_usage_set(&ctx->batch, zink_resource(vb->buffer.resource), false);
} else {
buffers[i] = zink_resource(ctx->dummy_vertex_buffer)->obj->buffer;
@@ -374,6 +374,8 @@ update_barriers(struct zink_context *ctx, bool is_compute)
access |= VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT;
pipeline |= VK_PIPELINE_STAGE_VERTEX_INPUT_BIT;
bind_count -= util_bitcount(res->vbo_bind_mask);
if (res->write_bind_count[is_compute])
pipeline |= VK_PIPELINE_STAGE_VERTEX_SHADER_BIT;
}
bind_count -= res->so_bind_count;
}
@@ -462,6 +464,9 @@ zink_draw_vbo(struct pipe_context *pctx,
const struct pipe_draw_start_count_bias *draws,
unsigned num_draws)
{
if (!dindirect && (!draws[0].count || !dinfo->instance_count))
return;
struct zink_context *ctx = zink_context(pctx);
struct zink_screen *screen = zink_screen(pctx->screen);
struct zink_rasterizer_state *rast_state = ctx->rast_state;

@@ -165,6 +165,9 @@ create_bci(struct zink_screen *screen, const struct pipe_resource *templ, unsign
if (bind & PIPE_BIND_SHADER_IMAGE)
bci.usage |= VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT;
if (bind & PIPE_BIND_QUERY_BUFFER)
bci.usage |= VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT;
if (templ->flags & PIPE_RESOURCE_FLAG_SPARSE)
bci.flags |= VK_BUFFER_CREATE_SPARSE_BINDING_BIT;
return bci;

@@ -438,7 +438,7 @@ zink_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
return 1;
case PIPE_CAP_TGSI_BALLOT:
return screen->vk_version >= VK_MAKE_VERSION(1,2,0) && screen->info.props11.subgroupSize <= 64;
return screen->info.have_vulkan12 && screen->info.have_EXT_shader_subgroup_ballot && screen->info.props11.subgroupSize <= 64;
case PIPE_CAP_SAMPLE_SHADING:
return screen->info.feats.features.sampleRateShading;
@@ -854,8 +854,10 @@ zink_get_shader_param(struct pipe_screen *pscreen,
return 0; /* not implemented */
case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
return screen->info.feats11.uniformAndStorageBuffer16BitAccess ||
(screen->info.have_KHR_16bit_storage && screen->info.storage_16bit_feats.uniformAndStorageBuffer16BitAccess);
//enabling this breaks GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform
//return screen->info.feats11.uniformAndStorageBuffer16BitAccess ||
//(screen->info.have_KHR_16bit_storage && screen->info.storage_16bit_feats.uniformAndStorageBuffer16BitAccess);
return 0;
case PIPE_SHADER_CAP_FP16_DERIVATIVES:
return 0; //spirv requires 32bit derivative srcs and dests
case PIPE_SHADER_CAP_FP16:

@@ -616,8 +616,8 @@ fs_generator::generate_shuffle(fs_inst *inst,
* easier just to split it here.
*/
const unsigned lower_width =
(devinfo->ver <= 7 || type_sz(src.type) > 4) ?
8 : MIN2(16, inst->exec_size);
devinfo->ver <= 7 || element_sz(src) > 4 || element_sz(dst) > 4 ? 8 :
MIN2(16, inst->exec_size);
brw_set_default_exec_size(p, cvt(lower_width) - 1);
for (unsigned group = 0; group < inst->exec_size; group += lower_width) {
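
Note on the hunk above: the split width now depends on the per-element size of both the source and the destination region, not only on the source type size. A minimal sketch of the selection follows. `demo_shuffle_lower_width` is a hypothetical helper, and its element-size parameters are assumed to be byte values as `element_sz()` would return them.

```c
#include <assert.h>

/* Pick how many channels each split instruction handles.
 * `ver` is the hardware generation; the element sizes are in bytes. */
static unsigned
demo_shuffle_lower_width(int ver, unsigned src_elem_sz, unsigned dst_elem_sz,
                         unsigned exec_size)
{
   assert(exec_size > 0);

   /* Old hardware, or any region wider than 4 bytes per element,
    * is limited to 8 channels at a time. */
   if (ver <= 7 || src_elem_sz > 4 || dst_elem_sz > 4)
      return 8;

   /* Otherwise up to 16 channels, but never more than the instruction has. */
   return exec_size < 16 ? exec_size : 16;
}
```

For example, a 32-bit source feeding a destination with more than 4 bytes per element is now split into groups of 8, which the old source-only check would not have done.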

@@ -3928,7 +3928,10 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GFX7_BTI_SLM);
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
/* No point in masking with sample mask, here we're handling compute
* intrinsics.
*/
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
fs_reg data = get_nir_src(instr->src[0]);
data.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);

@@ -137,6 +137,8 @@ lower_shader_calls_instr(struct nir_builder *b, nir_instr *instr, void *data)
switch (call->intrinsic) {
case nir_intrinsic_rt_trace_ray: {
b->cursor = nir_instr_remove(instr);
store_resume_addr(b, call);
nir_ssa_def *as_addr = call->src[0].ssa;
@@ -217,6 +219,8 @@ lower_shader_calls_instr(struct nir_builder *b, nir_instr *instr, void *data)
}
case nir_intrinsic_rt_execute_callable: {
b->cursor = nir_instr_remove(instr);
store_resume_addr(b, call);
nir_ssa_def *sbt_offset32 =

@@ -1238,6 +1238,28 @@ region_matches(struct brw_reg reg, enum brw_vertical_stride v,
region_matches(reg, BRW_VERTICAL_STRIDE_0, BRW_WIDTH_1, \
BRW_HORIZONTAL_STRIDE_0)
/**
* Return the size in bytes per data element of register \p reg on the
* corresponding register file.
*/
static inline unsigned
element_sz(struct brw_reg reg)
{
if (reg.file == BRW_IMMEDIATE_VALUE || has_scalar_region(reg)) {
return type_sz(reg.type);
} else if (reg.width == BRW_WIDTH_1 &&
reg.hstride == BRW_HORIZONTAL_STRIDE_0) {
assert(reg.vstride != BRW_VERTICAL_STRIDE_0);
return type_sz(reg.type) << (reg.vstride - 1);
} else {
assert(reg.hstride != BRW_HORIZONTAL_STRIDE_0);
assert(reg.vstride == reg.hstride + reg.width);
return type_sz(reg.type) << (reg.hstride - 1);
}
}
/* brw_packed_float.c */
int brw_float_to_vf(float f);
float brw_vf_to_float(unsigned char vf);

@@ -1973,12 +1973,12 @@ anv_get_physical_device_properties_1_1(struct anv_physical_device *pdevice,
scalar_stages |= mesa_to_vk_shader_stage(stage);
}
if (pdevice->vk.supported_extensions.KHR_ray_tracing_pipeline) {
scalar_stages |= MESA_SHADER_RAYGEN |
MESA_SHADER_ANY_HIT |
MESA_SHADER_CLOSEST_HIT |
MESA_SHADER_MISS |
MESA_SHADER_INTERSECTION |
MESA_SHADER_CALLABLE;
scalar_stages |= VK_SHADER_STAGE_RAYGEN_BIT_KHR |
VK_SHADER_STAGE_ANY_HIT_BIT_KHR |
VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR |
VK_SHADER_STAGE_MISS_BIT_KHR |
VK_SHADER_STAGE_INTERSECTION_BIT_KHR |
VK_SHADER_STAGE_CALLABLE_BIT_KHR;
}
p->subgroupSupportedStages = scalar_stages;
p->subgroupSupportedOperations = VK_SUBGROUP_FEATURE_BASIC_BIT |
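
Note on the hunk above: the old code ORed stage enum indices into a field that expects one bit per stage. A minimal sketch of the difference follows. The `DEMO_STAGE_*` values and `demo_stage_bit` are hypothetical and chosen for illustration; they are not taken from the Mesa or Vulkan headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage indices: small consecutive integers, not bit flags. */
enum demo_stage {
   DEMO_STAGE_VERTEX = 0,
   DEMO_STAGE_FRAGMENT = 4,
   DEMO_STAGE_RAYGEN = 8,
   DEMO_STAGE_CALLABLE = 13,
};

/* Convert a stage index into its single-bit mask. */
static uint32_t
demo_stage_bit(enum demo_stage stage)
{
   assert((unsigned)stage < 32);
   return UINT32_C(1) << stage;
}
```

ORing the indices directly, as in `DEMO_STAGE_RAYGEN | DEMO_STAGE_CALLABLE`, gives 13, which as a mask would describe unrelated low-numbered stages. ORing `demo_stage_bit()` of each gives 0x2100, with exactly the two intended bits set.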

@@ -635,20 +635,6 @@ add_aux_surface_if_supported(struct anv_device *device,
return VK_SUCCESS;
}
if (device->info.ver >= 12 &&
(image->vk.array_layers > 1 || image->vk.mip_levels)) {
/* HSD 14010672564: On TGL, if a block of fragment shader outputs
* match the surface's clear color, the HW may convert them to
* fast-clears. Anv only does clear color tracking for the first
* slice unfortunately. Disable CCS until anv gains more clear color
* tracking abilities.
*/
anv_perf_warn(VK_LOG_OBJS(&image->vk.base),
"HW may put fast-clear blocks on more slices than SW "
"currently tracks. Not allocating a CCS buffer.");
return VK_SUCCESS;
}
if (INTEL_DEBUG(DEBUG_NO_RBC))
return VK_SUCCESS;
@@ -2044,6 +2030,20 @@ anv_layout_to_aux_state(const struct intel_device_info * const devinfo,
bool aux_supported = true;
bool clear_supported = isl_aux_usage_has_fast_clears(aux_usage);
const struct isl_format_layout *fmtl =
isl_format_get_layout(image->planes[plane].primary_surface.isl.format);
/* Disabling CCS for the following case avoids failures in:
* - dEQP-VK.drm_format_modifiers.export_import.*
* - dEQP-VK.synchronization*
*/
if (usage & (VK_IMAGE_USAGE_TRANSFER_DST_BIT |
VK_IMAGE_USAGE_TRANSFER_SRC_BIT) && fmtl->bpb <= 16 &&
aux_usage == ISL_AUX_USAGE_CCS_E && devinfo->ver >= 12) {
aux_supported = false;
clear_supported = false;
}
if ((usage & VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT) && !read_only) {
/* This image could be used as both an input attachment and a render
* target (depth, stencil, or color) at the same time and this can cause
@@ -2265,6 +2265,17 @@ anv_layout_to_fast_clear_type(const struct intel_device_info * const devinfo,
case ISL_AUX_STATE_COMPRESSED_CLEAR:
if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
return ANV_FAST_CLEAR_DEFAULT_VALUE;
} else if (devinfo->ver >= 12 &&
image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E) {
/* On TGL, if a block of fragment shader outputs match the surface's
* clear color, the HW may convert them to fast-clears (see HSD
* 14010672564). This can lead to rendering corruptions if not
* handled properly. We restrict the clear color to zero to avoid
* issues that can occur with:
* - Texture view rendering (including blorp_copy calls)
* - Images with multiple levels or array layers
*/
return ANV_FAST_CLEAR_DEFAULT_VALUE;
} else if (layout == VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
/* When we're in a render pass we have the clear color data from the
* VkRenderPassBeginInfo and we can use arbitrary clear colors. They

@@ -107,7 +107,11 @@ anv_render_pass_compile(struct anv_render_pass *pass)
all_usage |= subpass_att->usage;
if (pass_att->first_subpass_layout == VK_IMAGE_LAYOUT_UNDEFINED) {
/* first_subpass_layout only applies to color and depth.
* See genX(cmd_buffer_setup_attachments)
*/
if (vk_format_aspects(pass_att->format) != VK_IMAGE_ASPECT_STENCIL_BIT &&
pass_att->first_subpass_layout == VK_IMAGE_LAYOUT_UNDEFINED) {
pass_att->first_subpass_layout = subpass_att->layout;
assert(pass_att->first_subpass_layout != VK_IMAGE_LAYOUT_UNDEFINED);
}

@@ -239,7 +239,8 @@ brw_flush_front(struct gl_context *ctx)
__DRIdrawable *driDrawable = driContext->driDrawablePriv;
__DRIscreen *const dri_screen = brw->screen->driScrnPriv;
if (brw->front_buffer_dirty && _mesa_is_winsys_fbo(ctx->DrawBuffer)) {
if (brw->front_buffer_dirty && ctx->DrawBuffer &&
_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
if (flushFront(dri_screen) && driDrawable &&
driDrawable->loaderPrivate) {

@@ -5,7 +5,7 @@ traces:
- path: behdad-glyphy/glyphy.trace
expectations:
- device: gl-panfrost-t860
checksum: b6cd8d92987530edcfc36a933c9b07f6
checksum: 22bf5262745fd47c5c5eadb93d7cc420
- path: glmark2/desktop:windows=4:effect=blur:blur-radius=5:passes=1:separable=true.trace
expectations:
- device: gl-panfrost-t860
@@ -158,7 +158,7 @@ traces:
- path: glmark2/refract.trace
expectations:
- device: gl-panfrost-t860
checksum: e520a0071fd940be1401aea2bec97709
checksum: 6557deca1a47a7a77723658ea579ac63
- path: glmark2/shading:shading=blinn-phong-inf.trace
expectations:
- device: gl-panfrost-t860
@@ -209,11 +209,11 @@ traces:
- path: gputest/plot3d.trace
expectations:
- device: gl-panfrost-t860
checksum: e73715f3b6a4f1609eaf5432af03714e
checksum: a34223830866a42747db199b04c5e1be
- path: humus/AmbientAperture.trace
expectations:
- device: gl-panfrost-t860
checksum: b0d4a64e0907f817161b2a0e85af7a9a
checksum: e4c0b930ef99f14305e1ade7f1779c09
- path: humus/CelShading.trace
expectations:
- device: gl-panfrost-t860

@@ -157,7 +157,7 @@ panvk_add_wait_event_syncobjs(struct panvk_batch *batch, uint32_t *in_fences, un
/* Nothing to do yet */
break;
case PANVK_EVENT_OP_WAIT:
in_fences[*nr_in_fences++] = op->event->syncobj;
in_fences[(*nr_in_fences)++] = op->event->syncobj;
break;
default:
unreachable("bad panvk_event_op type\n");
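
Note on the `in_fences` change: in C, postfix `++` binds tighter than unary `*`, so `*nr_in_fences++` parses as `*(nr_in_fences++)`. That reads the current count and then advances the local pointer, leaving the caller's counter untouched. The sketch below isolates the difference; `demo_append_buggy` and `demo_append_fixed` are hypothetical helpers written for illustration, not panvk functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy form: the index expression is *(count++). The element is written
 * at the old count, but only the local pointer is incremented, so every
 * call overwrites the same slot and *count never grows. */
void
demo_append_buggy(uint32_t *list, unsigned *count, uint32_t value)
{
   assert(list != NULL && count != NULL);
   list[*count++] = value;
}

/* Fixed form: the parentheses make the increment apply to the counter the
 * pointer refers to, so successive calls fill successive slots. */
void
demo_append_fixed(uint32_t *list, unsigned *count, uint32_t value)
{
   assert(list != NULL && count != NULL);
   list[(*count)++] = value;
}
```

Calling the buggy helper twice on an empty list leaves the count at 0 with only the second value stored in slot 0. The fixed helper leaves the count at 2 with both values in order.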

@@ -305,6 +305,8 @@ __vk_errorv(const void *_obj, VkResult error,
VK_LOG_NO_OBJS(instance), file, line,
"%s (%s)", message, error_str);
}
ralloc_free(message);
} else {
if (object) {
__vk_log(VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT,

@@ -531,7 +531,7 @@ wsi_create_prime_image(const struct wsi_swapchain *chain,
.sType = VK_STRUCTURE_TYPE_WSI_IMAGE_CREATE_INFO_MESA,
.prime_blit_src = true,
};
const VkImageCreateInfo image_info = {
VkImageCreateInfo image_info = {
.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
.pNext = &image_wsi_info,
.flags = 0,
@@ -552,6 +552,10 @@ wsi_create_prime_image(const struct wsi_swapchain *chain,
.pQueueFamilyIndices = pCreateInfo->pQueueFamilyIndices,
.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
};
if (pCreateInfo->flags & VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR) {
image_info.flags |= VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT |
VK_IMAGE_CREATE_EXTENDED_USAGE_BIT_KHR;
}
result = wsi->CreateImage(chain->device, &image_info,
&chain->alloc, &image->image);
if (result != VK_SUCCESS)

@@ -294,14 +294,25 @@ wsi_wl_display_add_drm_format_modifier(struct wsi_wl_display *display,
format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8_UNORM,
true, true);
FALLTHROUGH;
if (format)
wsi_wl_format_add_modifier(format, modifier);
if (srgb_format)
wsi_wl_format_add_modifier(srgb_format, modifier);
srgb_format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8A8_SRGB,
false, true);
format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8A8_UNORM,
false, true);
break;
case DRM_FORMAT_ABGR8888:
srgb_format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8A8_SRGB,
true, true);
true, false);
format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8A8_UNORM,
true, true);
true, false);
break;
case DRM_FORMAT_XRGB8888:
srgb_format = wsi_wl_display_add_vk_format(display, formats,
@@ -310,14 +321,25 @@ wsi_wl_display_add_drm_format_modifier(struct wsi_wl_display *display,
format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8_UNORM,
true, true);
FALLTHROUGH;
if (format)
wsi_wl_format_add_modifier(format, modifier);
if (srgb_format)
wsi_wl_format_add_modifier(srgb_format, modifier);
srgb_format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8A8_SRGB,
false, true);
format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8A8_UNORM,
false, true);
break;
case DRM_FORMAT_ARGB8888:
srgb_format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8A8_SRGB,
true, true);
true, false);
format = wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8A8_UNORM,
true, true);
true, false);
break;
}
@@ -336,11 +358,17 @@ wsi_wl_display_add_wl_shm_format(struct wsi_wl_display *display,
case WL_SHM_FORMAT_XBGR8888:
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8_SRGB,
false, true);
true, true);
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8_UNORM,
true, true);
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8A8_SRGB,
false, true);
FALLTHROUGH;
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8A8_UNORM,
false, true);
break;
case WL_SHM_FORMAT_ABGR8888:
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_R8G8B8A8_SRGB,
@@ -352,11 +380,17 @@ wsi_wl_display_add_wl_shm_format(struct wsi_wl_display *display,
case WL_SHM_FORMAT_XRGB8888:
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8_SRGB,
false, true);
true, true);
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8_UNORM,
true, true);
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8A8_SRGB,
false, true);
FALLTHROUGH;
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8A8_UNORM,
false, true);
break;
case WL_SHM_FORMAT_ARGB8888:
wsi_wl_display_add_vk_format(display, formats,
VK_FORMAT_B8G8R8A8_SRGB,
@@ -427,6 +461,12 @@ wl_shm_format_for_vk_format(VkFormat vk_format, bool alpha)
case VK_FORMAT_B8G8R8A8_UNORM:
case VK_FORMAT_B8G8R8A8_SRGB:
return alpha ? WL_SHM_FORMAT_ARGB8888 : WL_SHM_FORMAT_XRGB8888;
case VK_FORMAT_R8G8B8_UNORM:
case VK_FORMAT_R8G8B8_SRGB:
return WL_SHM_FORMAT_XBGR8888;
case VK_FORMAT_B8G8R8_UNORM:
case VK_FORMAT_B8G8R8_SRGB:
return WL_SHM_FORMAT_XRGB8888;
default:
assert(!"Unsupported Vulkan format");