Commit Graph

186524 Commits

Author SHA1 Message Date
Joshua Ashton
aa5bc3e41f wsi: Implement linux-drm-syncobj-v1
This implements explicit sync with linux-drm-syncobj-v1 for the
Wayland WSI.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-27 17:49:03 +00:00
Joshua Ashton
e4e3436d45 wsi: Add common infrastructure for explicit sync
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-27 17:49:03 +00:00
Joshua Ashton
becb5d5161 wsi: Get timeline semaphore exportable handle types
We need to know this for explicit sync

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-27 17:44:16 +00:00
Joshua Ashton
06c2af994b wsi: Track CPU side present ordering via a serial
We will use this in our heuristics to pick the optimal buffer in AcquireNextImageKHR.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-25 21:00:54 +00:00
Joshua Ashton
d9cbc79941 wsi: Add acquired member to wsi_image
Tracks whether this wsi_image has been acquired by the app

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-25 21:00:54 +00:00
Joshua Ashton
e209b02b97 wsi: Track if timeline semaphores are supported
This will be needed before we expose and use explicit sync.

Even if the host Wayland compositor supports timeline semaphores, in the
case of Venus, etc., the underlying driver may not.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 00:24:26 +00:00
Joshua Ashton
8a098f591b build: Add linux-drm-syncobj-v1 wayland protocol
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 00:24:26 +00:00
Joshua Ashton
754f52e1e1 wsi: Add explicit_sync to wsi_drm_image_params
Allow the WSI frontend to request explicit sync buffers.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 00:24:26 +00:00
Joshua Ashton
00dba3992c wsi: Add explicit_sync to wsi_image_info
Will be used in future for specifying explicit sync for Vulkan WSI when supported.

Additionally cleans up wsi_create_buffer_blit_context, etc.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-22 00:24:26 +00:00
Joshua Ashton
9c8f205131 wsi: Pass wsi_drm_image_params to wsi_configure_prime_image
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-20 17:21:27 +00:00
Joshua Ashton
f17f43b149 wsi: Pass wsi_drm_image_params to wsi_configure_native_image
No need to split this out into function parameters, it's just less clean.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2024-03-20 17:21:26 +00:00
Samuel Pitoiset
be4a6b946a radv: add a workaround for null IBO on GFX6
Based on PAL.

Fixes dEQP-VK.draw.*nulldescriptor_maintenance_5_maintenance6 on GFX6.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28263>
2024-03-20 16:27:58 +00:00
Juan A. Suarez Romero
d87ccf0632 broadcom/ci: add new expected failures
Add more expected failures that should have been included in
74be42d9a4.

Fixes: 74be42d9a4 ("broadcom/ci: add new expected test failures")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28298>
2024-03-20 16:06:35 +00:00
Mike Blumenkrantz
f79557dd38 zink: do io fixup on patch variables too
fixes spec@arb_separate_shader_objects@rendezvous by location (5 stages)

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28296>
2024-03-20 15:09:12 +00:00
Rhys Perry
f88922e816 radv: use dual_color_blend_by_location with Half-Life Alyx
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Ethan Lee <flibitijibibo@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10462
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28269>
2024-03-20 11:40:18 +00:00
Kenneth Graunke
a075b44493 intel/brw: Eliminate top-level FIND_LIVE_CHANNEL & BROADCAST once
brw_fs_opt_eliminate_find_live_channel eliminates FIND_LIVE_CHANNEL
outside of control flow.  None of our optimization passes generate
additional cases of that instruction, so once it's gone, we shouldn't
ever have to run the pass again.  Moving it out of the loop should
save a bit of CPU time.

While we're at it, also clean adjacent BROADCAST instructions that
consume the result of our FIND_LIVE_CHANNEL.  Without this, we have
to perform copy propagation to get the MOV 0 immediate into the
BROADCAST, then algebraic to turn it into a MOV, which enables more
copy propagation...not to mention CSE gets involved.  Since this
FIND_LIVE_CHANNEL + BROADCAST pattern from emit_uniformize() is
really common, and it's trivial to clean up, we can do that.  This
lets the initial copy prop in the loop see MOV instead of BROADCAST.

Zero impact on fossil-db, but less work in the optimization loop.

Together with the previous patches, this cuts compile time in
Borderlands 3 on Alchemist by -1.38539% +/- 0.1632% (n = 24).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 01:04:22 -07:00
Kenneth Graunke
5814534de5 intel/brw: Don't consider UNIFORM_PULL_CONSTANT_LOAD a send-from-GRF
It's a logical opcode which is lowered to a send-from-GRF later.  That
lowering code is responsible for ensuring the sources are set up in a
proper SEND payload.

This was preventing copy propagation of surface handles which started
out as scalars, were splatted out to full-SIMD values with NoMask, then
actually consumed as only component 0 (scalar again), because we thought
that scalar values were not allowed.

fossil-db on Alchemist shows improvements in q2rtx but no other titles:

   Totals:
   Instrs: 161310436 -> 161310152 (-0.00%)
   Cycles: 14370605159 -> 14370601066 (-0.00%)

   Totals from 17 (0.00% of 652298) affected shaders:
   Instrs: 16097 -> 15813 (-1.76%)
   Cycles: 185508 -> 181415 (-2.21%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 01:04:22 -07:00
Kenneth Graunke
ea423aba1b intel/brw: Split out 64-bit lowering from algebraic optimizations
We don't necessarily want to split up MOVs for 64-bit addresses into
2x 32-bit MOVs right away, as this makes things like copy propagating
the whole address around harder.  We should do this late, once, while
still doing other algebraic optimizations earlier.

fossil-db results for Alchemist show tiny improvements:

   Totals:
   Instrs: 161310502 -> 161310436 (-0.00%); split: -0.00%, +0.00%
   Cycles: 14370605606 -> 14370605159 (-0.00%); split: -0.00%, +0.00%

   Totals from 33 (0.01% of 652298) affected shaders:
   Instrs: 15053 -> 14987 (-0.44%); split: -0.64%, +0.20%
   Cycles: 196947 -> 196500 (-0.23%); split: -0.25%, +0.02%

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
2024-03-20 01:04:17 -07:00
Nanley Chery
831703157e iris: Use resource_get_param in resource_get_handle
Refactor iris_resource_get_handle to use iris_resource_get_param to pick
up the fix from the previous patch.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9994
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28258>
2024-03-19 23:12:06 +00:00
Nanley Chery
bf1008ac28 iris: Report the correct modifier for Tile4 images
In iris_resource_get_param, report the Tile4 modifier for Tile4 images
instead of reporting the linear modifier.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28258>
2024-03-19 23:12:06 +00:00
Mark Janes
345c918a76 intel/dev: remove pci revision from shader cache key
The PCI revision was included in the shader cache key because it can
enable platform workarounds.  While some platform workarounds exist in
the compiler, none are dependent on the silicon stepping.

Many platforms differ only in the PCI revision ID, causing needless
duplication in cache entries between platforms.

When a platform ships publicly with stepping-specific compiler
workarounds, pci id must be incorporated into the shader cache key.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28085>
2024-03-19 15:11:19 -07:00
Timur Kristóf
58e3b1f930 aco: Allow passing constant operand to is_overwritten_since.
This is to make it more intuitive and also consistent
with last_writer_idx which does allow constant operands.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28046>
2024-03-19 20:50:12 +00:00
Gert Wollny
d1cac5ed05 zink: acquire - maybe clear timeout after waiting for presentation fence
If the presentation fence was signalled and we still hold
max_acquires or more images, then clear the timeout to avoid
a possible deadlock.

With that we avoid the validation error

  VUID-vkAcquireNextImageKHR-surface-07783

triggered by piglit

   spec@!opengl 1.0@gl-1.0-drawbuffer-modes

and others.

v2: clear timeout only if we have acquired more images than the
    reported max and add a comment explaining why the timeout is
    cleared (Mike).

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28245>
2024-03-19 20:12:52 +00:00
Mary Guillemard
9e133c4000 nouveau: Add support for TERT opcodes in vk_push_print
Those opcodes are a vestige of the old command format.

This implements handling of them and fixes issues when analysing command
buffers that use them.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28277>
2024-03-19 19:56:07 +00:00
Kenneth Graunke
d473004576 intel/fs: Avoid generating useless UNDEFs for every SSA def
Emitting UNDEF is only necessary when the instructions we generate to
produce the NIR def are considered partial writes.  By adding a simple
check (adapted from fs_inst::is_partial_write()), we can avoid creating
loads of unnecessary UNDEFs that we have to clean up later.

Our first dead code elimination pass does get rid of them pretty
quickly, but this should save memory and time during our first
split_virtual_grfs and dead_code_elimination passes.

This generates roughly 30% fewer instructions at the beginning.

Improves compilation time of shaders:
- Rise of the Tomb Raider: -3.51563% +/- 0.103951% (n=7)
- Borderlands 3: -3.64422% +/- 0.300951% (n=7).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28169>
2024-03-19 19:32:18 +00:00
Konstantin Seurer
a6b93c50d0 radv/printf: Use fprintf instead of printf
This allows using destinations other than stdout.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28228>
2024-03-19 19:05:25 +00:00
Konstantin Seurer
d902b6d805 radv: Skip more acceleration structure build markers
We should skip even more stuff when using updates only.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28228>
2024-03-19 19:05:25 +00:00
Caio Oliveira
b58b6d2d32 anv: Enable VK_KHR_shader_quad_control
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 18:41:15 +00:00
Caio Oliveira
b22879e753 intel/brw: Use predicates for quad_vote_any and quad_vote_all when available
Up until Xe2, we can use the predicates ANY4H and ALL4H to achieve the
same result with fewer instructions.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 18:41:15 +00:00
Caio Oliveira
857e62e6ac intel/brw: Implement quad_vote_any and quad_vote_all
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27279>
2024-03-19 18:41:15 +00:00
Ian Romanick
671745b616 intel/fs: Don't allow 0 stride on MOV destination
Outside SIMD1 instructions, a destination stride of zero doesn't make
any sense. When such strides exist, they would be fixed by the FS
generator. Currently the only place that intentionally generates such a
stride is setup_barrier_message_payload_gfx125, and this commit changes
that.

The existence of a zero stride that won't really be a zero stride causes
a variety of problems with other optimization passes. Those passes don't
know that 0 actually means 1, and they make incorrect assumptions about
sizes written, etc.

The assertion helped catch many bugs in some other work in progress that
tries to store convergent values in SIMD8 registers regardless of the
dispatch width. That code would accidentally generate destination
strides of zero.

v2: Check stride differently depending on register file. Suggested by
Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28256>
2024-03-19 18:17:59 +00:00
Danylo Piliaiev
d10b546776 freedreno/replay: Use real queueid for submissions and waits
Otherwise it failed when the expected queueid was not 0.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27123>
2024-03-19 17:56:33 +00:00
Samuel Pitoiset
6f18f39208 zink/ci: enable RADV_PERFTEST=shader_object for polaris10
It's passing in CI now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28273>
2024-03-19 17:33:11 +00:00
Konstantin Seurer
6095b70f85 radv/rt: Use 32-bit offsets for load_sbt_entry
Totals from 82 (18.06% of 454) affected shaders:
MaxWaves: 820 -> 821 (+0.12%)
Instrs: 2765694 -> 2766338 (+0.02%); split: -0.08%, +0.10%
CodeSize: 14751988 -> 14735464 (-0.11%); split: -0.13%, +0.01%
VGPRs: 8464 -> 8448 (-0.19%)
SpillSGPRs: 454 -> 512 (+12.78%)
Latency: 19368679 -> 19344967 (-0.12%); split: -0.21%, +0.09%
InvThroughput: 5354427 -> 5346317 (-0.15%); split: -0.24%, +0.08%
VClause: 100183 -> 100331 (+0.15%); split: -0.02%, +0.17%
SClause: 66584 -> 66590 (+0.01%); split: -0.02%, +0.03%
Copies: 237008 -> 238684 (+0.71%); split: -0.53%, +1.23%
Branches: 113344 -> 113386 (+0.04%); split: -0.00%, +0.04%
PreSGPRs: 6141 -> 6194 (+0.86%)
PreVGPRs: 7916 -> 7880 (-0.45%)

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27725>
2024-03-19 17:03:28 +00:00
Konstantin Seurer
00dec03438 radv: Use radv_buffer_map for parsing IBs
We need matching pointers for annotations to work.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 16:08:14 +00:00
Konstantin Seurer
a78cbc98cc ac: Improve context roll readability
Add new lines to improve visual separation and color registers:
- red = unchanged
- green = changed

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 16:08:14 +00:00
Konstantin Seurer
1d747653d4 radv: Add an IB annotation layer
The layer annotates the command buffers with API
entrypoint names.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 16:08:14 +00:00
Konstantin Seurer
8f0ee3a92b radv: Add support for IB annotations
Wires up ac_parse_ib annotation support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 16:08:14 +00:00
Konstantin Seurer
bf15688fa1 ac/parse_ib: Implement annotations
Annotates the IB dump with driver specified strings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 16:08:13 +00:00
Konstantin Seurer
0f436e0fe1 ac/parse_ib: Replace the parameter list with ac_ib_parser
It's more code but it should be more readable. This also makes adding
optional arguments easier.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 16:08:13 +00:00
Konstantin Seurer
2e4d365104 ac: Annotate context rolls
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27549>
2024-03-19 16:08:13 +00:00
Timur Kristóf
8f3cc3cb29 radv: Use mapped driver locations for determining I/O strides.
This will allow us to more accurately determine the
input and output strides, because the I/O locations mapped
by RADV don't match the locations in NIR.
As a result, ESO will use less LDS.

It also fixes the per-patch output stride of tess control
shaders, because previously we omitted tess factors from them.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28021>
2024-03-19 15:01:19 +00:00
Timur Kristóf
2f1f55cf32 radv: Extract input and output stride info to new functions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28021>
2024-03-19 15:01:19 +00:00
Eric Engestrom
c72bb8de75 r300: mark new fails
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/56480445

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28271>
2024-03-19 14:41:45 +00:00
Echo J
8c92ac3ee3 nvk: Add NVK to the Vulkan device name
Other Mesa Vulkan drivers do the same thing (this helps to identify
the driver better especially with the recent official name import)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28262>
2024-03-19 14:12:31 +00:00
Danylo Piliaiev
432d8bd081 freedreno/devices: Do not write to 8E79 on a750, KGSL has it protected
Writing REG_A7XX_RB_UNKNOWN_8E79 causes:
 adreno-gen7-gmu 3d68000.qcom,gmu: CP | Protected mode error | WRITE | addr=0x08e79 | status=0x00608e79

Fixes: ebde7d5e87
("tu/a7xx: Write even more magic regs to fix rendering issues on Android")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27912>
2024-03-19 13:35:12 +00:00
Daniel Schürmann
9bbb9f1104 aco: use small_vec as Block::edge_vec for predecessors and successors
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 13:06:58 +00:00
Daniel Schürmann
3e58a736e4 aco/util: small_vec few additions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 13:06:58 +00:00
Rhys Perry
5cbd7689be aco/util: add small_vec
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 13:06:58 +00:00
Daniel Schürmann
4564ca313b aco: reorder code and use namespaces in aco_interface.cpp
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 13:06:58 +00:00