Mary Guillemard
250988e963
pan/clc: Build for v12
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:07 +02:00
Mary Guillemard
2210eb873a
pan/lib: Build for v13
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:07 +02:00
Mary Guillemard
9814f2d553
pan/lib: Build for v12
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:07 +02:00
Mary Guillemard
49417e6c86
pan/genxml: Build libpanfrost_decode for v13
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:06 +02:00
Mary Guillemard
811525b543
pan/genxml: Build libpanfrost_decode for v12
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:06 +02:00
Mary Guillemard
ece01443e1
pan/genxml: Add v13 definition
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:06 +02:00
Mary Guillemard
b6d5e01120
pan/genxml: Add v12 definition
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:06 +02:00
Mary Guillemard
079426dd62
pan/genxml: Rename UMIN32 opcode to COMPARE_SELECT32
...
This is the official name, let's match with newer generation too.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:06 +02:00
Mary Guillemard
2d7c402645
pan/bi: Allow no_psiz variant with IDVS2
...
This can be supported for IDVS2 and that will reduce arch diffs on panvk
and the Gallium driver.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:06 +02:00
Mary Guillemard
0f56b59cac
pan/bi: Lower IADD.v4s8 in algebraic on v11+
...
We lowered ISUB.v4s8 but forgot about IADD.v4s8.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032 >
2025-04-15 13:36:06 +02:00
Erik Faye-Lund
0eb0fd64aa
docs/panfrost: use anonymous hyperlinks
...
These aren't supposed to be referred to from elsewhere, so let's use
anonymous links here instead.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34500 >
2025-04-15 11:15:30 +00:00
Erik Faye-Lund
65b7d2e865
panvk: claim official conformance on v10
...
It's official, PanVK is Vulkan 1.1 conformant on v10. Let's make this
clear.
Backport-to: 25.0
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34500 >
2025-04-15 11:15:30 +00:00
Loïc Molinari
9205212d2e
mesa: Add CPU traces
...
A few function scope traces are added to texture management entry
points, shader compilation, draw and 2D rect copy to give better
context to driver traces.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com >
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com >
Acked-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34385 >
2025-04-15 10:37:39 +00:00
Loïc Molinari
2bf9b9b73d
docs: Add Panfrost to the list of drivers with CPU traces
...
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com >
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com >
Acked-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34385 >
2025-04-15 10:37:39 +00:00
Loïc Molinari
bb63d7cfee
pan/kmod: Add drmIoctl() wrapper pan_kmod_ioctl() with CPU trace
...
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com >
Co-authored-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34385 >
2025-04-15 10:37:39 +00:00
Loïc Molinari
5cd89d48ee
panfrost: Add CPU traces
...
A few function scope traces are added to instrument blits, clears,
flushes, BOs and resources handling, shader compilation and cache,
draws, compute shader job emissions and AFBC packing.
Give the panfrost_flush_all_batches() call from panfrost_flush() a
more specific reason ("Gallium flush") to improve traces.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com >
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com >
Acked-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34385 >
2025-04-15 10:37:39 +00:00
Loïc Molinari
a7727f692f
perfetto: Let MESA_TRACE_FUNC() take printf-like format arguments
...
This can be useful to track different values like buffer sizes, ioctl
ops, etc.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com >
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com >
Acked-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34385 >
2025-04-15 10:37:39 +00:00
Collabora's Gfx CI Team
b1af5780d1
Uprev ANGLE to a3f2545f6bb3
...
3818d37d5e...a3f2545f6b
Also disable -Werror, because we're not necessarily using the same
toolchain or dependencies as the upstream builds.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34515 >
2025-04-15 10:06:53 +00:00
Collabora's Gfx CI Team
d5f4733702
Uprev Piglit to 0ecdebb0f592
...
ebdf60e0d4...0ecdebb0f5
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34515 >
2025-04-15 10:06:53 +00:00
Erik Faye-Lund
d4797b8ab7
panvk: enable KHR_spirv_1_4 on v10+
...
The previous fix seems to be all that was needed to enable this, so
let's flip the switch.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34514 >
2025-04-15 09:15:29 +00:00
Erik Faye-Lund
e77a815299
panvk: set shared_addr_format
...
We need to set this, otherwise we end up failing tests.
Fixes: 4e111c259c ("panvk: Lower shared memory")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34514 >
2025-04-15 09:15:29 +00:00
Alyssa Rosenzweig
d31db877e2
util/simple_mtx: fix duplicate definition
...
botched #ifdef in the cited commit caused simple_mtx_t to be defined
twice in certain cases, which broke the docs build.
Fixes: cb31b5a958 ("clc,libcl: Clean up CL includes")
Reported-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Reviewed-by: Marek Olšák <maraeo@gmail.com >
Tested-by: Vinson Lee <vlee@freedesktop.org >
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34504 >
2025-04-15 08:30:40 +00:00
Valentine Burley
75214c599c
zink/ci: Work around recent OOM issues in zink-anv-tgl
...
Lower the concurrency for the zink-anv-tgl job to avoid the out-of-memory
issues seen recently.
Remove the flakes that were added as the previous workaround.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34505 >
2025-04-15 06:51:09 +00:00
Samuel Pitoiset
e86e0fc525
radv: allocate the SPM BO in GTT for faster readback
...
Reading VRAM from CPU is very slow.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34467 >
2025-04-15 06:30:38 +00:00
Job Noorman
af8105d085
ir3/ra: ignore phis handled by shared RA
...
If shared RA is used, it may have handled some phis. These are already
ignored by regular RA in handle_phi but were used before that in
potentially dangerous ways. More specifically, the interval of such phis
was accessed which may cause an out-of-bounds read since it was never
created. Fix this by skipping such phis earlier.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Fixes: c6a932d4b3 ("ir3/ra: handle phis with preferred regs first")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34503 >
2025-04-15 06:04:04 +00:00
Job Noorman
d8033ba173
ir3/ra: add helper for getting a dst interval
...
There have been multiple issues related to accessing intervals through
invalid register names. This usually results in a (difficult to
diagnose) out-of-bounds access. Wrap all the interval accesses in a
helper where we can assert that the name is in-bounds.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34503 >
2025-04-15 06:04:04 +00:00
Saroj Kumar
384bf8e58e
radeonsi: Move buffer descriptor slot to the beginning
...
Move the buffer descriptor slot to index 0 in 16 dword
image+sampler slot in si_descriptors.c
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491 >
2025-04-14 22:44:14 +00:00
Marek Olšák
dc70e1c198
radeonsi: determine VM_ALWAYS_VALID accurately
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491 >
2025-04-14 22:44:14 +00:00
Marek Olšák
480c8addd8
winsys/amdgpu: don't add VM_ALWAYS_VALID buffers into the BO list
...
They shouldn't be there.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491 >
2025-04-14 22:44:13 +00:00
Marek Olšák
7f7d6deb18
radeonsi: add ACO-specific main shader parts
...
We can't have merged shaders where the first part is compiled using ACO
and the second part is compiled using LLVM.
Add ACO-specific main shader parts to fix that.
This happens when ACO is enabled for gfx12 streamout where GS can be paired
with a previous shader compiled by LLVM.
Fixes: 8ba718fb7d - radeonsi/gfx12: use ACO for streamout because it's faster
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491 >
2025-04-14 22:44:13 +00:00
Marek Olšák
4865ac57cc
radeonsi: make si_shader_selector::main_shader_part_* an iterable union
...
for the next commit
Fixes: 8ba718fb7d - radeonsi/gfx12: use ACO for streamout because it's faster
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491 >
2025-04-14 22:44:13 +00:00
Marek Olšák
1adf969318
radeonsi/ci: add gfx12 failures and flakes
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491 >
2025-04-14 22:44:13 +00:00
Jose Maria Casanova Crespo
0bcb82048c
v3dv: avoid TFU reading unmapped pages beyond the end of the buffers
...
TFU units is doing a readahead of 64 bytes. This is causing invalid read
MMU errors that can be observed at the nightly full Vulkan runs on
Broadcom devices.
04:13:59.969: [ 85.623205] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x4869000, pte invalid
04:14:05.408: [ 91.019321] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x5209000, pte invalid
04:14:05.413: [ 91.031662] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x7521000, pte invalid
Although the log reports the TLB the real culprit is the TFU. A fix
to the kernel was submitted to fix AXI ID on V3D 4.2 and 7.1
So doing an over-allocation of 64-bytes at v3dv_AllocateMemory is
the simplest method to make these MMU errors itp disapear.
Running ./deqp-vk for an hour, we can see that ~%40 of allocations
would need an extra page (4096 bytes) to accomodate this 64 bytes
padding.
Fixes: ca330f7f04 ("v3dv: implement VK_EXT_memory_budget")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34475 >
2025-04-15 00:17:11 +02:00
Caio Oliveira
fafdd24285
intel/executor: Update bfloat example
...
Elaborate on the packed/unpack restrictions, use ADD(x, 0.0f)
as a workaround for F->BF conversion.
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506 >
2025-04-14 18:23:43 +00:00
Caio Oliveira
fbe5d559bd
brw: Update EU validation to allow packed BF mixed with packed F
...
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506 >
2025-04-14 18:23:43 +00:00
Caio Oliveira
d1dd088ede
brw: Allow DPAS with BF on Gfx125
...
MTL doesn't support, but both ACM and ARL-H do.
Fixes: e384ccde28 ("brw: Expand EU validation for DPAS")
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506 >
2025-04-14 18:23:43 +00:00
Caio Oliveira
050acb9def
intel: Disable has_bfloat16 for MTL
...
Not supported. Some operations *do* work, but proper support
was removed since it also doesn't support DPAS.
Fixes: 9916cc1050 ("brw: Add BRW_TYPE_BF for bfloat16")
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506 >
2025-04-14 18:23:43 +00:00
Caio Oliveira
adfab666a4
intel: Add intel_device_info::has_systolic
...
Gfx125+ has systolic, with exception for MTL and some ARL
variants. Update code and tests to use it.
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506 >
2025-04-14 18:23:43 +00:00
Mike Blumenkrantz
bf5273dd38
ci: update VVL to current week
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651 >
2025-04-14 17:51:05 +00:00
Mike Blumenkrantz
0b7611824a
zink: use implicit stride in ntv for temp vars
...
APPARENTLY explicit stride is illegal for temp vars because they should
just be using the element stride implicitly
this makes total sense and is very obvious
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651 >
2025-04-14 17:51:05 +00:00
Mike Blumenkrantz
b4e3535650
zink: stop setting ArrayStride on image arrays
...
this is illegal
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651 >
2025-04-14 17:51:05 +00:00
Mike Blumenkrantz
1c0de360bc
zink: don't set shared block stride without KHR_workgroup_memory_explicit_layout
...
this is illegal
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651 >
2025-04-14 17:51:05 +00:00
Connor Abbott
74531094cb
ir3: Vectorize shared memory loads/stores
...
This drastically helps a Path of Exile 2 compute dispatch, going from
4.6ms to 2.7ms.
Totals from 969 (0.59% of 164134) affected shaders:
MaxWaves: 9586 -> 9560 (-0.27%); split: +0.02%, -0.29%
Instrs: 1252433 -> 1234724 (-1.41%); split: -1.47%, +0.05%
CodeSize: 2237424 -> 2195238 (-1.89%); split: -1.91%, +0.03%
NOPs: 362213 -> 360913 (-0.36%); split: -0.92%, +0.56%
MOVs: 58879 -> 59591 (+1.21%); split: -0.62%, +1.83%
Full: 15817 -> 15867 (+0.32%); split: -0.04%, +0.36%
(ss): 35671 -> 35434 (-0.66%); split: -1.80%, +1.14%
(sy): 23953 -> 23964 (+0.05%); split: -0.38%, +0.43%
(ss)-stall: 127807 -> 124930 (-2.25%); split: -3.43%, +1.18%
(sy)-stall: 583947 -> 585886 (+0.33%); split: -0.61%, +0.94%
Early-preamble: 317 -> 316 (-0.32%)
Cat0: 394577 -> 393316 (-0.32%); split: -0.85%, +0.53%
Cat1: 100335 -> 101057 (+0.72%); split: -0.36%, +1.08%
Cat2: 415880 -> 415835 (-0.01%); split: -0.05%, +0.04%
Cat3: 187928 -> 187929 (+0.00%); split: -0.00%, +0.00%
Cat5: 19143 -> 19148 (+0.03%)
Cat6: 69630 -> 52523 (-24.57%)
Cat7: 47160 -> 47136 (-0.05%); split: -0.56%, +0.51%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34441 >
2025-04-14 17:22:47 +00:00
Connor Abbott
9977c4d682
ir3: Move load/store vectorization to finalize
...
Some frontends such as rusticl and turnip call the optimization loop
before choosing the shared memory layout, in order to be able to delete
variables that turn out to be unused. This means that we can't vectorize
them until after the first run of the optimization loop. Other drivers
also seem to do something similar.
This also has the benefit that by delaying vectorization of UBOs until
after they are lowered from derefs, we don't insert casts which remove
the ability of nir_lower_explicit_io to insert a range, which was
blocking the pushing of vectorized indirect UBO loads. This has a
significant positive impact on fossil-db:
Only doing vectorization later exposes a bug where vectorization could
change the bitsize after we used it to determine which descriptor to
use. It happened to work before because vectorization was usually done
early. To fix it, move adjusting the descriptor to a new pass that
happens after finalizing.
Totals:
MaxWaves: 2249140 -> 2281068 (+1.42%); split: +1.43%, -0.01%
Instrs: 49624230 -> 49143117 (-0.97%); split: -1.14%, +0.17%
CodeSize: 103796862 -> 104143744 (+0.33%); split: -0.98%, +1.31%
NOPs: 8489860 -> 8512218 (+0.26%); split: -1.55%, +1.81%
MOVs: 1531650 -> 1574911 (+2.82%); split: -1.37%, +4.20%
Full: 1814334 -> 1748906 (-3.61%); split: -3.64%, +0.03%
(ss): 1155395 -> 1128249 (-2.35%); split: -3.48%, +1.13%
(sy): 608650 -> 567972 (-6.68%); split: -7.32%, +0.64%
(ss)-stall: 4352550 -> 4340473 (-0.28%); split: -2.08%, +1.80%
(sy)-stall: 17852259 -> 16943647 (-5.09%); split: -6.25%, +1.16%
STPs: 24568 -> 24215 (-1.44%)
LDPs: 37799 -> 37468 (-0.88%)
Early-preamble: 115698 -> 113694 (-1.73%); split: +0.17%, -1.90%
Cat0: 9345228 -> 9367782 (+0.24%); split: -1.41%, +1.65%
Cat1: 2445265 -> 2549122 (+4.25%); split: -0.81%, +5.06%
Cat2: 18704736 -> 18377519 (-1.75%); split: -1.76%, +0.01%
Cat3: 14210303 -> 14130558 (-0.56%); split: -0.56%, +0.00%
Cat4: 1346895 -> 1346462 (-0.03%); split: -0.03%, +0.00%
Cat5: 1420418 -> 1420417 (-0.00%); split: -0.07%, +0.07%
Cat6: 745590 -> 549358 (-26.32%); split: -26.66%, +0.34%
Cat7: 1405795 -> 1401899 (-0.28%); split: -0.96%, +0.68%
Totals from 79089 (48.19% of 164134) affected shaders:
MaxWaves: 947648 -> 979576 (+3.37%); split: +3.40%, -0.03%
Instrs: 38664140 -> 38183027 (-1.24%); split: -1.47%, +0.22%
CodeSize: 80179110 -> 80525992 (+0.43%); split: -1.27%, +1.70%
NOPs: 6880907 -> 6903265 (+0.32%); split: -1.91%, +2.23%
MOVs: 1183855 -> 1227116 (+3.65%); split: -1.78%, +5.43%
Full: 1107056 -> 1041628 (-5.91%); split: -5.96%, +0.05%
(ss): 939342 -> 912196 (-2.89%); split: -4.28%, +1.39%
(sy): 457959 -> 417281 (-8.88%); split: -9.73%, +0.85%
(ss)-stall: 3664495 -> 3652418 (-0.33%); split: -2.47%, +2.14%
(sy)-stall: 12266805 -> 11358193 (-7.41%); split: -9.10%, +1.69%
STPs: 7494 -> 7141 (-4.71%)
LDPs: 7050 -> 6719 (-4.70%)
Early-preamble: 46339 -> 44335 (-4.32%); split: +0.43%, -4.75%
Cat0: 7548630 -> 7571184 (+0.30%); split: -1.75%, +2.05%
Cat1: 1823872 -> 1927729 (+5.69%); split: -1.09%, +6.78%
Cat2: 14767716 -> 14440499 (-2.22%); split: -2.22%, +0.01%
Cat3: 10630582 -> 10550837 (-0.75%); split: -0.75%, +0.00%
Cat4: 1150090 -> 1149657 (-0.04%); split: -0.04%, +0.00%
Cat5: 1068913 -> 1068912 (-0.00%); split: -0.09%, +0.09%
Cat6: 554910 -> 358678 (-35.36%); split: -35.82%, +0.45%
Cat7: 1119427 -> 1115531 (-0.35%); split: -1.20%, +0.86%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34441 >
2025-04-14 17:22:46 +00:00
Connor Abbott
2f93137308
nir/opt_preamble: Handle load_global_ir3
...
fossil-db results with turnip:
Totals from 994 (0.60% of 165023) affected shaders:
MaxWaves: 10720 -> 11528 (+7.54%); split: +7.57%, -0.04%
Instrs: 1032004 -> 972314 (-5.78%); split: -5.99%, +0.21%
CodeSize: 1847536 -> 1942472 (+5.14%); split: -0.11%, +5.25%
NOPs: 261089 -> 233279 (-10.65%); split: -10.89%, +0.23%
MOVs: 57217 -> 51434 (-10.11%); split: -14.11%, +4.00%
Full: 16412 -> 14647 (-10.75%); split: -10.96%, +0.21%
(ss): 23330 -> 25594 (+9.70%); split: -5.51%, +15.21%
(sy): 17803 -> 15711 (-11.75%); split: -11.93%, +0.18%
(ss)-stall: 96387 -> 107976 (+12.02%); split: -5.14%, +17.17%
(sy)-stall: 952952 -> 765754 (-19.64%); split: -19.84%, +0.19%
STPs: 494 -> 327 (-33.81%)
LDPs: 1447 -> 1163 (-19.63%)
Early-preamble: 668 -> 22 (-96.71%)
Cat0: 280935 -> 251779 (-10.38%); split: -10.60%, +0.22%
Cat1: 93400 -> 84766 (-9.24%); split: -11.79%, +2.55%
Cat2: 343880 -> 337270 (-1.92%); split: -3.20%, +1.28%
Cat3: 189311 -> 180918 (-4.43%)
Cat4: 21008 -> 19920 (-5.18%)
Cat5: 17788 -> 17783 (-0.03%)
Cat6: 45786 -> 39531 (-13.66%)
Cat7: 39896 -> 40347 (+1.13%); split: -0.43%, +1.56%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34483 >
2025-04-14 16:53:34 +00:00
Connor Abbott
ec780eb0e7
ir3: Pass through access flags when lowering global accesses
...
This will let us do optimizations such as moving loads to a preamble.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34483 >
2025-04-14 16:53:34 +00:00
Boris Brezillon
b7ff9dddd4
pan/earlyzs: Fix the read-only ZS optimization
...
Read-only ZS optimization can only happen if the ZS tile buffer is not
written, which can only be known when the fixed-function settings is
set.
Change pan_earlyzs_get() to take an enum instead of a boolean and
differentiate ZS-read and ZS-read-with-readonly-optimization-allowed.
Fixes: 25a993731087 ("pan/earlyzs: Support the shader ZS read-only case and its optimization on v10+")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34480 >
2025-04-14 15:20:06 +00:00
Eric R. Smith
69a6db4b2b
panfrost: fix transaction elimination crc valid calculation
...
The setting of the clean_pixel_write_enable flag in pan_prepare_rt
was not consistent with the crc valid calculations in pan_emit_fbd.
This caused the crc_valid flag to not be accurate, causing transaction
elimination to fail.
Fixes: eac8f1d460 ("Revert "panfrost: Disable CRC by default"")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34408 >
2025-04-14 14:56:35 +00:00
Adam Jackson
c4b305079d
meson: Simplify the power8 optimization logic
...
If it compiles, it works. And there's not a particularly good reason to
disable it, so don't let people disable it.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Acked-by: Dylan Baker <dylan.c.baker@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34239 >
2025-04-14 14:12:30 +00:00
Maíra Canal
3122df666e
broadcom/simulator: Fix Indirect CSD jobs for V3D 7.1.6+
...
Signed-off-by: Maíra Canal <mcanal@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34465 >
2025-04-14 12:13:30 +00:00