Marek Olšák
fbbf029529
radeonsi: enable 16-bit mediump IO for PS outputs only, and VS->PS with env var
...
It has been implemented and works for PS outputs already.
The lowering callback needs 2 variants because we can't access
pipe_screen from it. The callback is rewritten to be more general.
We also need to do nir_clear_mediump_io_flag for any outputs we don't
lower because the mediump flag might prevent optimizations if it's not
cleared.
v2: fix si_nir_optim
Acked-by: Timur Kristóf <timur.kristof@gmail.com > (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
5a7ff54aaa
radeonsi: remove gs_input_verts_per_prim from si_shader_info
...
It can be computed from input_primitive.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
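The note in the commit above, that gs_input_verts_per_prim can be computed from input_primitive, can be sketched as follows. This is a minimal illustration only: the enum and the function name are hypothetical stand-ins, not the types or helpers radeonsi actually uses.

```c
#include <assert.h>

/* Hypothetical primitive enum for illustration; not Mesa's actual type. */
enum gs_input_prim {
   GS_IN_POINTS,
   GS_IN_LINES,
   GS_IN_TRIANGLES,
   GS_IN_LINES_ADJACENCY,
   GS_IN_TRIANGLES_ADJACENCY,
};

/* Number of input vertices a geometry shader receives per input primitive. */
static unsigned gs_input_verts_per_prim(enum gs_input_prim prim)
{
   switch (prim) {
   case GS_IN_POINTS:              return 1;
   case GS_IN_LINES:               return 2;
   case GS_IN_TRIANGLES:           return 3;
   case GS_IN_LINES_ADJACENCY:     return 4;
   case GS_IN_TRIANGLES_ADJACENCY: return 6;
   }
   assert(0 && "unknown GS input primitive");
   return 0;
}
```

Because the value is a pure function of the input primitive, storing it separately in shader info would be redundant.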
Marek Olšák
1a197aa057
radeonsi: remove unused output_type and output_usage from si_shader_info
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
58f12b3c81
radeonsi: don't count outputs with GS streams > 0 for outputs_written_before_ps
...
outputs_written_before_ps is used to determine kill_outputs, which removes
param exports, but non-zero GS streams are xfb-only and not exported.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
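The rule described in the commit above (outputs on GS streams > 0 are xfb-only and should not count toward outputs_written_before_ps) might look roughly like this. The struct and the function signature are hypothetical and only illustrate the masking; they are not the driver's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one GS output: its slot index and GS stream. */
struct gs_output {
   unsigned slot;
   unsigned stream;
};

/* Build a bitmask of output slots that can reach the PS.
 * Outputs on non-zero streams are skipped because they are xfb-only. */
static uint64_t outputs_written_before_ps(const struct gs_output *outs,
                                          size_t count)
{
   uint64_t mask = 0;

   for (size_t i = 0; i < count; i++) {
      assert(outs[i].slot < 64);
      if (outs[i].stream != 0)
         continue;
      mask |= UINT64_C(1) << outs[i].slot;
   }
   return mask;
}
```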
Marek Olšák
0b3b105bde
radeonsi: use si_assign_param_offsets for legacy GS too
...
The result of that function was overwritten by other code, so just remove it.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
cc497fd0e4
radeonsi: move gfx10_shader_ngg.c contents into si_shader.c
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
d3c1c638c4
radeonsi: cull against cull distances in the shader and don't export them
...
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
5b5addd9e9
radeonsi: enable culling against clip/cull distances and clip planes in GS
...
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
cee54211df
radeonsi: reduce the size of 2 fields in si_shader_variant_info
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
45acb5857d
radeonsi: pack clip/cull distance export components
...
This removes unused and no-op clip/cull distance components, though
it's not very common.
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
fec40557d3
radeonsi: use nir_opt_clip_cull_const
...
It eliminates no-op (>= 0) clip/cull distance output components by setting
no_sysval_output = true.
We have to gather clip/cull distances manually to get reduced clip/cull
masks.
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
c743c3dd1a
radeonsi: support 8 non-ClipVertex clip planes instead of 6
...
If there are more than 6 planes without gl_ClipVertex and gl_ClipDistance,
add "gl_ClipVertex = gl_Position;" to support up to 8 planes.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
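For the commit above, a rough sketch of what "gl_ClipVertex = gl_Position;" implies, assuming the clip vertex is then turned into one distance per plane using the usual GL definition (dot product of the clip vertex with the plane coefficients). The types, the MAX_CLIP_PLANES constant and the function names are illustrative assumptions, not radeonsi code.

```c
#include <assert.h>

#define MAX_CLIP_PLANES 8 /* assumed limit, taken from the commit title */

struct vec4 {
   float x, y, z, w;
};

static float dot4(struct vec4 a, struct vec4 b)
{
   return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Compute one clip distance per user clip plane from the clip vertex.
 * A vertex is inside plane i when dist[i] >= 0. */
static void compute_clip_distances(struct vec4 clip_vertex,
                                   const struct vec4 *planes,
                                   unsigned num_planes, float *dist)
{
   assert(num_planes <= MAX_CLIP_PLANES);
   for (unsigned i = 0; i < num_planes; i++)
      dist[i] = dot4(clip_vertex, planes[i]);
}
```

With gl_ClipVertex set to gl_Position, clip_vertex here would simply be the position output.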
Marek Olšák
1b594e6745
radeonsi: gather nr_pos_exports from the final NIR
...
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
2c0eb09e39
radeonsi: simplify old_vs & old_ps checking in si_update_shaders
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Marek Olšák
e73f70e135
radeonsi: add si_shader_variant_info::clip/culldist_mask
...
so that it can be different between shader variants
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529 >
2025-07-12 10:28:20 +00:00
Valentine Burley
391c40f9fc
freedreno/ci: Add ASan jobs on a618
...
Introduce nightly Address Sanitizer jobs for GLES and Vulkan on a618.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35053 >
2025-07-12 09:21:03 +00:00
Valentine Burley
08152633fb
ci/lava: Add arm64 ASan job templates
...
Signed-off-by: Valentine Burley <valentine.burley@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35053 >
2025-07-12 09:21:03 +00:00
Valentine Burley
201ac3bf49
turnip/ci: Skip Vulkan Video tests
...
Vulkan Video isn't supported, since video isn't part of the GPU.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35053 >
2025-07-12 09:21:02 +00:00
Georg Lehmann
92d433c54a
aco: vectorize conversions from 8bit to 16bit
...
Massively helps emulated fp8 performance.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854 >
2025-07-12 08:39:15 +00:00
Georg Lehmann
7fece5592c
aco: vectorize 16bit extracts
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854 >
2025-07-12 08:39:14 +00:00
Georg Lehmann
a045e9a624
ac/nir: lower uniform extract_i8/u8 to 32bit
...
To prevent vectorizing this later.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854 >
2025-07-12 08:39:13 +00:00
Georg Lehmann
2cc3e1876c
ac/llvm: support vec2 extract
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854 >
2025-07-12 08:39:13 +00:00
Julia Zhang
d34b069e9b
radeonsi: small fixes of radeonsi renderstage
...
Remove redundant lines in si_perfetto.cpp.
Convert offset_B of buffer gpu_address to uint64_t ptr offset when
si_buffer_map is being called to read timestamp.
Destroy sctx->trace by calling u_trace_fini in si_utrace_fini, which
will be called in si_destroy_context.
Signed-off-by: Julia Zhang <julia.zhang@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36066 >
2025-07-12 07:28:46 +00:00
Marek Olšák
f8918ed6c6
radv: stop using LLVM LDS linking logic
...
Not needed.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:06 +00:00
Marek Olšák
44dd39d121
radv: pack clip and cull distance outputs for both legacy and NGG pipelines
...
This increases primitive throughput when packing reduces the number
of pos exports due to holes in clip and cull distance arrays that could be
punched out by nir_opt_clip_cull_const. This applies to all chips.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:06 +00:00
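To illustrate why packing in the commit above can reduce the number of position exports, here is a small sketch. It assumes that each export carries 4 components and that up to 8 clip/cull distance components exist; the function names are hypothetical and this is not the RADV implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Copy only the enabled components to the front of 'out'.
 * Returns the number of components kept. */
static unsigned pack_clip_cull(const float in[8], uint8_t enabled_mask,
                               float out[8])
{
   unsigned n = 0;

   for (unsigned i = 0; i < 8; i++) {
      if (enabled_mask & (1u << i))
         out[n++] = in[i];
   }
   return n;
}

/* Exports needed for n tightly packed components (assumed vec4 exports). */
static unsigned exports_with_packing(unsigned n)
{
   assert(n <= 8);
   return (n + 3) / 4;
}

/* Exports needed when components stay in place, holes included. */
static unsigned exports_without_packing(uint8_t enabled_mask)
{
   unsigned highest = 0;

   for (unsigned i = 0; i < 8; i++) {
      if (enabled_mask & (1u << i))
         highest = i + 1;
   }
   return (highest + 3) / 4;
}
```

For example, if only components 0 and 5 remain after nir_opt_clip_cull_const punches holes, the unpacked layout still spans two exports while the packed layout fits in one.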
Marek Olšák
2751d488ce
radv: enable nir_opt_clip_cull_const for GS too
...
The pass also supports GS now.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:05 +00:00
Marek Olšák
bdcfe15457
radv: don't export cull distances if the shader culls against them
...
This increases primitive throughput for all hw with NGG if the shader
culls and the removal of cull distances reduces the number of position
exports.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:05 +00:00
Marek Olšák
0cce0505cc
radv: compute the number of position outputs after compilation
...
It will be different between NGG and legacy because NGG with culling
will not export cull distances.
The number of position exports could also be gathered from final NIR
to reduce logic duplication.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:05 +00:00
Marek Olšák
21646b0124
radv: don't include position exports in pipeline executable stats
...
It will be different between NGG and legacy because NGG with culling
will not export cull distances.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:04 +00:00
Marek Olšák
88a1c1f881
radv: enable NGG culling for GS
...
This is very useful for increasing raw primitive throughput for GS
(mostly just RDNA 2), increasing raw primitive throughput with clip
and cull distance outputs when they actually cull anything (RDNA 1-4),
and reducing attribute store bandwidth usage (RDNA 3-4).
It will also replace fixed-func culling against cull distances when
culling in the shader is enabled, which will increase primitive throughput
even further.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:04 +00:00
Marek Olšák
ae4d539540
radv: rework radv_link_shaders_info so as not to be called in a loop
...
It receives all shaders and decides how to link them.
When culling is enabled for GS, we will need ES, GS, and FS in this
function at the same time.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:03 +00:00
Marek Olšák
b97c4bfd58
radv: enable W/front/back face NGG culling with multiple viewports
...
This is supported.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:03 +00:00
Marek Olšák
89e1ec92c5
radv: cull against clip and cull distances in the shader
...
Clip and cull distance outputs decrease primitive throughput, so culling
against them in the shader has even more benefit than other culling
options.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:03 +00:00
Marek Olšák
ae78e8d198
ac/nir: handle VARYING_SLOT_VARn_16BIT the same as other slots
...
They are the same as regular VARn.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:02 +00:00
Marek Olšák
762fdf8236
ac/nir: fix mediump XFB
...
The previous code was completely wrong and untested. This is tested.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:02 +00:00
Marek Olšák
56f80479fc
ac/nir: remove unnecessary 16-bit handling from pre-rast GS and XFB loads/stores
...
All callers always pass 32 bits in there.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:02 +00:00
Marek Olšák
65972f2301
ac/nir: return GSVS emit sizes from legacy GS lowering and simplify shader info
...
This simplifies shader info in drivers by returning GSVS emit sizes from
ac_nir_lower_legacy_gs. The pass knows the sizes, so drivers shouldn't
have to determine them independently.
This also makes the values more accurate because both drivers were
computing the GSVS emit sizes inaccurately and had redundant fields
in shader info. RADV had a lot of redundancy there.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:02 +00:00
Marek Olšák
c1d3108855
radv: call radv_get_legacy_gs_info after ac_nir_lower_legacy_gs
...
The pass will determine the GSVS ring size, so radv_get_legacy_gs_info
must be called after that.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:01 +00:00
Marek Olšák
76ce37058d
radv: set the maximum possible workgroup size for legacy GS before linking
...
The optimal workgroup size will be set after lowering.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:00 +00:00
Marek Olšák
d674e97d5c
radv: use shared ac_legacy_gs_compute_subgroup_info
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:20:00 +00:00
Marek Olšák
8a1e357f71
radv: use shared ac_ngg_compute_subgroup_info
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12496
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473 >
2025-07-12 05:19:59 +00:00
Connor Abbott
a3a53b7cee
tu: Implement VK_VALVE_fragment_density_map_layered
...
In order to implement the extension we have to override the last
pre-rasterization shader to inject "gl_ViewportIndex = gl_Layer" at the
end, because there is no layered rendering equivalent to the
VIEWPORTINDEXINCR bit that adds gl_ViewIndex to gl_ViewportIndex in HW.
We also have to deal with the case where layered rendering is enabled
but the bit isn't set, in which case patchpoints that depend on the view
will see num_views = 1 but the patchpoint is for a higher view (aka
layer). This requires changing all of the patchpoints to handle this
case. Finally we have to change a number of cases which needed the
number of FDM layers to stop using num_views directly from the
renderpass and take into account whether per-layer rendering is enabled.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35594 >
2025-07-11 22:05:20 +00:00
Connor Abbott
5a653d8dd4
tu: Split out viewport faking from per-view viewports
...
For VK_VALVE_fragment_density_map_layered, we need to split out whether
to enable per-view viewports and whether viewport 0 should be splatted
to all viewports. We also have to split this out for
VK_QCOM_multiview_per_view_viewport, where the splatting needs to be
disabled. To avoid conflicts between them, plumb through
"fake_single_viewport" which will be needed for the former extension and
needs to be modified for the latter extension.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35594 >
2025-07-11 22:05:18 +00:00
Connor Abbott
c65017f746
vk/runtime: Handle VK_PIPELINE_CREATE_2_PER_LAYER_FRAGMENT_DENSITY_BIT_VALVE
...
This flag must match in pre-rasterization and fragment shader state.
Pass it through in both.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35594 >
2025-07-11 22:05:18 +00:00
Roland Scheidegger
39ccb1ddac
llvmpipe: Improve perspective correction with centroid/sample interpolation
...
When doing perspective-correct interpolation with centroid/sample, we should
do the perspective correction with respect to the centroid/sample position. The
precalculated 1/w value, however, is always relative to the pixel center (when
we calculate it, we don't yet know what kind of interpolation we're going to use).
Refactor things a bit as well to avoid code duplication.
Reviewed-by: Brian Paul <brian.paul@broadcom.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36079 >
2025-07-11 21:48:09 +00:00
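The issue described in the llvmpipe commit above can be shown with a small numeric sketch. It assumes attributes are interpolated as a/w and 1/w plane equations and then divided; the struct and function names are hypothetical and this is not llvmpipe's actual code.

```c
#include <assert.h>

/* Hypothetical plane equation: value(x, y) = a0 + dadx * x + dady * y. */
struct plane_eq {
   float a0, dadx, dady;
};

static float plane_eval(const struct plane_eq *p, float x, float y)
{
   return p->a0 + p->dadx * x + p->dady * y;
}

/* Perspective-correct value: (a/w evaluated at (ax, ay)) divided by
 * (1/w evaluated at (wx, wy)). For a correct result both positions must
 * be the same centroid/sample position. */
static float interp_persp(const struct plane_eq *attr_over_w,
                          const struct plane_eq *one_over_w,
                          float ax, float ay, float wx, float wy)
{
   float oow = plane_eval(one_over_w, wx, wy);

   assert(oow != 0.0f);
   return plane_eval(attr_over_w, ax, ay) / oow;
}
```

Passing the sample position for the attribute but the pixel center for 1/w reproduces the mismatch the commit addresses: the two calls return different values whenever 1/w varies across the pixel.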
Olivia Lee
483489ed1f
bin/people.csv: update my name/email
...
Missed this in 63557a03df.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36084 >
2025-07-11 12:39:33 -07:00
Pohsiang (John) Hsu
6033357635
mediafoundation: don't send METransformNeedInput when in Flush/Drain
...
Reviewed-by: Rohit Athavale <rathavale@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36082 >
2025-07-11 18:47:46 +00:00
Sil Vilerino
9c452f8140
mediafoundation: Fix interop without copy fallback from DX11 to DX12
...
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com >
Reviewed-by: Rohit Athavale <rathavale@microsoft.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13446
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36082 >
2025-07-11 18:47:46 +00:00
Pohsiang (John) Hsu
6d1cb645d8
mediafoundation: fix build after updating sdk to 26100.4188
...
Reviewed-by: Sil Vilerino <sivileri@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36082 >
2025-07-11 18:47:46 +00:00
Connor Abbott
6579b378ca
tu: Add debug flag to force disable FDM
...
Useful to check if a bug is caused by FDM.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35837 >
2025-07-11 17:48:04 +00:00