ntq_setup_fs_inputs and ntq_setup_gs_inputs sort the inputs according to
the driver location. This input array is then used to calculate the VPM
offset for the outputs in the previous stage. However, it wasn’t taking
into account variables that are packed into a single varying slot. In
that case they would have the same driver_location and are
distinguished by location_frac.
This patch makes it additionally sort by location_frac when the driver
locations are equal. This can happen when the compiler packs varyings
that are sized less than vec4. Without this fix, when the VPM is used to
transmit data free-form between the stages (such as VS->GS) then it
would end up writing to inconsistent locations.
Fixes dEQP tests such as:
dEQP-GLES31.functional.primitive_bounding_box.lines.global_state.
vertex_geometry_fragment.default_framebuffer_bbox_equal
Fixes: 5d578c27ce ("v3d: add initial compiler plumbing for geometry shaders")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5787>
(cherry picked from commit deefebc55b)
If we clear depth only texture via glClearTex(Sub)Image it may cause:
../src/intel/blorp/blorp_genX_exec.h:1554: blorp_emit_surface_states: Assertion `params->depth.enabled || params->stencil.enabled' failed.
due to clear_depth_stencil calling blorp_clear_depth_stencil when
depth is already fast-cleared and there is no stencil.
Fixes piglit test: arb_clear_texture-depth
Fixes: 51638cf18a
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5770>
(cherry picked from commit 77844690be)
The offset for the VPM write for storing outputs from the geometry
shader isn’t necessarily uniform across all the lanes. This can happen
if some of the lanes don’t emit some of the vertices. In that case the
offset for the subsequent vertices will be different in each lane. In
that case we need to use the stvpmd instruction instead of stvpmv
because it will scatter the values out.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3150
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5621>
(cherry picked from commit 3b1c511b09)
Previously we would match the start of the compatible string, in
a couple of cases, in order to match compatible strings like
"qcom,adreno-630.2". But these cases would always list a more
generic compatible (ie. "qcom,adreno") as a later choice. So if
we parse all the compatible strings, we can do a more precise
exact match.
This avoids us accidentially matching on "qcom,adreno-smmu" and
the hilarity that ensues.
Fixes: 5a13507164 ("freedreno/perfcntrs: add fdperf")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5762>
(cherry picked from commit 385d036f58)
This was causing vkAcquireNextImageKHR to not signal the fences and
semaphores. In the case where the semaphore was brand new, this could
cause an unsignalled syncobj to be passed into execbuffer2 which it will
reject with -EINVAL leading to VK_ERROR_DEVICE_LOST. Thanks to Henrik
Rydgård who works on the PPSSPP project for helping me figure this out.
Fixes: ca3cfbf6f1 "vk: Add an initial implementation of the actual..."
Fixes: 778b51f491 "vulkan/wsi: Add a hooks for signaling semaphores..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5672>
(cherry picked from commit b0bbb62325)
Some Piglit tests (rightfully) fail because of min >= max when exposed
to perf counters that do not explicitly define their max value.
Failing tests:
spec/amd_performance_monitor/api/test_counter_info
spec/amd_performance_monitor/vc4/test_counter_info
u32/u64 changes are no-ops.
Fixes: 4cd1cfb983 ("st/mesa: implement GL_AMD_performance_monitor")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5473>
(cherry picked from commit 2f4a112ec4)
Dot product is multiplication followed by addition, and absolute value
does not distribute into addition.
Only vec4 platforms are affected by this change as scalar-only platforms
never have any of the fdot_replicated instructions. In the shader-db
results, below, shaders in MANY different applications are affected.
Trine, Doom3, Enemy Territory: Quake Wars, Counter Strike: Global
Offensive, Mad Max, Metro Last Light, and on and on... I'm really
shocked that there were no test regressions!
All Haswell and earlier platforms had similar results. (Haswell shown)
total instructions in shared programs: 16219743 -> 16219820 (<.01%)
instructions in affected programs: 12171 -> 12248 (0.63%)
helped: 1
HURT: 78
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.78% max: 0.78% x̄: 0.78% x̃: 0.78%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.35% max: 2.38% x̄: 0.91% x̃: 1.06%
95% mean confidence interval for instructions value: 0.92 1.03
95% mean confidence interval for instructions %-change: 0.78% 1.00%
Instructions are HURT.
total cycles in shared programs: 538481383 -> 538491045 (<.01%)
cycles in affected programs: 470796 -> 480458 (2.05%)
helped: 149
HURT: 142
helped stats (abs) min: 1 max: 1338 x̄: 71.13 x̃: 4
helped stats (rel) min: 0.06% max: 40.99% x̄: 2.76% x̃: 0.67%
HURT stats (abs) min: 1 max: 2092 x̄: 142.68 x̃: 12
HURT stats (rel) min: 0.07% max: 55.38% x̄: 5.07% x̃: 1.07%
95% mean confidence interval for cycles value: -5.28 71.69
95% mean confidence interval for cycles %-change: -0.07% 2.19%
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 62795475e8 ("nir/algebraic: Distribute source modifiers into instructions")
Closes: #3129
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5581>
(cherry picked from commit 8591adea38)
This code was broken, but it worked by accident, as the
pad and the edgeflag were reversed, however when Roland
removed the cliptest field back in 2015, he stopped copying
the pad which actually stopped copy the edgeflag.
Fix the function to actually copy the edgeflag.
Fixes: 1b22815af6 ("draw: don't pretend have_clipdist is per-vertex")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5679>
(cherry picked from commit 2bf2e6c83d)
The code was expecting the lower 32-bits of the 64-bit to be
what it wanted, don't be implicit, pull the value from the union.
This should fix rendering on big endian systems since NIR was
introduced.
Fixes: 44a6b0107b ("gallivm: add nir->llvm translation (v2)")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5677>
(cherry picked from commit 237d728418)