spirv_to_nir sometimes wraps derefs in vec2 or mov instructions as part of
its texture handling. These get in the way of
nir_rematerialize_derefs_in_use_blocks_impl. Running copy propagation
should get rid of the extra move instructions and get us back to intact
deref chains for everything except variable pointer use-cases.
fossil-db (Sienna Cichlid):
Totals from 6 (0.00% of 134572) affected shaders:
CodeSize: 92656 -> 93088 (+0.47%)
Instrs: 17060 -> 17138 (+0.46%)
Latency: 224408 -> 227539 (+1.40%)
InvThroughput: 37402 -> 37924 (+1.40%)
VClause: 408 -> 402 (-1.47%)
Copies: 1065 -> 1107 (+3.94%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5668
Fixes: 14a12b771d ("spirv: Rework our handling of images and samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13924>
(cherry picked from commit b425100781)
The function need_temp_reg_initialization looks suspecious.
It will only ever return true if we get past this if:
if (!(emit->info.indirect_files && (1u << TGSI_FILE_TEMPORARY)) ...
Using the logical && means the intended initialization done
based on the result of this check is not performed.
This code was both introduced and altered in MR 5317.
ccb4ea5a introduces the function.
ba37d408 is a collection of performance improvements and misc
fixes. This altered the if from using bitwise to logical and.
This commit changes it back to bitwise.
Spotted from a compile warning.
Fixes: ba37d408da ("svga: Performance fixes")
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12157>
(cherry picked from commit 64292c0f05)
Now that we removed the intel intrinsic and just use the generic one,
we can skip it in the intel call lowering pass and just deal with it
in the intel rt intrinsic lowering.
v2: rewrite with nir_shader_instructions_pass() (Jason)
v3: handle everything in switch (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 423c47de99 ("nir: drop the btd_resume_intel intrinsic")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12113>
(cherry picked from commit c5a42e4010)
If the source and destination were within the same full register, like
hr90.x and hr90.y (which both map to r45.x), then we'd perform the
swap/copy with the wrong register. This broke
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.35 once BDA is enabled.
Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
(cherry picked from commit c98adc56f4)
VK CTS just added some new tests to write to a compressed image
from a compute shader, which was overrunning memory.
The image width/height need to be sized according to the block
sizes to avoid overwriting memory.
dEQP-VK.image.sample_texture.*bit_compressed*
Cc: mesa-stable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13618>
(cherry picked from commit 27903abbb6)
When the client calls vkMapMemory(), we have to align the requested
offset down to the nearest page or else the map will fail. On platforms
where we have DRM_IOCTL_I915_GEM_MMAP_OFFSET, we always map the whole
buffer. In either case, the original map may start before the requested
offset and we need to take that into account when we clflush.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
(cherry picked from commit 90ac06e502)
10c75ae4 moved handling of this state to the functions that
depend on ctx->_ImageTransferState.
So we can't depend on _NEW_PIXEL being set to call this function,
since it'll be always clear earlier by _mesa_update_state_locked.
Example sequence that would trigger the issue:
glPixelTransferi(...)
glClear(...)
glTexSubImage2D(...) <-- won't use the new value set by
glPixelTransferi because glClear caused
_NEW_PIXEL to be cleared.
_NEW_PIXEL itself is kept because st_update_pixel_transfer depends
on it.
Fixes: 10c75ae4 ("mesa: move _mesa_update_pixel out of _mesa_update_state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5273
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13596>
(cherry picked from commit 1ee3fbd703)
vertex shader stages that can produce xfb must have
their input size clamped to the compiler define MAX_VARYING
to successfully be able to export an xfb output for each input
fixes KHR-GL46.geometry_shader.limits.max_input_components
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13632>
(cherry picked from commit 5d1b81d8ac)
Fixes dEQP-VK.reconvergence.*nesting* tests.
There are cases when cmod is set to an instruction that cannot have
conditional modifier. E.g. following:
find_live_channel(32) vgrf166:UD, NoMask
cmp.z.f0.0(32) null:D, vgrf166+0.0<0>:D, 0d
is optimized to:
find_live_channel.z.f0.0(32) vgrf166:UD, NoMask
v2:
- Add unit test to check cmod is not set to 'find_live_channel' (Matt Turner)
- Update flag_subreg when conditonal_mod is updated (Ian Romanick)
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5431
Fixes: 32b7ba66b0 ("intel/compiler: fix cmod propagation optimisations")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13268>
(cherry picked from commit 2dbb66997e)
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13406>
We need to know the internal (physical) formats used for AFBC of a given
logical format, in order to check format compatibility and determine if
we need to decompress AFBC for conformance.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13205>
(cherry picked from commit 93c9123c31)
When compiling for x86 with MSVC, Vulkan API entry points follow the
__stdcall convention (VKAPI_CALL maps to __stdcall), which uses the
following name mangling:
_<function_name>@<arguments_size>
Fix the vk_entrypoint_stub()/alternatename definitions accordingly.
Fixes: 6d44b21d4f ("vulkan: Fix weak symbol emulation when compiling with MSVC")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13516>
(cherry picked from commit 1813bb5917)
The gl_FragColor lowering in the fragment shader depends on the number
of render targets, which can change every set_framebuffer_state.
set_framebuffer_state thus needs to force a rebind of the fragment
shader.
Fixes a regression in Piglit fbo-drawbuffers-none gl_FragColor -auto
-fbo from enabling AFBC on Mali G52.
Fixes: 28ac4d1e00 ("panfrost: Call nir_lower_fragcolor based on key")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13498>
(cherry picked from commit e0335ad888)
This uses the new property for AFBC we've added. The AFBC quirk is
applied only to v4, and we only set dev->has_afbc on v5+ so this is not
a regression. It now respects the hardware-specific AFBC disable.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13497>
(cherry picked from commit 68a7fafe2a)
originally, a slab attempts to reclaim a single bo. there are two outcomes
to this which can occur:
* the bo is reclaimed
* the bo is not reclaimed
if the bo is reclaimed, great.
if the bo is not reclaimed, it remains at the head of the list until it can
be reclaimed. this means that any bo with a "long" work queue which makes it
into a slab will effectively kill the entire slab. in a benchmarking scenario,
this can occur in rapid succession, and every slab will get 1-2 suballocations
before it reaches a bo that blocks long enough for a new slab to be needed.
the inevitable result of this scenario is that all memory is depleted almost instantly,
all because pb assumes that if the first bo in the reclaim list isn't ready, none of them
can be ready
for drivers like radeonsi, this happens to be a fine assumption
for drivers like zink, this is entirely not workable and explodes the gpu
Cc: mesa-stable
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13345>
(cherry picked from commit 3d6c8829f5)
Scratch patching code in iris_upload_dirty_render_state (see MERGE_SCRATCH_ADDR
calls) assumes that in all shader stages derived_data field stores 3DSTATE_XS
packet first.
This is not true for TESS_EVAL (DS), so we end up patching 3DSTATE_TE
instead of 3DSTATE_DS leading to DWordLength becoming 11 instead of 9
(9 == 3DSTATE_DS.DWordLength, 2 == 3DSTATE_TE.DWordLength, and 9|2 == 11),
and hardware hanging on the next instruction.
Fix this by reversing the order of packets for TESS_EVAL stage.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5499
Fixes: 4256f7ed58 ("iris: Fill out scratch base address dynamically")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13358>
(cherry picked from commit 5387522bd0)
When eglReleaseThread() is called from application's
destructor (API with __attribute__((destructor))),
it crashes due to invalid memory access.
In this case, _egl_TLS is freed in the flow of
_eglAtExit() as below but _egl_TLS is not set to NULL.
_eglDestroyThreadInfo
_eglFiniTSD
_eglAtExit
_run_exit_handlers
exit
Later when the eglReleaseThread is called from
application's destructor, it ends-up accessing
the freed _egl_TLS pointer.
eglReleaseThread -> in libEGL_mesa
eglReleaseThread -> in libEGL(glvnd)
destructor() -> App's destructor
To resolve the invalid access, setting the _egl_TLS
pointer as NULL after freeing it.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5466
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13302>
(cherry picked from commit 796c9ab3fd)
Mapping unimplemented entrypoints to a global function pointer variable
initialized to NULL is a bit cumbersome, and actually led to a bug
in the vk_xxx_dispatch_table_from_entrypoints() template: the !override
case didn't have the right check on the source table entries. Instead of
fixing that case, let's simplify the logic by creating a stub function
and making the alternatename pragma point to this stub. This way we get
rid of all those uneeded xxx_Null symbols/variables and simplify the
tests in vk_xxxx_dispatch_table_from_entrypoints().
Cc: mesa-stable
Fixes: 98c622a96e ("vulkan: Update dispatch table gen for Windows")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13348>
(cherry picked from commit 6d44b21d4f)
Two carchase compute shaders (shader-db) and two Fallout 4 fragment
shaders (fossil-db) were helped. Based on the NIR of the shaders, all
four had structures like
for (i = 0; i < 1; i++) {
...
for (...) {
...
}
}
All HSW+ platforms had similar results. (Ice Lake shown)
total loops in shared programs: 6033 -> 6031 (-0.03%)
loops in affected programs: 4 -> 2 (-50.00%)
helped: 2
HURT: 0
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 143692018 -> 143692006 (-0.0%)
SENDs in all programs: 6947154 -> 6947154 (+0.0%)
Loops in all programs: 38285 -> 38283 (-0.0%)
Cycles in all programs: 8434822225 -> 8434476815 (-0.0%)
Spills in all programs: 191665 -> 191665 (+0.0%)
Fills in all programs: 298822 -> 298822 (+0.0%)
In the presense of loop unrolling like this, the change in cycles is not
accurate.
v2: Rearrange the logic in the if-condition to read a little better.
Suggested by Tim.
Closes: #5089
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13323>
(cherry picked from commit ae99ea6f4d)
KHR-GL32.packed_pixels.pbo_rectangle.r16i on zink on lavapipe
ends up using a pbo that does an SINT image write. This was producing
truncated rather than clamped values.
Fix the calculations for 8/16-bit signed ints to clamp not truncate.
Fixes: 13e5f331db ("gallivm/nir: fix image store conversions")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13187>
(cherry picked from commit 1d48022dab)
Conflicts:
src/gallium/drivers/zink/ci/deqp-zink-lvp-fails.txt
We already set HALF_INTEGER, which is what the compiler actually does.
If we also set PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER, we get
incorrect lowering. Only set the CAP we respect.
On Bifrost, this convention is arbitrary. We should consider moving the
Bifrost lowering into NIR to optimize this better...
Fixes Piglit glsl-arb-fragment-coord-conventions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13070>
(cherry picked from commit 12facf23b1)
Conflicts:
src/panfrost/ci/piglit-panfrost-g52-fails.txt
This CI file doesn't exist in 21.2.x, so it was removed from this patch
When the determinant that we use for calculating triangle area
is NaN, it's not possible to decide the facing of the triangle.
This can happen when a coordinate of one of the triangle's vertices
is INFINITY. It's better to just accept these triangles in the shader
and let the PA deal with them.
Let's do the same for +/- Infinity too.
Though we haven't seen this yet, it may be troublesome as well.
Fixes: 651a3da1b5Closes: #5470
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13299>
(cherry picked from commit 783f8f728c)
Dropping the final pipe ref could in turn drop the final ref to one
of a couple other bo's, which in turn could indirectly recurse back
into cleanup_fences() on the same bo, resulting in a double decrement
of bo->nr_fences and underflow to a large positive #. This happens
because free'ing a bo back to the bo cache periodically calls
fd_bo_cache_cleanup() and any bo's that have not been re-used can
be really free'd, which in turn calls cleanup_fences().
Fixes: 7dabd62464 ("freedreno/drm: Userspace fences")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13263>
(cherry picked from commit faed3d4dfe)
Gallium caches sampler states for TBOs. Now if a buffer is first
attached to a TBO specifying one format, and later attached by
specifying another format and this TBO is then used, that would lead
to an assertion failure in debug builds, or to invalid rendering in
release builds, because the TBO picks the original, wrong format for
the sampler view.
Resolve this by signalling the change to Gallium (and other drivers), so
that Gallium clears the sampler view cache.
Fixes: f0ecd36ef8
st/mesa: add an entirely separate codepath for setting up buffer views
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13230>
(cherry picked from commit e95ecff784)
Fixes leaks (release) or assertion failures (debug) on allocating small
scanout resources, when falling through to the non-scanout-specific layout
code, which became more common as of ad50b47a14 ("gbm: assume
USE_SCANOUT in create_with_modifiers").
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13202>
(cherry picked from commit 60f464bbce)
The number of special varyings on midgard can influence how much space
we need to allocate for varyings in the compiler ABI. Move the enum so
we can get access it.
No functional change. This is cc stable purely so the following patches
can be backported.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13070>
(cherry picked from commit 00b0529061)
When a GPU hang occurs, the syncobj will eventually timeout,
if this is a wait, just set ready, so things will continue.
This matches 965 behaviour better.
Fixes: c282a082be ("crocus/query: poll the syncobj in the no wait situation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13188>
(cherry picked from commit 0a592db573)
We need it in the next commit to replace setting the stack alignment on
i386 with LLVM >= 13 through the TargetOption::StackAlignmentOverride,
which was removed in the upstream commit
<3787ee4571>.
Unfortunately Module::setOverrideStackAlignment() is not available
through the C API and we need to wrap it ourselves.
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reference: mesa/mesa#4906
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11940>
(cherry picked from commit c1b4c64a28)
If we're going to have to bind them as separate planes with colorspace
conversion for sampling on the frontend, then we need to report that
they're only for external-image samplers, otherwise the lowering won't be
applied.
Fixes: 4e3a7dcf ("gallium: enable EGL_EXT_image_dma_buf_import_modifiers unconditionally")
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13038>
(cherry picked from commit c530510514)
This fixes SPECVIEWPERF13 creo test case hang:
1. Client: send present pixmap request (serial=1) when swap_interval==1
and increase send_sbc=1
2. Server: pend the request before vblank arrives
3. Client: set swap_interval=0 (so set XCB_PRESENT_OPTION_ASYNC),
send another present pixmap request (serial=2), increase send_sbc=2
4. Server: handle the async request immediately and send complete event
(serial=2)
5. Client: handle the event and set recv_sbc=event->serial=2
6. Server: vblank arrives so handle pending request and send complete
event (serial=1)
7. Client: handle the event and set recv_sbc=event->serial=1
8. Client: someone call loader_dri3_swapbuffer_barrier() and waiting
on recv_sbc==send_sbc, but no one will set recv_sbc=2 again
So basically it's caused by swap happens out of order. This commit
fixes the problem by waiting on the pending sync swaps all done when
switching to async mode, so move 6&7 before 3.
Attach the xtrace when problem happens:
005:<:003e: 72: Present-Request(148,1): Pixmap window=0x03000002 pixmap=0x0300000b serial=1 valid=0x00000000 update=0x00000000 x_off=0 y_off=0 target_crtc=0x00000000 wait_fence=0x00000000 idle_fence=0x0300000c options=0 target_msc=4294967296 divisor=0 remainder=0 notifies=;
...
005:<:0041: 72: Present-Request(148,1): Pixmap window=0x03000002 pixmap=0x03000011 serial=2 valid=0x00000000 update=0x00000000 x_off=0 y_off=0 target_crtc=0x00000000 wait_fence=0x00000000 idle_fence=0x03000012 options=Async target_msc=0 divisor=0 remainder=0 notifies=;
005:>:0041: Event Generic(35) Present(148) IdleNotify(2) event=0x03000006 window=0x03000002 serial=2 pixmap=0x03000011 idle_fence=0x03000012
005:>:0041: Event Generic(35) Present(148) CompleteNotify(1) kind=Pixmap(0x00) mode=Copy(0x00) event=0x03000006 window=0x03000002 serial=2 ust=7505462213117739011 msc=3565046193979392
005:>:0041: Event Generic(35) Present(148) IdleNotify(2) event=0x03000006 window=0x03000002 serial=1 pixmap=0x0300000b idle_fence=0x0300000c
005:>:0041: Event Generic(35) Present(148) CompleteNotify(1) kind=Pixmap(0x00) mode=Copy(0x00) event=0x03000006 window=0x03000002 serial=1 ust=7505533793042694147 msc=3565050488946688
Cc: mesa-stable
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13019>
(cherry picked from commit e55e61758c)
The vk_shader_module_handle_from_nir() macro was constructing a
temporary vk_shader_module and passing it through
vk_shader_module_to_handle(). Since this is a function and not a macro,
it means that the lifetime of the temporary vk_shader_module will end
once the to_handle() function is called. Technically, this is a
use-after-free. I really don't know why no one has been bitten by this
yet....
Fixes: a41e98ddca "vk/util: add a util macro for initializing stack..."
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13101>
(cherry picked from commit 24637a6579)
This arg size should be 1 instead of 3. It does not affect functionality
because we does not enable it in SPI_PS_INPUT_ADDR. But it does affect
the VGPR number that LLVM produce when LLVM still count with all PS
function arguments.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
(cherry picked from commit 6f9f350622)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13053>
With NGG culling, the shaders are split into two parts:
the top part that computes just the position output,
and the bottom part which produces the other outputs.
To reduce redundancy between the two, I added some code
to reuse uniform variables between them. However, there is
an edge case I didn't think about: because of vertex repacking,
it is possible for the bottom part to process a different vertex.
Therefore it can take a different divergent code path
(though it must still take the same uniform code path).
Due to this, when a uniform value comes from divergent control
flow, this may be undefined in the bottom part.
This commit stops reusing uniform variables from
divergent control flow, to fix issues that arise from this.
Fossil DB stats on Sienna Cichlid with NGGC on:
Totals from 1723 (1.34% of 128647) affected shaders:
VGPRs: 89312 -> 89184 (-0.14%); split: -0.15%, +0.01%
SpillSGPRs: 4575 -> 120 (-97.38%)
CodeSize: 10846424 -> 10873836 (+0.25%); split: -0.68%, +0.93%
MaxWaves: 34582 -> 34602 (+0.06%); split: +0.06%, -0.01%
Instrs: 2124471 -> 2128835 (+0.21%); split: -0.51%, +0.72%
Latency: 7274569 -> 7293899 (+0.27%); split: -0.22%, +0.48%
InvThroughput: 1637130 -> 1635490 (-0.10%); split: -0.17%, +0.07%
VClause: 25141 -> 25414 (+1.09%); split: -0.02%, +1.10%
SClause: 56367 -> 59503 (+5.56%); split: -1.36%, +6.93%
Copies: 230704 -> 219313 (-4.94%); split: -5.49%, +0.55%
Branches: 72781 -> 72681 (-0.14%); split: -0.21%, +0.07%
PreSGPRs: 118766 -> 100176 (-15.65%); split: -15.70%, +0.05%
PreVGPRs: 76876 -> 76833 (-0.06%)
Fixes: 0bb543bb60
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13001>
(cherry picked from commit 09f89d15e4)
Use create_backed_surface_view helper function to create/reuse
alternate surface view when the to-be-bound surface view was created
in a different context. This fixes render target views leak running gazebo.
Cc: mesa-stable
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12952>
(cherry picked from commit e5dc900226)
These are I/O variables which are not going to be removed anyway.
However, get_variable_io_mask handles their location incorrectly.
Found using the GCC undefined behavior sanitizer.
Fixes the following error:
runtime error:
shift exponent 4294967258 is too large
for 64-bit type 'long unsigned int'
Closes: #5319
Fixes: cf5f8f55c3
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>
(cherry picked from commit 872d21820f)
match_mask checks the intrinsic type and decides whether it's
per-patch or not. VS don't have per-patch outputs,
so this causes wrong behaviour there.
Found using the GCC undefined behavior sanitizer.
Fixes the following error:
runtime error:
shift exponent 18446744073709551584 is too large
for 64-bit type 'long unsigned int'
Closes: #5319
Fixes: bf966d1c1d
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>
(cherry picked from commit 13e467a147)
Commit 0245b825 switched from returning the error code VK_ERROR_OUT_OF_DATE_KHR
to returning the success code VK_SUBOPTIMAL_KHR. Prior to that commit, the error
code caused all code paths to fail immediately, but the success code does not.
Currently the success code is not recorded in some scenarios, resulting in a
result of VK_SUCCESS instead. This breaks applications that rely on the
result (per the spec) to trigger resizes.
This commit ensures that the proper VK_SUBOPTIMAL_KHR success code is set as a
sticky status (as comments indicate was intended), ensuring that it is
propagated to user code.
Fixes#5331
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12782>
(cherry picked from commit fc5ea6a054)
The fd_fence_finish() may be passed a special timeout value PIPE_TIMEOUT_INFINITE.
This gets propagated all the way to get_abs_timeout(), where it gets converted to
a huge timeout value and passed down to the kernel. At least on iMX53, the kernel
may complain about this value being too large and emit a backtrace. The relevant
piece of information there is the following:
schedule_timeout: wrong timeout value bf94984b
Per suggestion by Rob Clark, fix this in get_abs_timeout() by picking the same
rollover implementation present in etnaviv. This fixes one part of the problem
where the tv_nsec becomes larger than NSEC_PER_SEC, which is invalid.
However, the PIPE_TIMEOUT_INFINITE is sufficiently large to make tv_secs larger
than KTIME_SEC_MAX, which makes kernel-side ktime_set() return KTIME_MAX and
that in turn triggers the above "wrong timeout value N" message. Fix this by
setting the timeout to large enough value in case of PIPE_TIMEOUT_INFINITE.
While the timeout is not truly infinite, the timeout is long enough as anything
longer than a few seconds means the GPU got hung.
The "util/timespec.h" is added so we can use NSEC_PER_SEC instead of ad-hoc
constant 1000000000 . The "pipe/p_defines.h" is needed for PIPE_TIMEOUT_INFINITE.
This problem can be reliably triggered on iMX53 using Qt5 with EGLFS support,
using the qtbase examples, as follows:
/usr/share/examples/opengl/qopenglwidget/qopenglwidget -platform eglfs
Fixes: f3cc0d2747 ("freedreno: import libdrm_freedreno + redesign submit")
Signed-off-by: Marek Vasut <marex@denx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12886>
(cherry picked from commit 6da2727174)
As we don't know if we are going to have spilling or not, emit always a
last thrsw at the end of the shader.
If later we don't have spillings and we don't need that last thrsw, we
remove it and switch back to the previous one.
This way we ensure all the spilling happens always before the last
thrsw.
v2 (Juan):
- Rework the code to force a last thrsw and remove later if no spilling
v3:
- Merge functionality inside vir_emit_last_thrsw (Iago)
- Add vir_restore_last_thrsw (Juan)
v4 (Iago):
- Fix/add new comments
- Rename variables/parameters
v5 (Iago):
- Fix comments
- Add assertion
Cc: mesa-stable
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4760
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12322>
(cherry picked from commit c98ddc778a)
Conflicts:
src/broadcom/compiler/nir_to_vir.c
This estimation was incorrect, the number of waves doesn't only
depend of the number of VGPRs.
Though, {SPI,COMPUTE}_TMPRING_SIZE.WAVES should limit the number of
scratch waves in flight, not sure if limiting it really works.
This fixes a GPU hang with an upcoming game, and this might also
helps resolving some spurious random GPU hangs.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12700>
(cherry picked from commit b31994cf67)
Conflicts:
src/amd/vulkan/radv_shader.c
In this case std::array doesn't behave like a regular array, therefore
it is NOT okay to index it outside the array, even though std::fill
needs us to do so.
Change the syntax to do the same thing slightly differently,
and add an assertion to make sure the registers are always within
the array's bounds.
Closes: #5289
Fixes: 0e4747d3fb
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12664>
(cherry picked from commit 9d20cf2732)
Conflicts:
src/amd/compiler/aco_optimizer_postRA.cpp
At some point I split up the flags into overall/source flags and made
copies from immed/const only set IR3_REG_IMMED/IR3_REG_CONST on the
source flags, but I forgot to update this. Noticed by inspection.
Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12747>
(cherry picked from commit e7f8d283d7)
Without this, lowered saturating ALU instructions would only clamp to
the range of the new type instead of the range of the old type.
v2: Use nir_iclamp. Suggested by Jason. Use new
u_{int,uint}N_{min,max}() helpers.
Fixes: 090e282407 ("nir: Add a saturated unsigned integer add opcode")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
(cherry picked from commit 7d8bf7c167)
Many places need to know the maximum or minimum possible value for a
given size integer... so everyone just open-codes their favorite
version. There is some potential to hit either undefined or
implementation-defined behavior, so having one version that Just Works
seems beneficial.
v2: Fix copy-and-pasted bug (INT64_MAX instead of INT64_MIN) in
u_intmin. Noticed by CI. Lol. Rename functions
`s/u_(uint|int)(min|max)/u_\1N_\2/g`. Suggested by Jason. Add some
unit tests that would have caught the copy-and-paste bug before wasting
CI time. Change the implementation of u_intN_min to use the same
pattern as stdint.h. This avoids the integer division. Noticed by
Jason.
v3: Add changes to convert_clear_color
(src/gallium/drivers/iris/iris_clear.c). Suggested by Nanley.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12177>
(cherry picked from commit 72259a870f)
According to OES_EGL_image_external_essl3:
On p. 196 in the errors section for BindImageTexture, replace the
last error with the following:
"An INVALID_OPERATION error is generated if <texture> is neither the
name of an immutable texture object, nor the name of an external
texture object."
According to OES_EGL_image_external:
The command
void EGLImageTargetTexture2DOES(enum target, eglImageOES image);
with <target> set to TEXTURE_EXTERNAL_OES defines the currently bound
external texture object to be a target sibling of <image>.
...
If <target> is not TEXTURE_EXTERNAL_OES, the error INVALID_ENUM is
generated. (Note: if GL_OES_EGL_image is supported then <target> may
also be TEXTURE_2D).
Currently, mesa only allows GL_TEXTURE_EXTERNAL_OES textures to be bound
by glBindImageTexture. However, the language of the specification does not
appear to use "external" to refer to GL_TEXTURE_EXTERNAL_OES specifically,
since OES_EGL_image_external allows external eglImageOES to be attached
to GL_TEXTURE_2D in the presence of GL_OES_EGL_image. Thus, it should be
interpreted to refer to all types of external textures, including 2D
textures attached via glEGLImageTargetTexture2DOES.
Fixes: ed43dd62ac ("main: allow external textures for BindImageTexture")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12383>
(cherry picked from commit e52aa1edff)
The lowered LS and NGG stages use local_invocation_index and they
can benefit from the unsigned upper bound because they can emit a
less expensive integer multiplication instruction.
This was working in the past, but accidentally borked by a refactor.
Fossil DB changes on Sienna Cichlid:
Totals from 956 (0.74% of 128647) affected shaders:
CodeSize: 2354172 -> 2344712 (-0.40%)
Instrs: 434359 -> 434327 (-0.01%)
Latency: 1883949 -> 1876814 (-0.38%)
InvThroughput: 762638 -> 757405 (-0.69%)
Fossil DB changes on Sienna Cichlid (with NGGC enabled):
Totals from 57873 (44.99% of 128647) affected shaders:
CodeSize: 155844192 -> 155607064 (-0.15%)
Instrs: 29799184 -> 29799152 (-0.00%)
Latency: 130959764 -> 130814224 (-0.11%); split: -0.11%, +0.00%
InvThroughput: 21100300 -> 20928635 (-0.81%); split: -0.81%, +0.00%
Fixes: 8af6766062
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
(cherry picked from commit 548b383310)
Consider the following sequence in a shader:
b = p_extract a
c = v_mad_u32_u16 b, X, 0
The optimizer applies extract, resulting in:
c = v_mad_u32_u16 a, X, 0 (correct)
Then it mistakenly turns that into:
c = v_mul_u32_u24 a, X, 0 (incorrect)
In this case, the p_extract is applied to v_mad_u32_u16 by
apply_extract. After this, we can no longer be sure that
the operands are still 16 or 24-bit, so we have to remove
this flag.
No Fossil DB changes.
Fixes: 54292e99c7
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
(cherry picked from commit 76b9dd6266)
We can't append instructions following a return/halt instruction
because the control flow helpers will modify the successor of the
block containing the return/halt. And the NIR validator enforces that
the return/halt must have the end of the function as successor.
This tends to happen following lower_shader_calls lowering which
inserts halts. This probably doesn't prevent the optimization, it'll
just happen in one of the return shaders after the halt has been
removed.
v2: Move prev block ending check earlier in the function (Daniel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12506>
(cherry picked from commit a13e79843e)
The hardware can handle much larger textures than we allowed. The game
"Cathedral" requires larger textures for some bizarre reason. Raise the
limit so the game runs, syncing MAX_MIP_LEVELS, the comments, and the CAPs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Closes: #5203
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12312>
(cherry picked from commit d051b06a48)
This ensures each context can have a separate batch writing a resource
and we don't race trying to flush each other's batches. Unfortunately
the extra hash table operations regress draw-overhead numbers by about
8% but I'd rather eat the overhead and have an obviously correct
implementation than leave known buggy code in tree.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
(cherry picked from commit 8503cab2e0)
Conflicts:
src/gallium/drivers/panfrost/pan_job.c
This expresses the semantic of the flush only applying to batches within
the context, not globally, in line with OpenGL's multithreading rules.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
(cherry picked from commit e98aa55413)
Conflicts:
src/gallium/drivers/panfrost/pan_job.c
ralloc is not thread safe, so we cannot use a pipe_screen as a ralloc
context unless we lock the screen. The allocation patterns for resources
are trivial, so just use malloc/calloc/free directly instead of ralloc.
This fixes a segfault in:
dEQP-EGL.functional.sharing.gles2.multithread.random.images.copytexsubimage2d.1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
(cherry picked from commit e6924be737)
Without a lock, two threads may bind the same shader CSO simultaneously,
allocate the same variant simultaneously, and then race each other in
the compiler. This manifests in various ways, most commonly failing the
assertion that UBO pushing has only run once. The simple_mtx_t solution
is used in Iris. Fixes the crash in:
dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
(cherry picked from commit 40edc87956)
The current code overwrote the original view which meant if we
had to reemit a surface the second emit would be wrong.
This fixes cubemaps on gm45 and maybe some issues with 3D textures
elsewhere.
Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12514>
(cherry picked from commit 63138c42c5)
Conflicts:
src/gallium/drivers/crocus/crocus_state.c
If a user attempts to run Panfrost on an unsupported GPU (e.g. Mali
T604), Panfrost will refuse to load and will destroy the screen
immediately, allowing for a graceful fallback to a software rasterizer.
However, the screen destroy code calls a screen_destroy function in the
GenXML vtbl -- and this function is still NULL when the allowlist is
checked. This manifests as crashes on unsuported GPUs.
Issue tracked down with Icecream95's mad Ghidra skills.
Closes: #5269
Fixes: 88dc4db6be ("panfrost: Init/destroy blitter from per-gen file")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12512>
(cherry picked from commit 2d31d469f7)
Fixes an invalid read caught by valgrind when there is a hole in the
valid render target mask:
==6749== Conditional jump or move depends on uninitialised value(s)
==6749== at 0x5E88EC0: panfrost_prepare_fs_state (pan_cmdstream.c:417)
==6749== by 0x5E88EC0: panfrost_emit_frag_shader (pan_cmdstream.c:501)
==6749== by 0x5E88EC0: panfrost_emit_frag_shader_meta (pan_cmdstream.c:573)
==6749== by 0x5E88EC0: panfrost_update_state_fs (pan_cmdstream.c:2593)
==6749== by 0x5E8B0BF: panfrost_direct_draw (pan_cmdstream.c:2839)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Fixes: a124c47b9f ("panfrost: Fix NULL derefs in pan_cmdstream.c")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
(cherry picked from commit 3113dbd837)
Otherwise we end up accessing overwritten registers. Fixes
dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_enable_buffer_enable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
(cherry picked from commit f45ceb8182)
This seems to have fixed a few additional tests on 21.2:
- dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.const_expression_fragment
- dEQP-GLES31.functional.draw_buffers_indexed.overwrite_indexed.common_separate_blend_func_buffer_separate_blend_func
- dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_blend_func_buffer_separate_blend_func
Fixes framebuffer_fetch and blend_equation_advanced dEQP tests on v6.
v2: Use clause dependencies rather than comparing the message type
v3: Shift the BIFROST_SLOT_* constants before using them as a mask
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12375>
(cherry picked from commit 295807e666)
Apparently, CLPER_V7 is missing from Mali G31, but CLPER_V6 works. Fixes
INSTR_INVALID_ENC faults and failures in
dEQP-GLES3.functional.shaders.derivate.* on Dvalin.
Technically not an errata but an implementation difference. I suspect
Mali G51 will need this as well, should we ever allowlist it.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>
(cherry picked from commit 61c8e39649)
Conflicts:
src/panfrost/bifrost/bi_quirks.h
Use the explicit sample mode and set the sample ID in the pixel indices
structure to the current sample ID. This fixes tilebuffer loads in blend
shaders on multisampled framebuffers.
Make sure the new routine is broken out to a helper for use with ST_TILE
in the next commit.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>
(cherry picked from commit 16394dc71a)
Although it is passing all of dEQP-GLES31, it is failing a few
KHR-GLES31.* tests. It also has performance issues at the moment. Invert
the existing noindirect debug flag to become a indirect debug flag. Set
this flag for dEQP-GLES31 CI on G52, to make sure the code doesn't bit
rot on the hope someone will pick this up later on.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>
(cherry picked from commit a7f7d74137)
Conflicts:
src/gallium/drivers/panfrost/pan_screen.c
src/panfrost/lib/pan_util.h
In b9c095cc2c ("panfrost: Rewrite the clear colour packing code"),
packing of clear colours was corrected to use the tilebuffer's
fractional bits, fixing dithering of the clear colour with formats like
RGB565. Unfortunately, that commit did so unconditionally. If the
framebuffer is dithered, but dithering is disabled at the time of
the clear, we would incorrectly dither the clear.
This is a regression, as the old (broken) code passed the relevant CTS
test. What's the catch? Depending on dither state, there are two
formulas to pack tilebuffer colours. We need to handle both. Fixes
KHR-GLES31.core.draw_buffers_indexed.color_masks.
Fixes: b9c095cc2c ("panfrost: Rewrite the clear colour packing code")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>
(cherry picked from commit 22538b89b3)
Conflicts:
src/panfrost/lib/tests/test-clear.c
src/panfrost/vulkan/panvk_cmd_buffer.c
We have no reason to report a subpixel precision of 4 for lines; in fact
LLVMpipe uses 8 subpixel bits for lines, similar to other primitives.
But let's use the pipe-cap for this instead of hard-coding it.
Fixes: 9fbf6b2abf ("lavapipe: implement VK_EXT_line_rasterization")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12277>
(cherry picked from commit a16f3963d3)
Commit 3a772be026 ("freedreno: Add perfetto renderpass support")
uses C++17 init-statement feature.
GCC
../src/gallium/drivers/freedreno/freedreno_perfetto.cc: In lambda function:
../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: init-statement in selection statements only available with ‘-std=c++17’ or ‘-std=gnu++17’
148 | if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
| ^~~~
Clang
../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: 'if' initialization statements are a C++17 extension [-Wc++17-extensions]
if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
^
Intel C++ Compiler
../src/gallium/drivers/freedreno/freedreno_perfetto.cc(148): error: expected a ")"
if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
^
Fixes: 3a772be026 ("freedreno: Add perfetto renderpass support")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5193
Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12293>
(cherry picked from commit 4fc2a6cbdb)
With hw devices, when you submit a present, implicit sync will
make sure the work submitted to the gpu on the client will end
up happening before the present work submitted on the server.
However with sw paths there is no real GPU, the lavapipe fake
GPU thread is client side only and presenting is done directly
from the pixmap (or later shared pixmap). In order for this to
make sense the wsi common code should wait for the fence on the
image before queueing the submit to the server so that all
client works has been flushed to the pixmap before the copy or
present operation is submitted.
Fixes: 8004fa9c95 ("vulkan/wsi: add sw support. (v2)")
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12502>
(cherry picked from commit 0cddfba328)
It can happen that we create an enormous merge set, even larger than the
entire register file, in which case find_best_gap() would loop
infinitely. This seems to be triggered more often with
IR3_SHADER_DEBUG=spillall, since it actually happened with a CTS test.
Just bail out in that case.
Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>
(cherry picked from commit efb34d6ee6)
When we mark live-through sources that are merged with the destination
as killed, we kept the bitsets in sync, but we forgot to keep them in
sync when unmarking them after allocating the destination. The result
was that "available" wasn't correct for any instruction afterwards. This
resulted in a bad register allocation with IR3_SHADER_DEBUG=spillall for
a dEQP-VK test.
While we're changing this, use ra_foreach_src().
Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>
(cherry picked from commit 70c22d3894)
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.
Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.
A tiny helper function is introduced to compute the modifier of a
resource.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb223639 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
(cherry picked from commit 8de086e12f)
Conflicts:
src/broadcom/ci/piglit-v3d-rpi4-fails.txt
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.
Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.
A tiny helper function is introduced to compute the modifier of a
resource.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb223639 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
(cherry picked from commit b1fbceac6f)
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.
Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 4c092947df ("panfrost: fail in get_handle(TYPE_KMS) without a scanout resource")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
(cherry picked from commit 99fc6f7271)
When this was rewritten to support Vulkan, we stopped initializing
file_max to -1 in the case of no inputs. This causes the draw module
to go down a needlessly pessimistic case, printing an error while we're
at it.
Fixes: 42b5cfdbd2 ("gallivm/nir: fix vulkan vertex inputs")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12440>
(cherry picked from commit 63529782d3)
16bit integer support also implies using 16-bit results when sampling
textures.
Because we're returning the results in float SSA values instead of int,
we need to bitcast back to integers before truncating the values.
Fixes: 00ff60f799 ("gallivm: add 16-bit integer support")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12413>
(cherry picked from commit 45a61f1782)
Currently stride, offset, modifier is obtained by invoking
lima_resource_get_handle() with WINSYS_HANDLE_TYPE_KMS.
Before commit 47f000c170 this path was working. Obtained handle
was simply ignored by DRI frontend and only requested data used.
After commit 47f000c170 such requests started to fail when
DRI is initialized using KMSRO and resource has no scanout data.
When lima_resource_get_param() is implemented, it will be used in
a first place to obtain resource data.
Fixes: 47f000c170 ("lima: fail in get_handle(TYPE_KMS) without a scanout resource")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12362>
(cherry picked from commit 5ec6b6e9bb)
When operators other than eq and ne are involved we can't really
move operands around and negate them because such transformation
may change the value of the whole expression.
Some examples:
For unsigned var:
0 >= 1u + var would eventually become 0xffffffff >= var,
which would always evaluate to true, when original expression
was true only for var == 0xffffffff.
For signed var:
0 >= 1 + var would become -1 >= var, which would evaluate to
false for var == 2147483647, when original expression evaluated
to true (because signed overflow is defined to wrap around in
glsl, 1 + 2147483647 == -2147483648, so 0 >= -2147483648).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5226
Fixes: 34ec1a24d6 ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12359>
(cherry picked from commit 89bc8ff408)
At the beginning of a render pass, the hardware will fill the tilebuffer
with an arbitrary 128-bit word. To implement colour clears, the driver
must pack the API-specific clear colour according to the 128-bit layout
of the tilebuffer. This layout depends only on the render target format.
The existing code to handle this was based on loose guesswork. It works
for the format / clear colour combinations tested in dEQP-GLES3, but it
is severely deficient in the general case. It works by matching on the
PIPE format of the render target (not the layout of the tilebuffer). For
special cased PIPE formats, it open codes a buggy pack routine.
Otherwise, it defaults to util_pack_color in the hope that will work.
Since util_pack_color doesn't know anything about Mali tilebuffer
layouts, that means it's defaulting to wrong behaviour.
Now that we understand internal tilebuffer layouts, let's rewrite the
packing code. Instead of matching PIPE formats, map the PIPE format to
the internal tilebuffer layout using the common table, ensuring the
mapping remains in sync with the render target descriptor. Then for
blendable tilebuffer formats, pack using a common float -> fixed point
path supporting optional sRGB translation. Raw formats use
util_pack_color as before.
For formats with less than 8 bits per channel, the new code uses the
fractional bits of the fixed-point representation. This is required for
correct dithering if the clear colour is not exactly representable in
the final low precision format.
In summary, at least the following bugs in the old code are fixed:
* Swapped R/B channels with sRGB
* Swapped R/B channels with some missing formats
* Incorrect dithering with RGB565, RGB5_A1
Fixes the following test cases:
dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb
dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb
dEQP-EGL.functional.wide_color.window_888_colorspace_srgb
dEQP-EGL.functional.wide_color.pbuffer_888_colorspace_srgb
Later in the series, unit tests are added for the new implementation.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12365>
(cherry picked from commit b9c095cc2c)
Conflicts:
src/panfrost/lib/pan_util.h
Transfer ownership of the render node fd to the panfrost_device (minor
change to panvk), and then close the file descriptor for the render node
bound to the panfrost_device when destroying the panfrost_device. Of all
the users of panfrost_open_device, panvk is the only one that correctly
closed the fd before. Accordingly, this fixes an fd leak in the Gallium
driver (and performance counter utilities).
This fix still applies to the Gallium driver when renderonly is in use--
although renderonly closes its own fd, the fd is _duplicated_ in
panfrost_drm_winsys.c, so renderonly and panfrost must _both_ close
their respective fd to fix the leak.
This fixes a crash when running dEQP-EGL for more than two hours.
dEQP-EGL creates a new screen for every test case and then immediately
destroys it. If destroying a screen leaks the fd, this causes the number
of open file descriptors to increase monotonically until the process
ends. This will eventually hit the system limit for number of open files
and abort the process.
This bug was identified while attempting to run the OpenGL ES
conformance tests via cts-runner, and then confirmed with `lsof`. With
the fix, the number of file descriptors reported by `lsof | wc -l` is
now constant while running dEQP-EGL as expected.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12346>
(cherry picked from commit 76377de99b)
mmap requires its offset is page aligned, but the current code only
guarantees 4k alignment, causing drm-shim to break badly on kernels with
>4k page sizes. This fixes drm-shim on my Apple M1, running bare metal
Linux with 16k pages. It probably also fixes exotic PowerPC systems with
64k pages.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Zoltan Boszormenyi <zboszor@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12347>
(cherry picked from commit 38f39cc1e2)
When creating an image out of a swapchain on Android, the android
layer call will detect a VkBindImageMemorySwapchainInfoKHR in the
pNext chain of the vkBindImageMemory2() call and add a
VkNativeBufferANDROID in the chain. This is what we should use as
backing memory for that image.
v2: Fix a couple of obvious mistakes (Tapani)
v3: Silence build warning (Lionel)
Fix invalid object argument to vk_error() (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: bc3c71b87a ("anv: don't try to access Android swapchains")
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5180
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12244>
(cherry picked from commit 19b7bbba73)
Before a73cb106a6, cso contexts were never reused, but now that they
are we need to be extra careful that the state in the cso context and
in the pipe context matches even after an unbind, since when the cso
context is reused the state might otherwise get out of sync (as there is
no concept of "initial state", basically cso always relied on the default
values being the same both in cso and the drivers).
This fixes some errors we've seen internally with lavapipe.
Fixes: a73cb106a6 ("aux/cso: split cso_destroy_context into unbind and a destroy functions")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12261>
(cherry picked from commit 513fb5438b)
On Gfx4 and Gfx5, sel.l (for min) and sel.ge (for max) are implemented
using a separte cmpn and sel instruction. This lowering occurs in
fs_vistor::lower_minmax which is called very, very late... a long, long
time after the first calls to opt_cmod_propagation. As a result,
conditional modifiers can be incorrectly propagated across sel.cond on
those platforms.
No tests were affected by this change, and I find that quite shocking.
After just changing flags_written(), all of the atan tests started
failing on ILK. That required the change in cmod_propagatin (and the
addition of the prop_across_into_sel_gfx5 unit test).
Shader-db results for ILK and GM45 are below. I looked at a couple
before and after shaders... and every case that I looked at had
experienced incorrect cmod propagation. This affected a LOT of apps!
Euro Truck Simulator 2, The Talos Principle, Serious Sam 3, Sanctum 2,
Gang Beasts, and on and on... :(
I discovered this bug while working on a couple new optimization
passes. One of the passes attempts to remove condition modifiers that
are never used. The pass made no progress except on ILK and GM45.
After investigating a couple of the affected shaders, I noticed that
the code in those shaders looked wrong... investigation led to this
cause.
v2: Trivial changes in the unit tests.
v3: Fix type in comment in unit tests. Noticed by Jason and Priit.
v4: Tweak handling of BRW_OPCODE_SEL special case. Suggested by Jason.
Fixes: df1aec763e ("i965/fs: Define methods to calculate the flag subset read or written by an fs_inst.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Dave Airlie <airlied@redhat.com>
Iron Lake
total instructions in shared programs: 8180493 -> 8181781 (0.02%)
instructions in affected programs: 541796 -> 543084 (0.24%)
helped: 28
HURT: 1158
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.86% x̄: 0.53% x̃: 0.50%
HURT stats (abs) min: 1 max: 3 x̄: 1.14 x̃: 1
HURT stats (rel) min: 0.12% max: 4.00% x̄: 0.37% x̃: 0.23%
95% mean confidence interval for instructions value: 1.06 1.11
95% mean confidence interval for instructions %-change: 0.31% 0.38%
Instructions are HURT.
total cycles in shared programs: 239420470 -> 239421690 (<.01%)
cycles in affected programs: 2925992 -> 2927212 (0.04%)
helped: 49
HURT: 157
helped stats (abs) min: 2 max: 284 x̄: 62.69 x̃: 70
helped stats (rel) min: 0.04% max: 6.20% x̄: 1.68% x̃: 1.96%
HURT stats (abs) min: 2 max: 48 x̄: 27.34 x̃: 24
HURT stats (rel) min: 0.02% max: 2.91% x̄: 0.31% x̃: 0.20%
95% mean confidence interval for cycles value: -0.80 12.64
95% mean confidence interval for cycles %-change: -0.31% <.01%
Inconclusive result (value mean confidence interval includes 0).
GM45
total instructions in shared programs: 4985517 -> 4986207 (0.01%)
instructions in affected programs: 306935 -> 307625 (0.22%)
helped: 14
HURT: 625
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.82% x̄: 0.52% x̃: 0.49%
HURT stats (abs) min: 1 max: 3 x̄: 1.13 x̃: 1
HURT stats (rel) min: 0.12% max: 3.90% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.29% 0.36%
Instructions are HURT.
total cycles in shared programs: 153827268 -> 153828052 (<.01%)
cycles in affected programs: 1669290 -> 1670074 (0.05%)
helped: 24
HURT: 84
helped stats (abs) min: 2 max: 232 x̄: 64.33 x̃: 67
helped stats (rel) min: 0.04% max: 4.62% x̄: 1.60% x̃: 1.94%
HURT stats (abs) min: 2 max: 48 x̄: 27.71 x̃: 24
HURT stats (rel) min: 0.02% max: 2.66% x̄: 0.34% x̃: 0.14%
95% mean confidence interval for cycles value: -1.94 16.46
95% mean confidence interval for cycles %-change: -0.29% 0.11%
Inconclusive result (value mean confidence interval includes 0).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>
(cherry picked from commit 38807ceeae)
Per https://gitlab.freedesktop.org/mesa/mesa/-/issues/5178#note_1019666,
the assumption fundamental to this optimization is false. Section
2.4.1 (Float to Integer) of Ivy Bridge PRMs describes the situation.
The wording of the section is somewhat confusing (because it doesn't
clearly delineate between signed and unsigned integers), but the last
two rows of the table make it clear that F->UD conversion clamps
negative float values to 0.
All other hardware mentioned in that thread seems to behave the same
way.
The real problem is that, with hardware that behaves in this ways,
converting f2u(2147483648.0) to f2i(2147483648.0) changes the bit pattern
that would be produced from 0x80000000 to 0x7fffffff.
This reverts commit ad05920258.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>
(cherry picked from commit 84d2e53789)
Otherwise drivers that don't use 16-bit slots for varyings will get
confused and have their driver_locations scribbled over. This has caused
multiple problems for both Panfrost and Asahi this week. Given the only
other user of the pass for varyings is radeonsi, which needs both
together, I think this is the least controversial fix.
Fixes: fb29cef8dd ("nir: add many passes that lower and optimize 16-bit input/outputs and samplers")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11732>
(cherry picked from commit 03c18f7efc)
On Gfx4 and Gfx5, sel.l (for min) and sel.ge (for max) are implemented
using a separte cmpn and sel instruction. This lowering occurs in
fs_vistor::lower_minmax which is called very, very late... a long, long
time after the first calls to opt_cmod_propagation. As a result,
conditional modifiers can be incorrectly propagated across sel.cond on
those platforms.
No tests were affected by this change, and I find that quite shocking.
After just changing flags_written(), all of the atan tests started
failing on ILK. That required the change in cmod_propagatin (and the
addition of the prop_across_into_sel_gfx5 unit test).
Shader-db results for ILK and GM45 are below. I looked at a couple
before and after shaders... and every case that I looked at had
experienced incorrect cmod propagation. This affected a LOT of apps!
Euro Truck Simulator 2, The Talos Principle, Serious Sam 3, Sanctum 2,
Gang Beasts, and on and on... :(
I discovered this bug while working on a couple new optimization
passes. One of the passes attempts to remove condition modifiers that
are never used. The pass made no progress except on ILK and GM45.
After investigating a couple of the affected shaders, I noticed that
the code in those shaders looked wrong... investigation led to this
cause.
v2: Trivial changes in the unit tests.
v3: Fix type in comment in unit tests. Noticed by Jason and Priit.
v4: Tweak handling of BRW_OPCODE_SEL special case. Suggested by Jason.
Fixes: df1aec763e ("i965/fs: Define methods to calculate the flag subset read or written by an fs_inst.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Dave Airlie <airlied@redhat.com>
Iron Lake
total instructions in shared programs: 8180493 -> 8181781 (0.02%)
instructions in affected programs: 541796 -> 543084 (0.24%)
helped: 28
HURT: 1158
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.86% x̄: 0.53% x̃: 0.50%
HURT stats (abs) min: 1 max: 3 x̄: 1.14 x̃: 1
HURT stats (rel) min: 0.12% max: 4.00% x̄: 0.37% x̃: 0.23%
95% mean confidence interval for instructions value: 1.06 1.11
95% mean confidence interval for instructions %-change: 0.31% 0.38%
Instructions are HURT.
total cycles in shared programs: 239420470 -> 239421690 (<.01%)
cycles in affected programs: 2925992 -> 2927212 (0.04%)
helped: 49
HURT: 157
helped stats (abs) min: 2 max: 284 x̄: 62.69 x̃: 70
helped stats (rel) min: 0.04% max: 6.20% x̄: 1.68% x̃: 1.96%
HURT stats (abs) min: 2 max: 48 x̄: 27.34 x̃: 24
HURT stats (rel) min: 0.02% max: 2.91% x̄: 0.31% x̃: 0.20%
95% mean confidence interval for cycles value: -0.80 12.64
95% mean confidence interval for cycles %-change: -0.31% <.01%
Inconclusive result (value mean confidence interval includes 0).
GM45
total instructions in shared programs: 4985517 -> 4986207 (0.01%)
instructions in affected programs: 306935 -> 307625 (0.22%)
helped: 14
HURT: 625
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.82% x̄: 0.52% x̃: 0.49%
HURT stats (abs) min: 1 max: 3 x̄: 1.13 x̃: 1
HURT stats (rel) min: 0.12% max: 3.90% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.29% 0.36%
Instructions are HURT.
total cycles in shared programs: 153827268 -> 153828052 (<.01%)
cycles in affected programs: 1669290 -> 1670074 (0.05%)
helped: 24
HURT: 84
helped stats (abs) min: 2 max: 232 x̄: 64.33 x̃: 67
helped stats (rel) min: 0.04% max: 4.62% x̄: 1.60% x̃: 1.94%
HURT stats (abs) min: 2 max: 48 x̄: 27.71 x̃: 24
HURT stats (rel) min: 0.02% max: 2.66% x̄: 0.34% x̃: 0.14%
95% mean confidence interval for cycles value: -1.94 16.46
95% mean confidence interval for cycles %-change: -0.29% 0.11%
Inconclusive result (value mean confidence interval includes 0).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12330>
In the compute dispatch path we do not allocate a huge amount
of space to cover everything so the individual functions have to
allocate. This was missing here, causing a hang in Cyberpunk when
accessing the system menu at some locations with thread tracing
enabled.
Fixes: bd1186572f ("radv: add support for push constants inlining when possible")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12271>
(cherry picked from commit 02b6015945)
The flock is per-fd, not per thread, and we do it outside of the main mutex. This was
done to avoid having to wait in the mutex, but we can get a case where one ends up running
the body with the flock unlocked.
Fix this by adding a mutex that doesn't need to be locked for reads.
Fixes: 4f0f8133a3 "util/fossilize_db: Do not lock the fossilize db permanently."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12266>
(cherry picked from commit 30a359d633)
Don't anticipate seeing any partial written headers but just in case we
should probably wait on the lock to make sure whatever header was being
written is finished being written.
Fixes: 4f0f8133a3 "util/fossilize_db: Do not lock the fossilize db permanently."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12204>
(cherry picked from commit 96bfefe8d1)
Otherwise, another thread might reuse their object ids for other
objects. For example,
T1: free queue with object id X
T2: reuse id X
T2: emit vkCreateFoo with id X
T1: emit vkDestroyDevice
virglrenderer happily accepts that which leads to double frees of the
queue: once when X is updated to point to another object and once when
vkDestroyDevice is executed. virglrenderer should be fixed to catch
such invalid object id reuse as well.
Fixes
dEQP-VK.api.object_management.multithreaded_shared_resources.device_group.
Fixes: ddd7533055 ("venus: initial support for queue/fence/semaphore")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12252>
(cherry picked from commit da000ea2ef)
When a vertex program is translated to nir, it uses
nir_to_tgsi_compile_options for drivers with only nir-to-tgsi based
NIR support. But this compile option might not be the same as the NIR
compile option from llvmpipe, hence when the nir shader is bound
to the draw module, it hits an assertion in do_alu_action() when
encounters nir_op_fdot3.
With this patch, draw will take the nir-to-tgsi path if preferred IR
from the driver is TGSI.
Fixes assert running Maya on SVGA device.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: 21.2 mesa-stable
(cherry picked from commit 6751d832c5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12200>
Some drivers doesn't support PIPE_SHADER_CAP_INTEGERS.
This leads to using load_ubo_vec4 which throws llvmpipe off the guard since
it doesn't expect load_ubo_vec4 in shader. Use nir_to_tgsi utility in
such a case.
This fixes crash seen with conform's mustpass.c, select.c and feedback.c.
Also, few gl-select related piglit tests exhibit same crash. Found in vmware's
internal testing
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
v2: incorporated Emma's comments. Added check for PIPE_SHADER_CAP_INTEGERS and
remove PIPE_SHADER_IR_TGSI check
v3: As per Emma's comment, removed expected crashes for i915 piglit
v4: update expetcted passes
(cherry picked from commit b5e782f5f4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12200>
The radv_emit_ngg_culling_state function won't write the
SPI_SHADER_PGM_RSRC2_GS register when it knows in advance that
radv_emit_graphics_pipeline will overwrite it anyway.
However, there is an unhandled case:
radv_emit_graphics_pipeline will not emit anything (including this
register) when the pipeline is already emitted. Hence, improve
the check in radv_emit_ngg_culling_state to consider this.
Fixes: 9a95f5487f
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12237>
(cherry picked from commit 74181ffcc5)
Fixes
In file included from ../src/gallium/drivers/iris/iris_query.c:49:
../src/gallium/drivers/iris/iris_genx_macros.h:81:10: fatal error: genxml/genX_bits.h: No such file or directory
and
In file included from ../src/gallium/drivers/crocus/crocus_query.c:50:
../src/gallium/drivers/crocus/crocus_genx_macros.h:86:10: fatal error: genxml/genX_bits.h: No such file or directory
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12149>
(cherry picked from commit b4214adefc)
It was added to make sure every pipeline for a merge request has at
least one job which passes (otherwise it's not possible to merge the
MR). Now the "sanity" job always exists in such pipelines, so this
isn't needed anymore.
Fixes: 4c41d1900e "ci: Add jobs running ci-fairy checks"
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12198>
(cherry picked from commit 6ccf11ac2b)
Navi 10 can hang when an NGG workgroup has no output,
so we work around that by always exporting a single zero-area
triangle with a single vertex that has all-NaN coordinates.
Thus far, we only employed this for NGG GS, because on all
other stages, the output can't be empty.
However, with NGG culling, the output can be empty, so let's
apply the same workaround there too.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12169>
(cherry picked from commit 448592b9ae)
The `argument::size` is supposed to represent the size of a pointer on
the host and not on the device (for which argument::target_size`
exists).
v3: Use `sizeof(buf)` instead of `marg.size`. (Francisco Jerez)
Fixes: 7c6f1d3bf9 ("clover/nir: extract constant buffer into its own section")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10256>
(cherry picked from commit b4e5bf0637)
This resolves clover returning `CL_INVALID_ARG_SIZE` whenever the OpenCL
CTS called `clSetKernelArg()` for 3-component vectors.
Fixes: 2147386505 ("clover/spirv: Add functions for parsing arguments, linking programs, etc.")
v2: Remove “api/clsetkernelarg/set kernel argument for cl_int3” from the
expected fails for llvmpipe
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10256>
(cherry picked from commit a6c26a6ad9)
SRV descriptors can require state-transitions before it's legal to set
them on the command-list. We used to just set them right away, and get
away with is, because the validator didn't verify this because we used
to flag the parameters as volatile.
Now that we don't, we trigger validation errors when setting a root
parameter that needs a transition first.
So let's split up the logic a bit, so we can prepare the tables, then do
the transision, and finally set the tables. We do this for all tables
instead of just the SRVs, just because it makes the logic a bit easier to
follow. We leave root constants alone, because they will never require
this, and doing them late would just compilcate things.
Fixes: 1208290558 ("d3d12: Sets all SRV descriptors as data-static")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12187>
(cherry picked from commit cd79351f02)
Most of the drivers don't set up the maximum value in the query info. So
when later hud_pane_set_max_value() is invoked, we are using a rather
"random" number.
Turns out that in some 32bit cases, this random number is big enough
that `leftmost_digit` is 0 because DIV_ROUND_UP() overflows, aborting
with an assertion.
Fixes: c91cf7d7d2 ("gallium: implement a heads-up display module")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12181>
(cherry picked from commit 10541d1fad)
This fixes a hang in the following piglit test when GCM moves a
UBO load outside of the loop.
tests/shaders/ssa/fs-if-def-else-break.shader_test
The end NIR ends up looking like this:
vec2 32 ssa_3 = intrinsic load_ubo (ssa_2, ssa_0) (0, 1073741824, 0, 0, 8)
vec1 32 ssa_4 = mov ssa_3.x
vec1 32 ssa_5 = inot ssa_3.y
/* succs: block_1 */
loop {
...
if ssa_5 { }
}
Fixes: 1edf67fc3f ("intel/fs: Generate if instructions with inverted conditions")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
(cherry picked from commit a654e39f15)
To properly init buffer memory requirement for AHB, memory type bits
from dma_buf fd properties need to be masked. However, creating a test
AHB at buffer creation is too costy. This patch caches the ahb backed
buffer memory type bits at device creation time if the app is requesting
AHB extension.
Cc: 21.2 mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12171>
(cherry picked from commit e08960482a)
Similar to the fix in 6bf8e960fa ("pan/bi: Do helper termination
analysis on clauses")
Though apparently a "theoretical issue only", fixes artefacts in
DarkPlaces with both D3D9 and GL renderers.
Fixes: 9a7f0e268b ("pan/mdg: Use the helper invo analyze passes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12156>
(cherry picked from commit a2b37e9592)
Hand typed. We could generate this from the XML to avoid the repititon
but I think the cure is worse than the disease.
This fixes instruction encoding faults seen in conformance tests.
Only a single shader-db affected, and it was likely already broken...
quadwords HURT: shaders/glmark/22-1.shader_test MESA_SHADER_FRAGMENT: 133 -> 135 (1.50%)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12114>
(cherry picked from commit 2cdf95703a)
Conflicts:
src/panfrost/bifrost/bi_schedule.c
The existing code handled the case where the new definition of the
same field was larger than the old one.
This commit adds a check to handle the reverse case: the new def
is smaller than the old one (= so writing using the merged macro
would affect the next fields).
The affected fields are:
* LGKM_CNT (in SQ_WAVE_IB_STS)
* DONUT_SPLIT (in VGT_TESS_DISTRIBUTION)
* HEAD_QUEUE (in GDS_GWS_RESOURCE)
DONUT_SPLIT is the only one used by radeonsi/radv.
Fixes: e6184b0892 ("amd/registers: scripts for processing register descriptions in JSON")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12063>
(cherry picked from commit 3914bd457b)
Failure to create a buffer for scanout should not be fatal when
importing a buffer. Buffers allocated from a render-only device may not
be able to scanned out directly but can still be used for other
rendering purposes (e.g. as a texture).
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12081>
(cherry picked from commit 7bcb223639)
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.
Instead of returning a handle for the wrong device, fail if we don't
have one.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 47f000c170)
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.
Instead of returning a handle for the wrong device, fail if we don't
have one.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 4c092947df)
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.
Instead of returning a handle for the wrong device, fail if we don't
have one.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 465eb7864b)
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.
Instead of returning a handle for the wrong device, fail if we don't
have one.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 9da901d2b2)
A previous commit cleaned up the asserts but the last part of
this assert looks like it got mixed up. It should have allowed
param->rel for D3DSPR_INPUT if version is 3.0. Instead it does
&& on the enum value D3DSPR_ADDR which is of course always true,
with the version check. The result is that we miss input
validation with version 3.0.
Spotted by a compile warning
Fixes: 5974401a4a ("st/nine: Regroup param->rel tests")
Reviewed-by: Axel Davy davyaxel0@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11880>
(cherry picked from commit 71a5bcb865)
We can have more than 32 samplers, but the code below will assert in that
case. The return value is not used for samplers, so let's just return
zero early and be done with it.
Fixes: c18ff60087 ("lavapipe: emit correct textures_used for texture-arrays")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11845>
(cherry picked from commit bff8a948f7)
Under XWayland, the first present after a window resize is sometimes
completed with COPY (seems to happen when the previous present with the
old size is pending; not really sure). The following presents are
completed with FLIP.
When a swapchain is created with an old swapchain, and
old_chain->last_present_mode is FLIP, chain->last_present_mode is set to
FLIP as well. This causes the new swapchain to be marked
VK_SUBOPTIMAL_KHR, which is sticky, if the first present is completed
with COPY.
Instead of inheriting, treat each swapchain as independent. We will
miss the case where an old swapchain is flipping but a new swapchain is
copying. But swapchain reallocation normally happens in response to
present engine state change. If the newly allocated swapchain is
copying, another reallocation is unlikely to fix that.
Fixes: 61309c2a72 ("vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12030>
(cherry picked from commit 206fe780d5)
Implement resource_get_param for PIPE_RESOURCE_PARAM_NPLANES and fix
resource_get_handle to walk to the correct linked resource for
multiplanar images, allowing gbm_bo_get_handle_for_plane to be called
with plane > 0.
This fixes an assert that is triggered when a wayland client tries
to send weston an NV12 dmabuf, for example:
weston: .../mesa/src/gbm/backends/dri/gbm_dri.c:752: gbm_dri_bo_get_handle_for_plane: Assertion `plane == 0' failed.
Fixes: 788f6dc857 ('Revert "gallium/dri: fix dri2_from_planar for multiplanar images"')
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12037>
(cherry picked from commit 8ba44103b3)
The code in dri_make_current just checks the value of the pointers
to decide to update texture_stamp or not. This is buggy since a new
allocated drawable could share the same address with the previous
released drawable. Fix the stale pointer issue by always resetting
these pointers to NULL in dri_unbind_context.
v2:
Move the reset codes to the end of the function.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12050>
(cherry picked from commit 7ff30a0499)
The comment was incorrect: we can have N draws using the
same mode with N > 1 (eg: GL_QUAD_STRIP draws
cannot be merged).
This commit fixes the drawing code to use the correct draw
function.
This fixes a hang in Starsector (see issue #5086).
Fixes: b328d8e9bc ("dlist: use an union instead of allocating a 1-sized array")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11978>
(cherry picked from commit 11d6441b94)
When the dot product uses a source which can be optimized to a scalar,
after a bunch of nir optimization steps the source to fsum will be a scalar
with a x replicate swizzle. Hence nir_src_num_components is just 1 and the
fsum was just a no-op which is not correct. Arguably this could be optimized
a bit better, but just determine the number of addends by using nir_op_infos
instead (the operand fetch was fixed already by 39a938ecf4 doing the same).
Fixes: 4eb0475b5a ("gallivm/nir: add fsum support")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12066>
(cherry picked from commit cac5711d43)
In v3dv_write_uniforms_wg_offsets() function, the job's cmd_buffer is a
valid command buffer, so there is no reason to check if its NULL or not.
This fixes CID#1487441 ("Dereference after null check") error.
v1:
- `job->cmd_buffer` is the same as `cmd_buffer` (Alejandro)
v2:
- Use `cmd_buffer` instead of `job->cmd_buffer` (Iago)
Fixes: 31a786c80a ("v3dv: handle QUNIFORM_FB_LAYERS")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11999>
(cherry picked from commit 4eb0475b5a)
Only fragment and some compute shaders support implicit derivatives.
They're totally meaningless without helper invocations and some
understanding of the dispatch pattern. We've got code to lower
nir_texop_tex in these shader stages to use an explicit derivative of 0
but it was pretty badly broken:
1. It only handled nir_texop_tex, not nir_texop_txb or nir_texop_lod.
2. It didn't take min_lod into account
3. It was conflated with adding a missing LOD parameter to opcodes
which expect one such as nir_texop_txf. While not really a bug,
this does make it way harder to reason about the code.
4. Unless you set a flag (which most drivers don't), it left the
opcode nir_texop_tex instead of nir_texop_txl which it should have
been.
This reworks it to go through roughly the same path as other LOD
lowering only with a constant lod of 0 instead of calling out to
nir_texop_lod. We also get rid of the lower_tex_without_implicit_lod
flag because most drivers set it and those that don't are probably
subtly broken. If someone really wants to get nir_texop_tex in their
vertex shaders, they can write a new patch to add the flag back in.
Fixes: e382890e25 "nir: set default lod to texture opcodes that..."
Fixes: d5ac5d6e83 "nir: Add option to lower tex to txl when..."
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>
(cherry picked from commit 74ec2b12be)
Required to build Mesa on macOS with
-Dbuild-tests=true -Dglx=gallium-xlib
Without this change, the build fails with
In file included from ../src/gallium/targets/graw-xlib/graw_xlib.c:8:
../src/gallium/include/frontend/xlibsw_api.h:5:10: fatal error: 'X11/Xlib.h' file not found
#include <X11/Xlib.h>
With `brew sh` X11 is found but linking fails due to `llvm-ar` missing
in the path. That issue appears to be unrelated to this missing
dependency. X11 is installed via XQuartz, so Homebrew should not be
required.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12022>
(cherry picked from commit 061508d310)
This expands on commit c54c42321e. See the code comment for full
justifications. At the time of the previous commit Ian wanted to
limit the relaxing of the rule to GLSL 3.30 as that was the highest
version of shaders seen in the wild that were having trouble with
the stricter rules.
However since then I've found that the long standing issue with tess
shaders failing to compile in the game 'Layers Of Fear' is due to
this same issue. The game uses 4.10 shaders and also makes use of
explicit varying locations, so here we relax the rule to 4.20 and
make sure to apply the restriction to shaders using varyings with
explicit locations also.
Fixes: c54c42321e ("glsl: relax rule on varying matching for shaders older than 4.00")
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11873>
(cherry picked from commit 0e0633ca49)
I was trying to fix this test, but noticed brw_clip.c in i965
does a * 2 here, and it seems to fix this test as well.
Fixes:
dEQP-GLES2.functional.polygon_offset.default_displacement_with_units
Fixes: f9e2c24326 ("draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12029>
(cherry picked from commit 1e5a470b43)
Unfortunately I contacted the dev about this issue years ago and he
made a fix, but it has never been released after all these years.
This stops the screen from being completely black in game.
CC: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11941>
(cherry picked from commit f3ec4a934d)
Before commit f7e0cdcf1a, we tried these in order
- if (!ForceSoftware) surfaceless_probe_device(disp, false);
- surfaceless_probe_device(disp, true);
- surfaceless_probe_device_sw(disp);
The commit changed it to
- surfaceless_probe_device(disp, ForceSoftware);
- surfaceless_probe_device_sw(disp);
and broke 2D virtio-gpu and vgem when ForceSoftware is false. This
commit restores the old behavior.
Fixes: f7e0cdcf1a ("egl/surfaceless: simplify dri2_initialize_surfaceless()")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11992>
(cherry picked from commit 384181921c)
With this pass enabled in Intel drivers, running shader-db on
shaders/unity/38.shader_test resulted in
Program received signal SIGSEGV, Segmentation fault.
gcm_schedule_early_src (src=0x555555d45348, void_state=0x7fffffffba40) at ../../SOURCE/master/src/compiler/nir/nir_opt_gcm.c:297
297 if (info->early_block->index < src_info->early_block->index)
(gdb) print src_info->early_block
$1 = (nir_block *) 0x0
I tracked this down to an early exit from gcm_schedule_early_instr on
the parent instruction because instr->pass_flags was 0x1c. That
should be an impossible value for this pass, so I inferred that
pass_flags must have dirt left from some previous pass.
Fixes: 8dfe6f672f ("nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>
(cherry picked from commit 436668874a)
Like in the case of emitting a block, process pending TMU operations
before a jump is executed.
Fixes dEQP-VK.graphicsfuzz.stable-binarysearch-tree-nested-if-and-conditional.
Fixes: 197090a3fc ("broadcom/compiler: implement pipelining for general
TMU operations")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11971>
(cherry picked from commit dc40157888)
The function radeonsi_screen_create_impl() tries to create the
aux_context but doesn't actually check for the returned value from
si_create_context().
Then, on si_destroy_screen() the aux_context is used without actually
checking whether it's a thing or not.
As a result, if for any reason si_create_context() failed, we shall
crash in si_destroy_screen() with a NULL pointer dereference trying to
access ((struct si_context *)sscreen->aux_context)->log.
Simply check for aux_context not being NULL to avoid that crash.
Cc: mesa-stable
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11948>
(cherry picked from commit 5bfd1a7e19)
Originally I fixed the case where the nir itself has a shared mem size of
0, but the frontend (e.g. clover) set it to some other value.
But st/mesa sets the shared mem size on the state object as well and we
end up actually doubling the value in the driver as we set smemSize to the
value from the state object before calling into the compiler.
So just max the value instead.
Fixes the compute_shader.shared-max CTS test.
Fixes: dc667b1f19 ("nv50/ir/nir: fix smem size")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11047>
(cherry picked from commit ff55412f40)
This replaces some new/delete uses with malloc/free.
This is more consistent with most of the other glsl IR code but
more importantly it allows the game "Battle Block Theater" to
start working on some mesa drivers. The game overrides new and
ends up throwing an assert and crashing when it sees this
function calling new [0].
Note: The game still crashes with radeonsi due to similar conflicts
with LLVM.
CC: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11907>
(cherry picked from commit 749251391d)
Error handling with DRM_IOCTL_I915_QUERY is tricky and we got it wrong
in one of the two calls here. Use the common helper instead. This also
fixes a theoretical bug where calloc() fails. While we're here, inline
iris_bufmgr_update_meminfo because we're not really benefiting from
having it separate anymore.
Fixes: e60114b2ae "iris/bufmgr: Query memory region info."
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
(cherry picked from commit 3fa6b8d041)
Error handling with DRM_IOCTL_I915_QUERY is tricky and we got it wrong
in one of the two calls here. Use the common helper instead. This also
fixes a theoretical bug where calloc() fails. While we're here, inline
anv_track_meminfo because we're not really benefiting from having it
separate anymore.
Fixes: 65e8d72bc1 "anv: Query memory region info"
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
(cherry picked from commit 35ec1d9730)
We also add a helper which contains the standard query+alloc+query
pattern used by anv_gem_get_engine_info(). The caller is required to
free the pointer.
These are declared static inline not because we care about the
performance of these helpers but because we're going to use them in the
intel_device_info code and we don't want a link dependency.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
(cherry picked from commit ffdf4d7683)
DRM_IOCTL_I915_QUERY is a multi-query. The most egregious errors are
returned via the usual ioctl error mechanism but there are also
per-query errors that are indicated by item.length < 0. We need to
handle those as well. While we're at it, scrape errno so we can return
a proper integer error.
Fixes: c0d07c838a "anv: Support i915 query (DRM_IOCTL_I915_QUERY)..."
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
(cherry picked from commit b664481ba9)
On APUs, we fake heaps to simulate a dGPU setup because it seems to
have the maximum compatibility. Though, some applications like RDR2
still only looks at GTT if the driver reports an iGPU which means it
will only use 1/3rd of total memory available.
This is currently behind a drirc option because it might have
implications for other apps but we might want to extend this later
if everything is fine.
Cc: 21.2 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11891>
(cherry picked from commit cadf2d63b7)
Fix defect reported by Coverity Scan.
Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking so suggests that it may be null,
but it has already been dereferenced on all paths leading to the
check.
Fixes: dcd2d8ca50 ("asahi: Track more Gallium state")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11898>
(cherry picked from commit f5c8761eda)
Drops the vk_format_to_pipe (and it's outdated table) for vk_format_to_pipe_format, aswell as the duplicated vk_format_aspects function.
The old format table was missing USCALED and other values, causing incorrect rendering in many games.
Fixes rendering in Portal 1, Hat in Time, Half-Life 2 and pretty much every other D3D9 title with DXVK.
Fixes: b38879f8c5 ("vallium: initial import of the vulkan frontend")
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11863>
(cherry picked from commit 1744372714)
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.