When the primitive type is points, EndPrimitive can't be used to count
primitives, so the vertex count has to be used instead. For points there
is also no need to count vertices per primitive or to overwrite
incomplete primitives.
Fixes: 2be99012e9 ("nir: Add ability to count emitted GS primitives.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17805>
(cherry picked from commit 84956286a8)
TEX instructions can't write xyz and w to separate registers so we
need to create variables from them first, otherwise we can create
two variables from ALU writing the same register xyz and w in other
branch (this usually works when TEX is not present as the xyz and
w can read/write from different registers).
This fixes regalloc because the variables are later used as
graph nodes.
The variable order should not matter but it slightly does (leading
to approx 0.3% shader-db temps increase as compared to previous
state), so just sort the variables list afterwards to be as close
to the previous behavior as possible and prevent the regression.
CC: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6936
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17987>
(cherry picked from commit 88fd397c74)
Due to the driver live shader cache, it's possible
two different d3d9 shaders get the same cso.
As it's disallowed to destroy a shader cso being
bound, nine checks for this scenario. However it
was not taking into account the cso might be from
a different shader.
cc: mesa-stable
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18021>
(cherry picked from commit 93da6e9f34)
The nir shader memory is freed in nir_to_tgsi(), but the already
freed shader info is referenced later when creating the compute state.
To avoid referencing the freed memory, copy the shader info first before
calling nir_to_tgsi.
Fixes vmx crash running aztec on SVGA driver.
Fixes: 580f1ac473 ("nir: Extract shader_info->cs.shared_size out of union")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17999>
(cherry picked from commit 4393be8291)
We can't invalidate CCU if there is any dirty data that hasn't been
flushed yet. In the case where we clear depth, we know that the depth
attachment itself isn't dirty but there may be dirty data from other
renderpasses. Therefore we need to flush before invalidating depth.
Fixes: 487aa80 ("tu: Rewrite flushing to use barriers")
Closes: #6987
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17940>
(cherry picked from commit a7e64ab63c)
For example, "DS -> branch -> VMEM -> branch -> DS".
fossil-db (navi10):
Totals from 639 (0.40% of 161220) affected shaders:
Instrs: 629090 -> 628254 (-0.13%); split: -0.19%, +0.06%
CodeSize: 3410164 -> 3406748 (-0.10%); split: -0.14%, +0.04%
Latency: 7834755 -> 7821011 (-0.18%); split: -0.70%, +0.52%
InvThroughput: 1369698 -> 1374495 (+0.35%); split: -0.12%, +0.47%
A lot of the fossil-db changes are noise.
threekingdoms.8db138826c386a62.1.foz/0b222ed175eebad0 is an example of a
shader that actually has this issue.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c037ba1bb7 ("aco/gfx10: Mitigate LdsBranchVmemWARHazard.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17697>
(cherry picked from commit b17e59a03b)
OpenGL API calls like glClearBufferData() result in mapping/unmapping
of a given buffer by Mesa and unmapping of a host blob fails in
virglrenderer because VirGL driver uses command that is intended for
unmapping of a guest buffer. In particular this causes problem for the
"Total War: Warhammer" game that gets GL_OUT_OF_MEMORY error due to the
failed unmapping command. Fix this by setting the mapping usage flag in
accordance to the resource flags, allowing virgl_buffer_transfer_unmap()
to differentiate host buffer from guest.
Fixes: 3b54e5837a ("virgl: support PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17914>
(cherry picked from commit 46396e97be)
Wa_14010455700 is dependent on the format and sample count, but our
code to track whether or not it had been applied was only dependent on
the format.
As a result, we failed to enable the workaround when an app used a D16
2xMSAA buffer, then a D16 1xMSAA buffer right afterwards.
Make the workaround tracking code sample-dependent to fix this.
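A minimal sketch of the tracking problem described above, written in Python for brevity. The `Tracker` class, the `needs_wa` rule (D16 at 1x MSAA) and all other names are assumptions made for illustration, not the driver's code.

```python
# Hypothetical model: the workaround state is only reprogrammed when the
# tracked key changes. All names and the needs_wa rule are assumptions.

def needs_wa(fmt, samples):
    # Assumed rule for illustration: D16 at 1x MSAA needs the workaround.
    return fmt == "D16" and samples == 1


class Tracker:
    def __init__(self, key_fn):
        self.key_fn = key_fn      # which part of the state is tracked
        self.last_key = None
        self.wa_enabled = False

    def bind_depth(self, fmt, samples):
        key = self.key_fn(fmt, samples)
        if key != self.last_key:  # reprogram only on a tracked change
            self.wa_enabled = needs_wa(fmt, samples)
            self.last_key = key
        return self.wa_enabled


format_only = Tracker(lambda fmt, samples: fmt)
format_and_samples = Tracker(lambda fmt, samples: (fmt, samples))

for tracker in (format_only, format_and_samples):
    tracker.bind_depth("D16", 2)  # workaround not needed
    tracker.bind_depth("D16", 1)  # workaround needed

print(format_only.wa_enabled)         # stale: the format did not change
print(format_and_samples.wa_enabled)  # updated: the sample count changed
```

Keying the tracking on both values is what lets the second bind be noticed.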
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17859>
(cherry picked from commit a75cd15b94)
Wa_14010455700 is dependent on the format and sample count, but our
code to track whether or not it had been applied was only dependent on
the format.
As a result, we failed to enable the workaround when an app used a D16
2xMSAA buffer, then a D16 1xMSAA buffer right afterwards.
Make the workaround tracking code sample-dependent to fix this.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17859>
(cherry picked from commit e7419c11ae)
if the tcs was generated, then the program was added to the non-tcs cache,
which means deleting it from the tcs+tes cache will fail and then
context_destroy will explode
Fixes: 4123ee3c71 ("zink: invoke descriptor_program_deinit for programs on context destroy")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17866>
(cherry picked from commit c7ef4f9735)
This is needed because when we switch between GLES and GL on the host,
we have to lower atomics to ssbo, and with that the shaders can't be
pulled from the cache anymore. Likewise when we move the disk image with
a shader cache to a different host, other features might change that
will need lowering. To avoid using stale shaders in this case, merge the
caps into the shader cache sha.
Fixes: d6db4d2e08 ("virgl: Add simple disk cache")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17798>
(cherry picked from commit f9703ac34d)
The workaround for draws that need a CP_WAIT_FOR_ME didn't work if the
barrier before the draw is in a separate command buffer from the draw.
The barrier would add a pending CP_WAIT_FOR_ME, but it would get dropped
on the floor at the end of the command buffer and the draw wouldn't have
a pending CP_WAIT_FOR_ME so it wouldn't emit one. We don't know in the
barrier if the destination is a draw with the workaround, so we have two
options:
- Emit any pending CP_WAIT_FOR_ME at the end of the command buffer (and
before secondaries) in case there is a workaround draw later. This
will emit an extra CP_WAIT_FOR_ME at the end of the command buffer in
case there is an indirect command barrier.
- Always assume at the beginning of the command buffer that there is a
pending CP_WAIT_FOR_ME. This will emit an extra CP_WAIT_FOR_ME before
the first workaround-requiring draw in the command buffer, in case
there was a barrier earlier.
The only draws requiring a workaround are currently
vkCmdDraw*IndirectCount(), which we assume are rarer than indirect
command barriers, so we implement the second option. This entails
treating it as a cache invalidate.
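The second option can be sketched as a small state model, in Python for brevity. The `CmdBuffer` class and its method names are hypothetical and only mirror the description above: a barrier recorded in one command buffer, and the workaround draw recorded in another.

```python
# Hypothetical model of pending-WFM tracking across command buffers.
# Class and method names are illustrative assumptions, not turnip code.

class CmdBuffer:
    def __init__(self, assume_pending_wfm):
        # Option 2: start every command buffer as if a wait were pending.
        self.pending_wfm = assume_pending_wfm
        self.packets = []

    def barrier_for_indirect(self):
        self.pending_wfm = True

    def draw_indirect_count(self):
        if self.pending_wfm:
            self.packets.append("CP_WAIT_FOR_ME")
            self.pending_wfm = False
        self.packets.append("DRAW")


def record(assume_pending_wfm):
    primary = CmdBuffer(assume_pending_wfm)
    secondary = CmdBuffer(assume_pending_wfm)
    primary.barrier_for_indirect()   # barrier lives in the primary
    secondary.draw_indirect_count()  # draw lives in the secondary
    return secondary.packets


old_packets = record(assume_pending_wfm=False)
new_packets = record(assume_pending_wfm=True)
print(old_packets)
print(new_packets)
```

In this model the only cost of the second option is one extra wait before the first workaround draw in a command buffer.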
This fixes some upcoming dynamic rendering CTS tests that do
vkCmdDrawIndirectCount() in a secondary but put the barrier for it in
the primary.
Fixes: 37939e9c54 ("turnip: Fix the lack of WFM before indirect draws")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
(cherry picked from commit c5be444500)
RECT textures used to be required to be supported by drivers. But since
the state-tracker learned how to lower these to 2D textures, some
drivers no longer support them.
While we have lowering in place for this, lowering it involves some
needless overhead. So let's just use a 2D texture instead of a RECT
texture.
Because having two versions and switching between them is more
complicated than it needs to be, let's just always use a 2D texture.
Similarly, let's just always multiply the reciprocal here, so we don't
have to test for PIPE_CAP_TGSI_DIV first.
Cc: mesa-stable
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17707>
(cherry picked from commit ba2146f93f)
When a flush happens the per-context setup is used to hold the fence
for the last scene sent to the rasterizer. However when multiple
contexts are in use, this fence won't get returned to be blocked on.
Instead move the last fence to the rasterizer object, and return
that instead as it should be valid across contexts.
Fixes gtk4 bugs on llvmpipe since overlapping vertex/fragment.
Fixes: 6bbbe15a78 ("Reinstate: llvmpipe: allow vertex processing and fragment processing in parallel")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17899>
The hwconfig api may change unexpectedly prior to public release of
new platforms. Also, public documentation of the hwconfig api
sometimes lags the release.
For these reasons, warnings about unhandled hwconfig keys are noisy,
likely to occur, and unhelpful to most users. This commit drops those
warnings, in favor of a separate internal process for tracking
hwconfig api changes.
Suggested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 6401d768b9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17862>
nr_images is the trigger for allocating double the number of buffers
for attributes. When there are no images, there is not always enough
space for ALIGN_POT(k, 2) to not move k out of bounds, so don't
execute the line in that case.
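As a rough illustration, in Python for brevity, of why the alignment can step out of bounds: rounding up to a multiple of 2 moves an odd index forward by one. `align_pot` mirrors the usual round-up-to-power-of-two formula; `next_slot` and its parameters are hypothetical names, not the driver's.

```python
def align_pot(x, pot):
    # Round x up to the next multiple of pot (pot must be a power of two).
    return (x + pot - 1) & ~(pot - 1)


def next_slot(k, nr_images):
    # Hypothetical guard: only align when images are present, since only
    # then is the extra space assumed to have been allocated.
    if nr_images == 0:
        return k
    return align_pot(k, 2)


print(align_pot(3, 2))            # an odd index moves forward by one
print(next_slot(3, nr_images=0))  # no images: index left untouched
```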
Fixes: dc85f65e05 ("panfrost: emit shader image attribute descriptors")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17447>
(cherry picked from commit fe613a8de9)
Before this patch, we were leaking compressed resources in iris_create_surface.
Specifically, when we failed to create an uncompressed ISL surface and view for
a compressed resource, we didn't unreference the resource pointer we referenced
into the pipe_surface.
Fix this by delaying the pipe_surface initialization code to after attempting
to create the uncompressed surface and view.
Cc: 22.1 <mesa-stable>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17598>
(cherry picked from commit 6c65e990b6)
Before this patch, we were leaking surface states in iris_create_surface.
Specifically, when we failed to create an uncompressed ISL surface and view for
a compressed resource, we didn't free surface states we allocated for it.
Fix this by attempting to create the uncompressed surface and view before we
allocate the surface states.
Cc: 22.1 <mesa-stable>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17598>
(cherry picked from commit bca601ffe9)
this is a weird corner case where glsl permits a zero value, so clamp to 1
and then don't emit any vertices to avoid driver hangs
affects:
dEQP-GL45-ES31.functional.geometry_shading.emit.points_emit_0_end_0
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17639>
(cherry picked from commit 5b58f8df53)
zink_kopper_acquire_readback will flush any outstanding clears, this means that
the current clears need to be applied first before calling zink_kopper_acquire_readback.
This was already done for native_blit and native_resolve, also do this for the emulated draw path.
Seen as intermittent failures in cts case GTF-GL33.gtf21.GL2FixedTests.buffer_clear.buffer_clear.
Cc: mesa-stable
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17631>
(cherry picked from commit 2159a377c0)
Spirv spec does not allow the use of OpEmitVertex or OpEndPrimitive when there are multiple streams.
Instead emit the multi-stream version of these with stream set to 0.
This issue was seen when testing cts case KHR-GL46.transform_feedback.draw_xfb_stream_test
Fixes: 35e346f428 ("zink: handle vertex streams")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17513>
(cherry picked from commit 3dfd8e4d7d)
Since we allocate this ourselves we can immediately add it to the
job at the time we allocate it.
This also fixes a bug we introduced when we implemented inline
uniforms because since that commit, if we had an inline uniform
buffer at index 1 which happened to have indirect access we would
track it in slot 0 instead of slot 1, potentially overwriting
the push constant buffer reference.
Fixes: ea3223e7a4 ('v3dv: implement VK_EXT_inline_uniform_block')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17536>
(cherry picked from commit e451c612df)
this can get called from multiple threads with the recent llvmpipe
overlapping rendering changes, so make sure to lock around the
map/unmapping so they can't race.
This should fix some crashes seen with kwin.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Tested-by: Adam Williamson (Fedora)
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17531>
(cherry picked from commit 50e3303b3d)
while some (tg4) sample ops can use different bit sizes in spirv, most
cannot, and all the shader variables are always emitted as 32bit, so
ensure the 32bit type is always what's being used for sampling
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17427>
(cherry picked from commit 49d5fa12f2)
Fixes dEQP-VK.ray_query.advanced.using_wrapper_function.comp.*
An empty struct is causing problems because when passing it as
argument the spirv parser will just drop the argument, considering it
does not hold any data.
v2: update radv CI
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4c703686db ("spirv: handle ray query intrinsics")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17420>
(cherry picked from commit a41e8dc588)
Conflicts:
src/amd/ci/radv-navi21-aco-fails.txt
src/amd/ci/radv-navi22-aco-fails.txt
src/amd/ci/radv-vangogh-aco-fails.txt
navi tests are removed as they don't exist in 22.1
Because clear colors are stored as 4 32bit component values, there is
an issue with, for instance:
- clearing in R16G16_UNORM
- draw in R32_UINT
Clear will use 2 components of the clear color in dword0 & dword1.
While draw will use only one component of dword0.
This change uses the mutable format information to track whether clear
colors can be non-zero for fast clears.
With :
- non mutable formats, we can fast clear with any color on Gfx > 8
- mutable formats with incompatible component sizes, we can only
fast clear with 0 color
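The mismatch can be modelled roughly as follows, in Python for brevity. The dword layout (one 32-bit float per component) and every helper name are assumptions for illustration; the sketch only shows why a zero clear color is the one case where both interpretations agree.

```python
import struct


def float_bits(x):
    # Raw IEEE-754 bits of a 32-bit float.
    return struct.unpack("<I", struct.pack("<f", x))[0]


def unorm16(x):
    return round(max(0.0, min(1.0, x)) * 65535)


def clear_dwords_rg16_unorm(r, g):
    # Assumed layout: one 32-bit value per component, unused ones zero.
    return [float_bits(r), float_bits(g), 0, 0]


def texel_bits_in_memory(r, g):
    # What an R32_UINT view would read if the clear had hit memory.
    return unorm16(r) | (unorm16(g) << 16)


def r32_uint_from_clear_color(dwords):
    # An R32_UINT view consumes only dword0 as its single component.
    return dwords[0]


for color in ((1.0, 1.0), (0.0, 0.0)):
    dwords = clear_dwords_rg16_unorm(*color)
    print(color, hex(texel_bits_in_memory(*color)),
          hex(r32_uint_from_clear_color(dwords)))
```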
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5930
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17329>
(cherry picked from commit 682383e5b3)
Conflicts:
src/intel/vulkan/anv_image.c
Bifrost (and Valhall) separate early-ZS configuration into two fields: when does
the depth/stencil buffer update happen? and when are pixels killed by the
depth/stencil tests? The driver separately configures these to occur early
(before the shader executes) or late (after the ATEST instruction executes at
the end of the shader). Early tests are generally more efficient, but various
combinations of API state and fragment shader properties can require late
updates and/or late kills for correctness. Determining how to configure these
fields is nontrivial.
Our current implementation (on Bifrost) configures these fields at fragment
shader compile time and bakes the settings into the RSD. This is both wrong
(using early testing when late testing is required) and suboptimal (using late
testing when early testing would suffice). We need to defer this configuration
until draw time, when we know rasterizer and Z/S state.
Reclassifying at draw time (as we currently do on Valhall) would be expensive,
especially with the extra terms added in here. To cope, decouple the shader
classification from the draw-time configuration. Since there are only a few bits
of draw state involved, this implementation just calculates all possible states.
Then the draw time classification is just indexing into a lookup table.
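The precompute-then-index idea can be sketched like this, in Python for brevity. The two draw-state bits and the `classify` rule are placeholders invented for the example; they are not the classification this commit implements.

```python
# Hypothetical sketch: evaluate a slow classifier once per possible draw
# state, then reduce draw-time work to a table lookup.

def classify(zs_write, alpha_to_coverage):
    # Placeholder rule, NOT the driver's real classification.
    return "late" if (zs_write and alpha_to_coverage) else "early"


NBITS = 2  # number of draw-state bits in this toy model


def state_index(zs_write, alpha_to_coverage):
    return int(zs_write) | (int(alpha_to_coverage) << 1)


# Built once, e.g. when the shader is compiled.
TABLE = [classify(bool(i & 1), bool(i & 2)) for i in range(1 << NBITS)]


def classify_at_draw(zs_write, alpha_to_coverage):
    # Draw time: a single index computation and load.
    return TABLE[state_index(zs_write, alpha_to_coverage)]


print(classify_at_draw(True, True))
```

The table has 2^n entries for n state bits, which stays small as long as only a few bits of draw state are involved.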
The actual algorithm used to classify is written with correctness and clarity in
mind. Unlike the current classification algorithm (which tries to match what the
DDK does, poorly), this algorithm embeds its proofs of correctness.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>
(cherry picked from commit e96292bc07)
In theory, ATEST can take any combination of registers for inputs.
Experimentally, however, ATEST requires the coverage mask in R60. This avoids
regressing the following dEQP tests, which write their coverage mask with
pixel-frequency-shading but without writing to the depth/stencil buffer.
dEQP-GLES31.functional.shaders.sample_variables.sample_mask.discard_half_per_pixel.*
This issue is known to affect both Mali-G52 (v7) and Mali-G57 (v9). I am unsure
if this is a silicon bug or just an obscure implementation detail.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17428>
(cherry picked from commit db2bdc1dc3)
Instead of trying to compact the surface state table to get rid of any
unused render targets, emit MAX(1, colorAttachmentCount) surface states
always. This ensures that secondaries will always match with primaries
when we go to do the copy since there's no rule requiring the secondary
to have VK_FORMAT_UNDEFINED when the primary has a NULL image view.
Fixes: 3501a3f9ed ("anv: Convert to 100% dynamic rendering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17013>
This pass tries to move register usage closer to SSA, and for large
shaders this means we can overflow the register index, which only has
RC_REGISTER_INDEX_BITS size. This creates invalid code and leads to
a crash at a later stage. Limit the pool of available registers to
RC_REGISTER_MAX_INDEX; previously it was two times the number of
shader instructions.
This means we'll fail the compile right away if we wanted more than
RC_REGISTER_MAX_INDEX temps, but when we've got that many we're
already well past how many instructions we can support anyway.
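A small sketch of the overflow, in Python for brevity. The value 10 for RC_REGISTER_INDEX_BITS and the relation RC_REGISTER_MAX_INDEX = 1 << RC_REGISTER_INDEX_BITS are assumptions for the example, and `store_index` and `alloc_temp` are hypothetical helpers.

```python
RC_REGISTER_INDEX_BITS = 10  # assumed value, for illustration only
RC_REGISTER_MAX_INDEX = 1 << RC_REGISTER_INDEX_BITS


def store_index(index):
    # A bitfield of RC_REGISTER_INDEX_BITS keeps only the low bits.
    return index & (RC_REGISTER_MAX_INDEX - 1)


def alloc_temp(next_free, pool_size):
    # Fail the compile instead of handing out an unrepresentable index.
    if next_free >= pool_size:
        raise RuntimeError("out of temporary registers")
    return next_free


# Unbounded pool: index 1029 silently aliases register 5.
print(store_index(alloc_temp(1029, pool_size=4096)))

# Bounded pool: the same request is rejected up front.
try:
    alloc_temp(1029, pool_size=RC_REGISTER_MAX_INDEX)
except RuntimeError as err:
    print(err)
```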
CC: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6017
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17393>
(cherry picked from commit 42a3d22f16)
the EXT_external_object spec originally was underspecified with regards
to this function, leaving room for synchronization errors where:
* app calls SignalSemaphoreEXT to signal a semaphore
* mesa defers pipe_context::fence_server_signal with threaded context
* driver defers gpu submission
* SignalSemaphoreEXT has long since returned, app submits vk cmdbuf waiting on semaphore
* spec violation / device lost
to prevent this, the spec is being changed to:
1) require an implicit flush when calling SignalSemaphoreEXT
2) require that this implicit flush also forces GPU submission before SignalSemaphoreEXT returns
all affected drivers have been updated
fixes #6568
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17376>
(cherry picked from commit 21b3a23404)
The Vulkan specification states:
> Query commands, for the same query and submitted to the same queue,
> execute in their entirety in submission order, relative to each other. In
> effect there is an implicit execution dependency from each such query
> command to all query commands previously submitted to the same queue.
Fixes dEQP-VK.query_pool.statistics_query.reset_after_copy.*
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17400>
(cherry picked from commit 768cd5715d)
Flushing the batch midframe (splitting a renderpass) is expensive on a tiler, as
it requires the GPU to flush the framebuffer contents to main memory and read
them back. Clearing the framebuffer should not trigger a flush. Apps expect
clears to be (almost) free, flushing for a clear is at the very least unexpected
behaviour.
The only reason we previously flushed is to ensure we could always use a "fast"
clear. But a slow clear is a heck of a lot faster than a flush ;-) Instead of
flushing, we should clear with a draw (via u_blitter) in case a fast clear isn't
possible.
This fixes pathological performance for applications that rely on partial clears
within a frame. This issue was identified with Inochi2D, which repeatedly clears
the stencil buffer midframe, in order to implement masking efficiently with the
stencil buffer. In total, the all-important workload of rendering Asahi Lina is
improved from 17fps to 29fps on a panfrost device.
Fixes: c138ca80d2 ("panfrost: Make sure a clear does not re-use a pre-existing batch")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17112>
(cherry picked from commit 638b22354e)
These opcodes were fixed to return an integer instead of a boolean
value some time ago but the documentation for them was not updated
and still talked about a boolean result.
Fixes: b0d4ee520 ('nir/opcodes: Fix up uadd_carry and usub_borrow')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>
(cherry picked from commit 84a0dca9df)
TGSI has no legitimate[1] notion of linked shaders, which means tgsi_to_nir
should conservatively assume all shaders are separable. This requires
setting nir->info.separate_shader to warn drivers that shader CSOs might be
mixed and matched. Otherwise, the driver might enable optimizations that
are invalid for separate shaders, causing issues when the shaders are
later treated as separable.
This will fix varying linking with u_blitter's shaders on Panfrost (Bifrost and
older), when util_blitter_clear is used with Panfrost.
[1] There was a TGSI property added recently to forward
nir->info.separate_shader up to virglrenderer, but it's not actually used for
anything in virglrenderer and I am still struggling to understand what the use
case would be. My gut says we should revert b634030542 ("tgsi: Add
SEPARABLE_PROGRAM property"), but I'm not interested in fighting that yak right
now. Notably, the u_blitter and hud shaders are separable but are not marked
with this property.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17282>
(cherry picked from commit 151aa19c21)
VUID-VkViewport-minDepth-01234 specifies that depth must be in the range [0.0, 1.0],
so the viewport must always be clamped to this range
this affects texture clears using u_blitter, as this expects to be able
to use the GL range of [-1.0, 1.0], so pass the depth value as though it's
been de-converted back to a GL z coordinate to account for viewport transform
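The "de-conversion" can be illustrated with the standard GL viewport depth transform and its inverse, in Python for brevity. The function names are hypothetical and the sketch assumes a [0.0, 1.0] depth range; how the driver actually feeds the value to u_blitter is not shown here.

```python
def gl_viewport_z(z_ndc, near=0.0, far=1.0):
    # GL viewport transform for depth: maps NDC z in [-1, 1] to [near, far].
    return (far - near) / 2.0 * z_ndc + (far + near) / 2.0


def depth_to_gl_z(depth, near=0.0, far=1.0):
    # Inverse of gl_viewport_z: the GL z that would produce `depth`.
    return (2.0 * depth - (far + near)) / (far - near)


for depth in (0.0, 0.25, 1.0):
    z = depth_to_gl_z(depth)
    print(depth, z, gl_viewport_z(z))
```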
cc: mesa-stable
fixes #6757
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17319>
(cherry picked from commit 90c5eea22b)
The postponed spill is predicated using the condition from the
last write, but this is only correct if the register was only
written once in the TMU sequence, or if it is always written with
the same predication.
While we could try to track whether this is the case or not, it
would make the postponed spill path even more complex than it
already is, so let's just avoid predicating these. We are already
discouraging TMU spilling of registers in the middle of TMU
sequences, so this should not be a very common case.
Cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17201>
(cherry picked from commit cfccd93efc)
If we are spilling a register that is used in the middle of a TMU
sequence, we postpone the spill until the TMU sequence finishes,
at which point we inject the spill and rewrite the original
instruction to write to the new temp.
However, this doesn't work if the register is written multiple
times during the TMU sequence. In that scenario, we need to ensure
that all writes are rewritten to use the new temp, not just the last
one.
Cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17201>
(cherry picked from commit 98420408d0)
ralloc is not thread-safe. While a given context can only be accessed from a
single thread at once, multiple contexts can be created against the same screen
at once. The ralloc allocations against the shared screens will race. Depending
on the result of the race, the same block of memory can be returned as the two
new contexts in two different threads, causing a use-after-free when the context
is freed later.
We free the context explicitly when it's destroyed anyway. If screens are
getting destroyed without the contexts getting destroyed first, that's a state
tracker bug, not a Panfrost one.
This matches what Iris does.
Fixes crash in test_integer_ops.int_math on Panfrost.
Fixes: 0fcf73bc2d ("panfrost: Move to use ralloc for some allocations")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17234>
(cherry picked from commit f18492faa9)
Otherwise passes which expect offsets to be in bytes (like
brw_nir_lower_mem_access_bit_sizes, called from brw_postprocess_nir)
may produce incorrect results.
Fixes 64-bit load/stores in task/mesh shaders.
Fixes: c36ae42e4c ("intel/compiler: Use nir_var_mem_task_payload")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16196>
(cherry picked from commit 42b551fe7f)
only the viewMask parameter of VkPipelineRenderingCreateInfoKHR can
be accessed in the fragment stage, so for pipeline libraries it should
be assumed that zs attachments exist for the purpose of copying dynamic
state values, and then these dynamic states will naturally be pruned
during final pipeline construction if the attachments turn out to not
be present
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17219>
(cherry picked from commit 2a69aeb9c1)
For example, the proof for this pattern
(('bcsel', ('flt', 'a@32', 0), 'b@32', 'c@32'), ('fcsel_ge', a, c, b)),
would be
bcsel(a < 0, b, c)
bcsel(!(a < 0), c, b)
bcsel(a >= 0, c, b)
fcsel_ge(a, c, b)
However, !(a < 0) => (a >= 0) is well known to produce different
results if `a` is NaN.
Instead of that replacement, use this replacement:
bcsel(a < 0, b, c)
bcsel(-0 < -a, b, c)
bcsel(0 < -a, b, c)
fcsel_gt(-a, b, c)
This is NaN-safe and exact.
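The two derivations can be checked numerically with a short Python model. `bcsel`, `fcsel_ge` and `fcsel_gt` below are stand-ins written from the proofs above (select on `a >= 0` or `a > 0`), not the NIR implementation.

```python
import math


def bcsel(cond, x, y):
    return x if cond else y


def fcsel_ge(a, x, y):
    # Modelled as: a >= 0 ? x : y
    return x if a >= 0.0 else y


def fcsel_gt(a, x, y):
    # Modelled as: a > 0 ? x : y
    return x if a > 0.0 else y


b, c = "b", "c"
samples = [math.nan, -math.inf, -1.0, -0.0, 0.0, 1.0, math.inf]


def mismatches(replacement):
    # Inputs where the replacement disagrees with bcsel(a < 0, b, c).
    return [a for a in samples if replacement(a) != bcsel(a < 0.0, b, c)]


old_bad = mismatches(lambda a: fcsel_ge(a, c, b))
new_bad = mismatches(lambda a: fcsel_gt(-a, b, c))
print(old_bad)  # only NaN disagrees with the old replacement
print(new_bad)  # the new replacement agrees everywhere
```

Signed zeros are included in the samples because the second derivation relies on -0 and 0 comparing equal.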
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Fixes: 0f5b3c37c5 ("nir: Add opcodes for fused comp + csel and optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
(cherry picked from commit a2a2fbc510)
wezterm in fullscreen 4k was exceeding the xcb max request size
on the put image with llvmpipe. This fixes it to send sub-images,
the Xlib put image used in glx does this internally, but not
the xcb one, so just do it in sections here.
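Splitting the upload into sections amounts to slicing the image into bands of rows that each fit in one request. A minimal sketch, in Python for brevity; `split_rows`, `header_bytes` and the 4 MiB limit are assumptions for the example, not the actual xcb limit or the code in this commit.

```python
def split_rows(height, stride, max_bytes, header_bytes=0):
    # Yield (y, rows) bands whose pixel data fits in a single request.
    rows_per_request = (max_bytes - header_bytes) // stride
    if rows_per_request < 1:
        raise ValueError("a single row does not fit in one request")
    y = 0
    while y < height:
        rows = min(rows_per_request, height - y)
        yield y, rows
        y += rows


# Example: a 3840x2160 image at 4 bytes per pixel, with an assumed limit.
stride = 3840 * 4
limit = 4 * 1024 * 1024
bands = list(split_rows(2160, stride, limit))
print(len(bands), bands[0], bands[-1])
```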
Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
(cherry picked from commit e6082ac62e)
Fix the transform to make sure it doesn't disturb the depth range
of the blitted image. Set the Z coordinates of the vertices
by hand instead of relying on the transform to do it.
This is a pre-requisite to Zink always enabling depth clamping.
Fixes: 26c6640835
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16929>
(cherry picked from commit 810135fb42)
Conflicts:
src/gallium/drivers/lima/ci/lima-fails.txt
r300 and r400 have strict rules with swizzles, so we
will need to convert swizzle back.
Operating on 0, 1, H in this case unnecessarily makes
rest of r300 overly complicated.
(also it's not currently able to handle this)
helps with:
deqp-gles2@functional@shaders@random@exponential@fragment@24
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17117>
(cherry picked from commit 6cbb19110b)
Whether a sync object is used cannot depend on where the batch is
submitted from, remove the in_sync and out_sync fields from
panfrost_batch_submit.
Always use an output syncobj, this is required for glFinish to work
correctly. This could be skipped for batches which another batch
depends on, but because of the existence of empty batches which emit
no job, doing so is not trivial.
Never use an input syncobj. There appears to be no point to this, the
kernel driver does implicit sync anyway.
Fixes "seconds per frame" rendering with Neverball; previously, every
batch was submitted with out_sync=0, so DRI's frame throttling could
do nothing. New jobs would keep getting submitted until more than a
thousand were queued in the kernel, which increased rendering latency
for the compositor far beyond acceptable levels.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16966>
(cherry picked from commit d8803c724b)
usually inlining is optimal for cpu drivers since the majority of
time is spent in the shaders, and any amount of reduction to shader code
will be optimal
if, however, the shaders are still really big after inlining, this improvement
will be negated by the insane amount of time spent doing stupid llvm optimizer
passes, so check post-inline size to see whether it exceeds a size threshold
lavapipe release build - 1700% improvement
* spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec4-index-rd-after-barrier
before: 142.15s user 0.42s system 99% cpu 2:23.14 total
after: 8.60s user 0.07s system 99% cpu 8.677 total
fixes #6647
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16977>
If multisample is enabled and alpha testing happens, the
branch can jump out of the fragment shader before the other
samples are generated. Just don't take the branch optimisation
post alpha test if multisample is enabled.
This should fix some rendering bugs in kicad with multisample
enabled.
Cc: mesa-stable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17049>
(cherry picked from commit 6e5d126a65)
This was an optimization done a while ago that doesn't seem to be having
much of an impact anymore, and on the other hand, causes all sorts of
breakage with queries, as many of our HW counters don't get incremented
when rasterization is disabled.
This fixes a bunch of issues Zink has with ANV, but more importantly, it
fixes upcoming CTS tests:
dEQP-VK.transform_feedback.primitives_generated_query.*.empty_frag.*
dEQP-VK.transform_feedback.primitives_generated_query.*.no_attachment.*
dEQP-VK.transform_feedback.primitives_generated_query.*.color_write_disable_*
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17038>
(cherry picked from commit 4666ef720e)
Conflicts:
src/gallium/drivers/zink/ci/zink-anv-tgl-fails.txt
CI file was deleted as it doesn't exist on 22.1
These are normally only set once because they're constant across the entire
renderpass, but they're trashed by the 3d store path because it needs to
store to CCU instead of GMEM. Therefore we need to save/restore them. Do
it in a way compatible with #5181.
Fixes: b157a5d ("tu: Implement non-aligned multisample GMEM STORE_OP_STORE")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17058>
(cherry picked from commit cba6da2b21)
Even though image views for attachments must use the identity swizzle,
there are cases where we have to add in our own swizzle, in particular
for D24S8 when the view is depth-only/stencil-only. Therefore we have to
reset it to the identity, similar to what we do with input attachments.
Fixes: b157a5d ("tu: Implement non-aligned multisample GMEM STORE_OP_STORE")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17058>
(cherry picked from commit 705c0d0373)
Conflicts:
src/freedreno/ci/freedreno-a630-fails.txt
this was correct for 64bit loads and manually converted 32bit loads (e.g., bindless),
but it was broken for the case where 64bit was not supported, as the offset wasn't
being correctly adjusted
break out the offset division to hopefully make this a little clearer
Fixes: 150d6ee97e ("zink: move all 64-32bit shader load rewriting to nir pass")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16669>
(cherry picked from commit bbe5136658)
Conflicts:
src/gallium/drivers/zink/ci/zink-tu-a630-fails.txt
CI file removed as it doesn't exist in 22.1
each sampler is 1 driver location, so use the base variable
Fixes: 2d745904ca ("zink: add a gently mangled version of the d3d12 cubemap -> array compiler pass")
fixes:
dEQP-GL45-ES31.functional.shaders.opaque_type_indexing.sampler.const_expression.*.samplercubearray
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17008>
(cherry picked from commit de6af39534)
This is not safe because it may skip regenerating the flags for the
loop condition in the loop continue block and these flags may be
stomped in the loop body by other conditionals.
Fixes: 9909fe6ba ('broadcom/compiler: Skip bool_to_cond where possible')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17020>
(cherry picked from commit a97f78eb14)
Looks like some hardware needs this info in the shader to match the
topology. Since there's no spot in the shader info for it, we're
currently using the array size of the TCS input vars to store it.
Cc: mesa-stable
Reviewed-by: Paul Dodzweit <paul.dodzweit@amd.com>
Tested-by: Paul Dodzweit <paul.dodzweit@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16920>
(cherry picked from commit cc805aef69)
Conflicts:
src/microsoft/compiler/dxil_nir.h
The hardware writes one CRC per (effective) tile, so the tile size of the CRC
buffer is the same as the configured effective tile size. However, all our CRC
infrastructure assumes 16x16 tiles. In case CRC is used with smaller tiles,
buffer overflows and incorrect rendering are all possible. Don't use CRC at
smaller tile sizes. Note disabling CRC correctly invalidates any bound CRC
buffers.
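A rough sketch of why the 16x16 assumption overflows, assuming one CRC
entry per tile. The helper names are hypothetical and are not the
driver's actual functions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: number of tiles (and so CRC entries, assuming one
 * entry per tile) needed to cover a width x height surface. */
static uint32_t tile_count(uint32_t width, uint32_t height,
                           uint32_t tile_w, uint32_t tile_h)
{
    uint32_t tiles_x = (width + tile_w - 1) / tile_w;
    uint32_t tiles_y = (height + tile_h - 1) / tile_h;
    return tiles_x * tiles_y;
}

/* Hypothetical policy check: only allow CRC when the effective tile is at
 * least the 16x16 size that the buffer allocation assumes. */
static bool crc_allowed(uint32_t tile_w, uint32_t tile_h)
{
    return tile_w >= 16 && tile_h >= 16;
}
```

For a 64x64 surface, a buffer sized for 16x16 tiles has room for 16
entries, while 8x8 effective tiles would produce 64.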
Fixes: 2e97d7c835 ("panfrost: Transaction elimination support")
Closes: #6332
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16983>
(cherry picked from commit 44223e5f28)
It has a single user -- in a section of code that only runs for MFBD GPUs and
that has already decided whether to use CRCs -- so inlining it simplifies its
definition greatly and may avoid redeciding the CRC setting.
[Note for mesa-stable maintainers: This is not a bug fix but is marked for
backport so the next patch applies cleanly.]
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16983>
(cherry picked from commit cac0578ee5)
all of ntv requires scalarized io since the offsets are now array indices
instead of byte offsets, so enforce scalarization here to avoid breaking
the universe
Fixes: 150d6ee97e ("zink: move all 64-32bit shader load rewriting to nir pass")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16669>
(cherry picked from commit cdaa601de3)
With the following structures:
   struct StructA
   {
      uint64_t value0;
      uint8_t value1;
   };
   struct TopStruct
   {
      struct StructA a;
      uint8_t value3;
   };
Currently offsetof(struct TopStruct, value3) = 9, while the same code
on the CPU gives offsetof(struct TopStruct, value3) = 16.
This is impacting OpenCL kernels we're trying to use to build
acceleration structures.
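The CPU-side result can be checked with a small sketch. It assumes a
typical ABI where uint64_t is 8-byte aligned; align_up, struct_a_size
and value3_offset are illustrative helpers, not part of the patch:

```c
#include <stddef.h>
#include <stdint.h>

struct StructA {
    uint64_t value0;
    uint8_t value1;
};

struct TopStruct {
    struct StructA a;
    uint8_t value3;
};

/* Round offset up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Size of a struct holding a uint64_t followed by a uint8_t: the 9 bytes of
 * members are padded up to the struct alignment, which is that of uint64_t. */
static size_t struct_a_size(size_t u64_align)
{
    size_t end_of_members = 8 + 1;
    return align_up(end_of_members, u64_align);
}

/* value3 has alignment 1, so it lands right after the padded StructA. */
static size_t value3_offset(void)
{
    return offsetof(struct TopStruct, value3);
}
```

With 8-byte alignment the padded size of StructA is 16, which is where
value3 ends up. The offset of 9 corresponds to skipping that tail padding.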
v2: Add comment/link to some description of the alignment/size
computation
Cc: mesa-stable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16940>
(cherry picked from commit 133620196d)
the 'notemplates' debug mode is somewhat misleading since there's no
uncached+notemplates mechanism, meaning that if the descriptor cache
explodes it'll still use templates for updating in the fallback path
Fixes: 4e3768914d ("zink: add ZINK_DESCRIPTORS env var to explicitly set a mode")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16927>
(cherry picked from commit ee1a0a0772)
The size must be the size of the total object, not the size
of the resource.
For instance, when using a single object for a multi-plane
format, the size of each plane should be equal to the size
of the underlying object to match libva's documentation:
/** Total size of this object (may include regions which are
* not part of the surface). */
uint32_t size;
Fixes: 13b79266e4 ("frontend/va: Setting the size of VADRMPRIMESurfaceDescriptor")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16813>
(cherry picked from commit bce227611d)
These do not convey any additional information, and fail to account for
shrinking. In particular, a 64-bit writemask with .keephi would fail to
disassemble and instead trip the assertion, since that would be the ZW
components. Just delete the broken code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>
(cherry picked from commit 8c11f4809b)
Otherwise, we can get vec3 with u2u32 with 64-bit sources which we need lowered.
Since our current approach is "scalarize all 64-bit ops", we need to check for
conversions too.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>
(cherry picked from commit 9e4b457958)
Conflicts:
src/panfrost/ci/panfrost-t860-fails.txt
Some blitter operations, like clear, don't require saving all the
states.
This is particularly important because, besides saving time, the blitter
operation restores only the states required for the operation, and if we
saved more states than those, the extra ones won't be restored and will
be leaked.
So this also fixes some leaks when running CTS tests.
CC: mesa-stable
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16837>
(cherry picked from commit 695f66cecd)
st_texture_release_all_sampler_views uses the validate_mutex,
but st_get_texture_sampler_view_from_stobj didn't.
Since they both modify stObj->view we could have threadA in
st_get_texture_sampler_view_from_stobj with a non-NULL sv,
so expecting sv->view to be non-NULL, while threadB was in
st_texture_release_all_sampler_views clearing sv->view.
It's also needed to protect st_sampler_view::private_refcount,
which is supposed to be used from the owning context thread,
but can also be used by any context in st_texture_release_all_sampler_views.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6088
Fixes: ef5d427413 ("st/mesa: add a mechanism to bypass atomics when binding sampler views")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16779>
(cherry picked from commit b51e40ebde)
Right now, we just consider the size of the accessed portion of the
push constant array, but it doesn't necessarily reflect the size
of the UBO we should declare.
Fixes: de1e941c59 ("microsoft/spirv_to_dxil: Lower push constant loads to UBO loads")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16703>
(cherry picked from commit 2feef505c1)
This expression accidentally performs a 32-bit sign-extension when
processing the second half of the expression (the low 16 bits).
Consider -7W, which is represented as 0xfff9fff9 in our encoding (the
16-bit word is replicated to both halves of the 32-bit dword).
Tigerlake's compaction stores the low 11 bits of an immediate as-is,
and replicates the 12th bit. So here, compacted_imm will be 0xff9.
   ( (int)(0xff9 << 20) >> 4) |
   ((short)(0xff9 <<  4) >> 4)
   0xfff90000 | (0xff90 >> 4)
   0xfff90000 | 0xfffffff9      ...oops...
   0xfffffff9
By casting the second line of the expression to unsigned short, we
prevent the sign-extension when it combines both parts, so we get:
0xfff90000 | 0x0000fff9
0xfff9fff9
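The arithmetic above can be reproduced with a standalone sketch. The
function names are hypothetical and this is not the actual compaction
code. It assumes two's complement narrowing conversions and an
arithmetic right shift of negative values, as on GCC and Clang:

```c
#include <stdint.h>

/* Buggy form: the low half is a signed 16-bit value, so it is
 * sign-extended to 32 bits when promoted to int for the OR. */
static uint32_t uncompact_imm_buggy(uint32_t compacted_imm)
{
    int32_t hi = (int32_t)(compacted_imm << 20) >> 4;
    int16_t lo = (int16_t)((int16_t)(compacted_imm << 4) >> 4);
    return (uint32_t)(hi | lo);
}

/* Fixed form: casting the low half to an unsigned 16-bit value means
 * it is zero-extended, so the high half survives the OR. */
static uint32_t uncompact_imm_fixed(uint32_t compacted_imm)
{
    int32_t hi = (int32_t)(compacted_imm << 20) >> 4;
    uint16_t lo = (uint16_t)((int16_t)(compacted_imm << 4) >> 4);
    return (uint32_t)(hi | lo);
}
```

Positive values such as 7W (compacted 0x007) come out the same in both
forms, which is why only negative immediates hit the bug.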
Fixes: 12d3b11908 ("intel/compiler: Add instruction compaction support on Gen12")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16833>
(cherry picked from commit 26bb81f3f6)
This allows glamor to successfully compile its shaders on the GC400.
When running glamor using the GC400, Xorg reports that the compiled
shaders exceed the maximum allowed instructions because the value
reported from the kernel is halved.
Xserver[314]: etna_draw_vbo:318: compiled shaders are not okay
$ cat /sys/kernel/debug/dri/128/gpu | grep instruction_count
instruction_count: 256
However, the spec for the Unified vertex-fragment shader explicitly
lists 256 as the maximum number of instructions for each shader
("256 for vertex shaders; 256 for fragment shaders").
Signed-off-by: Kyle Russell <bkylerussell@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit aa29e0d858)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16661>
The BITFIELD_MASK() macro is intended for use with actual bitfields,
not with nir_component_mask_t. This means we do some extra work to
handle values that are invalid for nir_component_mask_t in the first
place.
This eliminates some warnings on Clang, where the compiler complains
about casting UINT32_MAX to UINT16_MAX.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15547>
this avoids looking at irrelevant 3d pixelstore params like
GL_PACK_IMAGE_HEIGHT when they don't apply, which will cause the storage
buffer to be incorrectly sized and break the operation
Fixes: e7b9561959 ("gallium: implement compute pbo download")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16728>
(cherry picked from commit 70fb3a4700)
During certain control-flow manipulation passes, we go out-of-SSA
temporarily in certain areas of the code to make control-flow
manipulation easier. This can result in registers being in phi sources
temporarily. If two sub-passes run before we get a chance to do
clean-up, we can end up doing some out-of-SSA and then a bit more
out-of-SSA and trigger this case. It's easy enough to handle.
Fixes: a620f66872 ("nir: Add a couple quick-and-dirty out-of-SSA helpers")
Fixes: 79a987ad2a ("nir/opt_if: also merge break statements with ones after the branch")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6370
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16111>
(cherry picked from commit 4a4d6cdc80)
VkAccelerationStructureBuildRangeInfoKHR spec:
If the geometry uses indices, primitiveCount × 3 indices are consumed from VkAccelerationStructureGeometryTrianglesDataKHR::indexData, starting at an offset of primitiveOffset. The value of firstVertex is added to the index values before fetching vertices.
If the geometry does not use indices, primitiveCount × 3 vertices are consumed from VkAccelerationStructureGeometryTrianglesDataKHR::vertexData, starting at an offset of primitiveOffset + VkAccelerationStructureGeometryTrianglesDataKHR::vertexStride × firstVertex.
Meaning: We always add firstVertex * vertexStride
to the vertex address and add primitiveOffset
either to the vertex address or the index address,
depending on whether indices are used.
Also add missing handling with instances.
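A minimal sketch of the address rule described above. The struct and
function names are hypothetical, and the real code operates on the build
range info rather than raw integers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: resolved start addresses for one geometry. */
struct resolved_addrs {
    uint64_t vertex_addr;
    uint64_t index_addr;
};

static struct resolved_addrs
resolve_triangle_addrs(uint64_t vertex_data, uint64_t index_data,
                       bool uses_indices, uint32_t primitive_offset,
                       uint32_t first_vertex, uint64_t vertex_stride)
{
    struct resolved_addrs out;

    /* firstVertex * vertexStride always goes to the vertex address. For
     * indexed geometry this is equivalent to adding firstVertex to every
     * index value before fetching. */
    out.vertex_addr = vertex_data + (uint64_t)first_vertex * vertex_stride;
    out.index_addr = index_data;

    /* primitiveOffset goes to whichever buffer the primitives are read from. */
    if (uses_indices)
        out.index_addr += primitive_offset;
    else
        out.vertex_addr += primitive_offset;

    return out;
}
```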
Fixes: 0dad88b ("radv: Implement device-side BVH building.")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16719>
(cherry picked from commit 9be00573c4)
Conflicts:
src/amd/vulkan/radv_acceleration_structure.c
It turns out that we need a fragment shader for streamout. Why? From
Lionel's reading of simulator sources, it seems the streamout unit is
looking at enabled next stages. It'll generate output to the clipper in
the following cases:
- 3DSTATE_STREAMOUT::ForceRendering = ON
- PS enabled
- Stencil test enabled
- depth test enabled
- depth write enabled
- some other depth/hiz clear condition
Forcing rendering without a PS seems like a recipe for hangs so it's
probably better to just enable the PS in this case.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
(cherry picked from commit 0d28de212a)
Starting with Ivy Bridge, we implement alpha-to-coverage by writing
gl_SampleMask with a pattern based on alpha. This will show up in
wm_prog_data::uses_omask so we don't need to look at the key.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
(cherry picked from commit 9fe6caf4e7)
According to the Vulkan spec 21.4 "Conditional Rendering",
only clearing attachments with vkCmdClearAttachments is subject to
conditional rendering.
Subpass clears and vkCmdClearColorImage / vkCmdClearDepthStencilImage
should always be executed even if they happen in a
conditional rendering block.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16654>
(cherry picked from commit 55466ca506)
It's meaningful for this intrinsic and so does not add noise to the
lowering pass.
(Although dual-source writes must be to RT 0, depth and stencil
writes, which store_combined_output_pan is also used for, can still be
done with MRT enabled.)
Fixes: 5c168f09eb ("nir: Eliminate store_combined_output_pan BASE")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16685>
(cherry picked from commit 9f9ed959bd)
D3D12 fences are capable of handling binary operations, but the
current dzn_sync implementation doesn't match vk_sync expectations
when sync objects are used to back semaphores. In that case, the wait
operation is supposed to set the sync object back to an unsignaled
state after the wait succeeded, but there's no way of knowing what
the sync object is used for, and this implicit-reset behavior is not
expected on fence objects, which also use the sync primitive.
That means we currently have a semaphore implementation that works
only once, and, as soon as the semaphore object has been signaled it
stays in a signaled state until it's destroyed.
We could extend the sync framework to pass an
implicit-reset-after-wait flag, but, given no one else seems to
need that, it's probably simpler to drop the binary sync
capability and rely on the binary-on-top-of-timeline emulation provided
by the core.
Fixes: a012b21964 ("microsoft: Initial vulkan-on-12 driver")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16629>
(cherry picked from commit 1eaba553e2)
Move can take in a vector and write a scalar, depending on the swizzle. We need
to handle this case. Split out mov and pack_32_2x16 so we can specify correct
behaviour for both. Also drop unused 1-bit boolean stuff which obscured the fix.
Fixes: 76cea8e27b ("panfrost: Fix pack_32_2x16 implementation")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>
(cherry picked from commit 9924e6f291)
SVGAv3 changes the PCI id due to differences in how PCI configuration
is handled - removal of VRAM and FIFO PCI resources, switch to MMIO
registers and MSI/MSI-X IRQ support but the 3D commands remain largely
the same.
This enables 3D/graphics acceleration support on SVGAv3.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 16019ff7cc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
SVGA device always supports direct maps which are preferable in all cases
because they avoid temporary surfaces and extra transfers. Furthermore
DMA transfers on devices with GB objects have undefined timing semantics.
Also the DMA transfers can not work on SVGAv3 because the device lacks
VRAM to be able to perform them.
Fix the last paths still using DMA transfers to make sure they're never
used on GB enabled configs. This fixes gnome-shell startup on SVGAv3.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Michael Banack <banackm@vmware.com>
(cherry picked from commit e5306d190a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
Flushing the command queue before mapping a resource is not enough
to guarantee that the mapped content is not stale. We have to finish
to make sure that the gb readback actually updated the guest surface.
This fixes races in direct maps (map reads raced with gb readbacks)
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
(cherry picked from commit c7b0309723)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
svga used to use vmx backdoor directly to send logs to the host.
This functionality has been implemented in vmwgfx 2.17, but
to make sure we still work with old kernels the functionality
to use the backdoor directly has been kept.
There's no reason to port that code to arm since vmwgfx
implements it and arm64 (or other new platforms) would
depend on vmwgfx versions a lot newer than 2.17, so everywhere
but on x86/x64 it's fine to assume vmwgfx always supports the host
logging ioctls.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
(cherry picked from commit 71a749bc7b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16564>
This reverts commit 68ef895674.
When trying out transcode_astc=true with BPTC on Asphalt 9, we observed
very poor image quality - to the point that basic UI icons were blocky,
and buttons with a black border had smeared pixels on the edges. Using
DXT5 had no such issues.
I originally suspected there was a bug in the BPTC encoder, but I now
believe the issue is deeper than that. The commit that introduced the
encoder, 17cde55c53, says:
"The compressor is written from scratch and takes a very simple
approach. It always uses a single mode of the BPTC format (4 for
unorm and 3 for half-floats) and picks the two endpoints by dividing
the texels into those which have more or less than the average
luminance of the block and then calculating an average color of the
texels within each division.
It's probably not really sensible to try to use BPTC compression at
runtime because for example with the Nvidia offline compression tool
it can take in the order of an hour to compress a full-screen image.
With that in mind I don't think it's worth having a proper compressor
in Mesa and this approach gives reasonable results for a usage that
is basically a corner case."
In other words, the reason our BPTC compressor was so fast is that it
only implements one of the modes and does a low quality approximation.
This honestly should probably be improved somewhat, but the original
use case was for online-compression, the uncommon but mandatory OpenGL
feature where you can supply uncompressed data and trust the driver to
compress it for you (at unknown and uncontrolled quality and speed).
Unfortunately, the compressor as it stands is simply not usable for
transcoding ASTC data where we want to preserve the underlying image
quality as much as possible.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16566>
(cherry picked from commit c54555c496)
without dynamic vertex input, pipeline vertex state must be recalculated
if buffer strides change or the enabled buffer mask changes in order
to accurately handle dynamic state stride VUs
cc: mesa-stable
fixes:
spec@!opengl 1.1@array-stride
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16563>
this is one of those cases where some bizarro format is being created
for sampling only, but gallium blasts out all the bind flags at once
trust that we're not going to do anything too crazy and let surface
usage pruning handle the rest
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16563>
the idea here is that if a resource is intended to be used solely as a rendertarget
and can't be blitted to or drawn to, then resource creation should fail
but if the resource might be e.g., a texture, then it can probably hit the subdata
path and be fine
Fixes: 37ac8647fc ("zink: reject resource creation if format features don't match attachment")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16563>
depth and stencil states should only be set if the corresponding attachment
is present, otherwise they should be ignored. this is different from
ignoring the entire VkPipelineDepthStencilStateCreateInfo struct, as
it's possible that only depth or only stencil may be present
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16457>
(cherry picked from commit 0b2d383316)
Previously this check was skipped for pools with
VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT unset, but after
96a240e1 we need to check this otherwise we risk overflowing
radv_descriptor_pool::entries into the host memory base
This fixes a crash to desktop when launching Dota 2, which overallocates
descriptor sets and expects an error to allocate another descriptor pool
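The kind of check being restored can be sketched as follows. This is a
simplified model with hypothetical names, not the radv structures or
return codes:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the allocation result codes. */
enum alloc_result {
    ALLOC_OK,
    ALLOC_OUT_OF_POOL_MEMORY,
};

/* Simplified pool: a fixed-capacity entries array is assumed to follow. */
struct pool_model {
    uint32_t entry_count;
    uint32_t max_entry_count;
};

/* Refuse to allocate once the entries array is full, regardless of how
 * the pool was created, so the caller can fall back to another pool. */
static enum alloc_result pool_alloc_entry(struct pool_model *pool)
{
    if (pool->entry_count >= pool->max_entry_count)
        return ALLOC_OUT_OF_POOL_MEMORY;

    pool->entry_count++;
    return ALLOC_OK;
}
```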
Fixes: 96a240e176 ("radv: fix memory leak of descriptor set layout")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16490>
(cherry picked from commit 580046e49f)
this existed in order to reset query pools in advance so they
would never overflow. now the pools are reset every time the query
is started, so this behavior is no longer necessary
fixes #6475
Fixes: 57dd05616f ("zink/query: rewrite the query handling code to pass validation.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16484>
(cherry picked from commit 1c62c6bafd)
We no longer emit STATE_BASE_ADDRESS in every batch on XeHP, so the
decoder might not know what the various base addresses are if it's only
looking at a single batch. Fortunately, they also never change, so we
can just emit them once here.
On earlier platforms, initializing them here should be harmless. We'll
emit STATE_BASE_ADDRESS if we change them, which will update these.
Thanks to Iván Briano for catching this.
Fixes: 8831cb38aa ("anv: Stop updating STATE_BASE_ADDRESS on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16287>
(cherry picked from commit ad537edc7c)
the problem here is that this returns a vec2 instead of a vec5, which
throws all the existing calculations off
given that the shader is (still) expecting a vec2 return from this,
and there's no way to sanely rewrite with nir to be valid for both
sampler types as well as spirv translation, just pad out to a vec2
here and be done with it
Fixes: 73ef54e342 ("zink: handle residency return value from sparse texture instructions")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16456>
(cherry picked from commit 88912b3111)
If the secondary command buffer executed used push constants on a
different set of stages than the primary is using, we may end up not
reallocating them for the primary, getting misrender artifacts at best,
or a nice GPU hang at worst.
Fixes the tests from a CTS from the future:
dEQP-VK.dynamic_rendering.random.*
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16439>
(cherry picked from commit 2e46f38902)
Sampler views and samplers may not be the same limit; in fact one is 32
while the other is 128. The sampler_buffers field is tracking sampler
views (yes, naming is confusing) so we should use the right limit.
Fixes: e9c41b3214 ("gallium/u_threaded: add buffer lists - tracking of buffers referenced by tc")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15988>
(cherry picked from commit 620c5e9dd0)
If two instructions in a single bundle both write to a spilt
destination, then we need to reuse the fill and spill instructions,
otherwise the value will be overwritten.
This and the rest of this set of Midgard bug fixes were found from a
vertex shader in Firefox WebRender that is used when a video is
clipped, for example by setting the border-radius CSS property.
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16382>
(cherry picked from commit c65afe541b)
Check the bytemask against 0xFFFF rather than 0xF so that the fill is
skipped for a .xyzw write rather than a .x write.
Set the mask on the store to 0xF when doing a read so that all
components are written back.
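To illustrate the two mask values, here is a sketch assuming one bit per
byte of a four-component 32-bit register. The helpers are hypothetical
and are not the Midgard compiler's own functions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytemask for a write to 32-bit components, where bit c of `components`
 * selects component c (x = bit 0 ... w = bit 3). Each component covers
 * four bytes, so .x is 0xF and .xyzw is 0xFFFF. */
static uint16_t bytemask_of_32bit_write(unsigned components)
{
    uint16_t mask = 0;
    for (unsigned c = 0; c < 4; c++) {
        if (components & (1u << c))
            mask |= (uint16_t)(0xFu << (4 * c));
    }
    return mask;
}

/* A fill before the spill is only unnecessary when every byte is written. */
static bool needs_fill_before_spill(uint16_t bytemask)
{
    return bytemask != 0xFFFF;
}
```

Comparing against 0xF would skip the fill for a .x write, which leaves
the other three components unpreserved.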
Fixes: 31d26ebf1b ("pan/mdg: Fill from TLS before spilling non-SSA nodes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16382>
(cherry picked from commit c750ab8a38)
If a value is written in a vector CSEL but then written again by other
instructions, it still needs full alignment, so set min_alignment
using MAX2 to avoid ever reducing it.
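As a sketch of the difference, with a local stand-in for the MAX2 macro
and a hypothetical helper name:

```c
/* Local stand-in for the MAX2 macro so the sketch is self-contained. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: merge a new alignment requirement into the one
 * already recorded for a value, never lowering it. */
static unsigned update_min_alignment(unsigned current, unsigned required)
{
    return MAX2(current, required);
}
```

With a plain assignment, a later write that needs less alignment would
overwrite the stricter requirement recorded for the vector CSEL.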
Fixes: 1798f6bfc3 ("pan/midgard: Fix masks/alignment for 64-bit loads")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16382>
(cherry picked from commit b281843974)
When moving code into the main block or loop blocks, put the code into
its own:
if(true) { ... }
block so that we avoid break/continue/return issues.
v2: Also take care of the main block with return instructions
v3: Make deletion more obvious with dummy if blocks (Jason)
v4: Fixup assert for loops (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8dfb240b1f ("nir: Add raytracing shader call lowering pass.")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16036>
(cherry picked from commit 35d82ecf1e)
in a sequence where a driver saves 0 sampler/views before calling
u_blitter, the previous state of having 0 sampler/views bound would
not be restored as expected, resulting in stale sampler/views which
could affect behavior before new sampler/views were bound
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16178>
(cherry picked from commit 38ab178c4a)
nvidia can't do this, but also nothing uses it, so I've gone ahead and
done the bare minimum here to make cts pass
I think the work to do the shader rewrites should be easy, but without a test
case, I see no point in spending the time for it
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16100>
exporting all resources breaks suballocation, so instead just use the
existing heuristics and then forcibly rebind resources as needed
for this functionality
Reviewed-by: Dave Airlie <airlied@redhat.com>
We take a slight liberty here by allowing 0 to mean either MAILBOX or
IMMEDIATE, since Wayland (at least) doesn't have a true IMMEDIATE mode,
but at least MAILBOX won't throttle to vblank.
This only correctly handles intervals of 0 or 1 at the moment.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15800>
If the window is destroyed from underneath us while we happen to be in
xcb_wait_for_special_event, there's no recovery. The special event will
never match because the XID is no longer valid, and Present doesn't have
an in-band DestroyNotify. We're going to work around this by using the
poll API instead. If we get an event we short-circuit back to the top of
the "wait for available image" loop, so we drain the whole special event
queue before any other logic. Which means if we run out of special
events (and the connection and swapchain are still valid) that we
_don't_ have enough images available, so to hurry along any events that
the X server hasn't flushed out yet we call GetGeometry on the
swapchain's window. As a side effect this verifies that the window is
still alive.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15800>
On Gfx7 we can only give the sample location for a given multisample
number. This means every time the multisampling value changes, we have
to re-emit the locations. It's fine because it's also where
(3DSTATE_MULTISAMPLE) the number of samples is stored.
On Gfx8+ though, 3DSTATE_MULTISAMPLE only holds the number of samples
and all the sample locations for all number of samples are located in
3DSTATE_SAMPLE_PATTERN. So to be more efficient there, we need to
track the locations for all sample numbers and compare new values with
the relevant sample count when touching the dynamic state.
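The Gfx8+ tracking idea can be modeled roughly like this. It is a Python sketch under assumptions, not the anv implementation: `SampleLocationState` and `set_locations` are hypothetical names, and the return value stands in for a "dynamic state is dirty" flag.

```python
class SampleLocationState:
    """Keeps one set of sample locations per sample count."""

    def __init__(self):
        self.locations = {}  # sample count -> tuple of (x, y) positions

    def set_locations(self, samples, locs):
        """Store new locations; return True only if that count changed."""
        locs = tuple(locs)
        # Compare only against the entry for the same sample count, so
        # setting 2x locations never dirties the 4x pattern.
        if self.locations.get(samples) == locs:
            return False
        self.locations[samples] = locs
        return True
```

Re-submitting identical locations for a given sample count returns `False`, so nothing would need to be re-emitted in that case.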
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
(cherry picked from commit 168b13364f)
It could happen that at the time of a live-range split,
a phi was not yet placed in the new instruction vector,
and thus, instead of renamed, a new phi was created.
Fixes: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec2
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16248>
(cherry picked from commit 58bd9a379e)
If wsi_configure_native_image() fails, it will call
wsi_destroy_image_info() itself, so let's try to not call it again from
wsi_wl_swapchain_destroy().
Fixes the CTS tests:
dEQP-VK.wsi.wayland.swapchain.simulate_oom.*
Fixes: b626a5be43 ("vulkan/wsi/wayland: Split image creation")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16257>
(cherry picked from commit bf04be17f7)
The VAR_TEX definition in ISA.xml only has a field for texture_index,
so trying to read sampler_index will return zero; read from
texture_index instead, and rename other fields for consistency.
The texture and sampler indices must be equal for VAR_TEX to be used,
so either name could be used for the field.
Fixes the wrong textures being used in Thief.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6219
Fixes: eb1479bda2 ("pan/bi: Support message preloading")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16255>
(cherry picked from commit 2864094f69)
This changes the intel_device_info calculation to call an additional
DRM query requesting the geometry topology from the kernel, which may
differ from the result of the current topology query on XeHP+
platforms with compute-only and 3D-only DSSes. This seems more
reliable than the current guesswork done in intel_device_info.c trying
to figure out which DSSes are available for the render CS.
Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14143>
(cherry picked from commit 14cad38b19)
The nir_move/sink caused instructions to sink interleaved into the output
stores at the end of the shader. nouveau's RA doesn't track liveness of
FS outputs in registers after the export instruction, so they could end up
overwritten. To work around it, after normal NIR move/sink, move the
output stores back to the end of the shader.
Fixes: b1fa2068b8 ("nouveau/nir: Enable nir_opt_move/sink.")
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15949>
(cherry picked from commit 3ddc505400)
The ARB_shader_objects spec says the following:
> The error INVALID_VALUE is generated by any command that takes one or
> more handles as input, and one or more of these handles are not an
> object handle generated by OpenGL.
And a long, long time ago, we used to do just that for
glDeleteObjectARB... Until 9ac9605de1, all the way back in February 2006,
where the error condition was removed without explanation.
Let's restore it, because it should really be there.
This was noticed by running the tests that are in the mesa-demos
repository, that actually tested this condition.
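As a rough illustration of the restored check (a Python model, not the Mesa C code; `delete_object` and the set of live handles are hypothetical), deleting a handle that was never generated reports `GL_INVALID_VALUE`:

```python
GL_NO_ERROR = 0
GL_INVALID_VALUE = 0x0501


def delete_object(live_handles, handle):
    """Delete a shader/program handle, or report an error for unknown ones."""
    if handle not in live_handles:
        # The handle was not generated by the GL: this is the error
        # condition quoted from the spec above.
        return GL_INVALID_VALUE
    live_handles.remove(handle)
    return GL_NO_ERROR
```

Deleting the same handle twice therefore succeeds once and fails the second time.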
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16211>
(cherry picked from commit ba9c917149)
The input is an array so moving it to a single temporary value doesn't
seem to make much sense. I also don't see any piglit regressions when
not moving the value to a temporary.
Fixes: bc912bace1 ("virgl: Add workarounds for virglrenderer input/sv signedness bugs.")
v2: remove unused enum for SAMPLEMASK (Emma)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15997>
(cherry picked from commit 89bba41d90)
is_pixmap is defined in kopper_allocate_textures() as being (!window && x11),
which is very different from this check, which determines whether the drawable
is a window
so rename it to keep things consistent
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16190>
Without this being atomically incremented and decremented, I observed
this assert triggering in debug builds:
src/vulkan/wsi/wsi_common_x11.c:x11_present_to_x11_dri3():
assert(chain->sent_image_count <= chain->base.image_count);
I think this was happening since,
src/vulkan/wsi/wsi_common_x11.c:x11_handle_dri3_present_event()
which decrements chain->sent_image_count may be run in a separate
thread.
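A small Python model of the hazard, assuming the counter really is touched from two threads as suspected above. The `Swapchain` class and its method names are hypothetical, and a lock stands in for the atomic increment/decrement used in C. The model only demonstrates that no update is lost; it does not reproduce the assert itself.

```python
import threading


class Swapchain:
    def __init__(self):
        self.sent_image_count = 0
        self._lock = threading.Lock()  # stands in for an atomic counter

    def image_sent(self):
        with self._lock:
            self.sent_image_count += 1

    def image_completed(self):
        with self._lock:
            self.sent_image_count -= 1


def run(iterations=10000):
    """Increment and decrement from two threads; return the final count."""
    chain = Swapchain()

    def presenter():
        for _ in range(iterations):
            chain.image_sent()

    def event_handler():
        for _ in range(iterations):
            chain.image_completed()

    threads = [threading.Thread(target=presenter),
               threading.Thread(target=event_handler)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return chain.sent_image_count
```

Because every update is serialized, the final count is exactly zero after an equal number of increments and decrements.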
Fixes: d0bc1ad377 ("vulkan/wsi/x11: add sent image counter")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15908>
(cherry picked from commit 212fb25b26)
When the divisor is 0, the compiler should generate a different VS
prolog instead of re-using a previous prolog that uses nontrivial
divisors. This is because divisor == 0 and divisor > 1 should use
a different path to guarantee that the index is correctly computed.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16009>
(cherry picked from commit f525706e77)
due to desync between the frontend and the driver, the size that the
depth buffer was created with may not match the size of the swapchain if
the window is being resized very quickly, so just go ahead and clobber
the existing depth buffer with a series of very illegal internal object
replacements to make everything match up
do not try at home.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16179>
We are underprovisioned for Windows, almost certainly not running wide
enough on the insufficient number of slots we do have, and there are
also indications that the machine itself is having physical issues.
Disable it until it's fixed.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16185>
My understanding of the signal masks is that they control what stages
must complete before the semaphore is signaled. Using 0 theoretically
means the semaphore could be signaled immediately without waiting on
anything. Use ~0 instead to say it depends on everything.
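The reasoning above can be modeled with a toy predicate. This is a Python sketch of the stated understanding, not the Vulkan runtime code; `can_signal` and the bitmask of still-running stages are hypothetical.

```python
ALL_STAGES = ~0  # every bit set: depend on everything


def can_signal(signal_stage_mask, pending_stages):
    """True if no stage named in the mask is still running."""
    return (signal_stage_mask & pending_stages) == 0
```

With a mask of 0 the predicate is true even while work is pending, which is the early-signal case described above; with `ALL_STAGES` it only becomes true once nothing is pending.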
Fixes: 97f0a4494b ("vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16145>
(cherry picked from commit 02fea6c179)
Without this we might choose 8 or 16 width, while the app assumes 32.
With subgroup operations it may cause wrong calculations and thus bugs.
Examples of such games are Aperture Desk Job and DOOM Eternal.
v2: Make it a driconf option instead of applying unconditionally, move
from brw_required_dispatch_width to brw_compile_cs
v3: Rename allow_assuming_full_subgroups -> assume_full_subgroups.
Include assume_full_subgroups value in anv_pipeline_hash_compute().
v4: Move actual workaround code from brw_fs.c -> anv_pipeline.c.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6171
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15708>
(cherry picked from commit 28ca5636f6)
With mutable descriptor types, we can end up in a situation where a
binding can be, for instance, both a UBO and an acceleration
structure.
While we can promote the UBO to a binding table entry and the shader
can use it, this isn't true of acceleration structures that have no
surface state. In that case just skip the entry. The shader is already
compiled to use the descriptor entry.
In the non mutable case, the entry will not be created by
anv_nir_apply_pipeline_layout.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 63e91148b7 ("anv: Enable VK_VALVE_mutable_descriptor_type")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15969>
(cherry picked from commit fe413962b4)
XFB varyings are considered always-active IO to prevent them from
being removed or compacted. However, if the NIR linker doesn't mark XFB
varyings as unmovable, it is still possible to remap other varyings to
the same location/component.
Fixes KHR-Single-GL46.enhanced_layouts.xfb_override_qualifiers_with_api
with Zink and a bunch of other dEQP XFB tests.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6301
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16092>
(cherry picked from commit 4ebb5391ac)
Conflicts:
src/gallium/drivers/zink/ci/zink-radv-fails.txt
The Iris code that deals with implicit tracking is protected by
bufmgr->bo_deps_lock. Before this patch, we hold this lock during
update_batch_syncobjs() but don't keep it held until we actually
submit the batch in the execbuf ioctl. This can lead to the following
race condition:
- Context C1 generates a batch B1 that signals syncobj S1.
- Context C2 generates a batch B2 that depends on something that B1
from C1 is using, so we mark B2 as having to wait syncobj S1.
- C2 calls submit_batch() before C1 does it.
- The Kernel detects it was told to wait on syncobj S1 that was
never even submitted, so it returns EINVAL to the execbuf ioctl.
- We run abort() at the end of _iris_batch_flush().
- If DEBUG is defined, we also print:
iris: Failed to submit batchbuffer: Invalid argument
I couldn't figure out a way to reproduce this issue with real
workloads, but I was able to write a small reproducer to trigger this.
Basically it's a little GL program that has lots of contexts running
in different threads submitting compute shaders that keep using the
same SSBOs. I'll submit this as a piglit test. Edit: Tapani found a
dEQP test case which fails intermittently without this fix, so I'm not
sure a new Piglit is worth it now.
The solution itself is quite simple: just keep bo_deps_lock held all
the way from update_batch_syncobjs() until ioctl(). In order to make
that easier we just call update_batch_syncobjs() a little later. We
have to drop the lock as soon as the ioctl returns because removing
the references on the buffers would trigger other functions to try to
grab the lock again, leading to deadlocks.
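The locking scheme can be illustrated with a toy Python model. It is not the iris code: `Kernel`, `BufMgr`, `flush_batch`, and `stress` are hypothetical, and the "kernel" simply rejects a wait on a syncobj it has never seen. Dependency recording and submission happen under one lock, which is released as soon as the submit call returns.

```python
import threading


class Kernel:
    """Toy kernel: rejects a wait on a syncobj that was never submitted."""

    def __init__(self):
        self.submitted = set()

    def execbuf(self, signal, waits):
        if any(w not in self.submitted for w in waits):
            return "EINVAL"
        self.submitted.add(signal)
        return "OK"


class BufMgr:
    def __init__(self):
        self.bo_deps_lock = threading.Lock()
        self.last_writer = {}  # BO name -> syncobj of the last batch using it


def flush_batch(bufmgr, kernel, syncobj, bos):
    # Hold the lock from dependency tracking until the submit returns, so
    # no other context can observe our syncobj before it is submitted.
    with bufmgr.bo_deps_lock:
        waits = {bufmgr.last_writer[bo] for bo in bos
                 if bo in bufmgr.last_writer}
        for bo in bos:
            bufmgr.last_writer[bo] = syncobj
        return kernel.execbuf(syncobj, waits)


def stress(num_threads=8, batches_per_thread=200):
    """Many contexts in different threads sharing one buffer."""
    bufmgr, kernel = BufMgr(), Kernel()
    results = []

    def worker(tid):
        for i in range(batches_per_thread):
            results.append(flush_batch(bufmgr, kernel, (tid, i), ["ssbo"]))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this model every syncobj that appears in `last_writer` has already been submitted by the time another thread can read it, so no submission ever returns `EINVAL`.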
Thanks to Kenneth Graunke for pointing out this issue.
This has also been confirmed to fix a dEQP test that was giving
intermittent failures:
dEQP-EGL.functional.sharing.gles2.multithread.random.images.copyteximage2d.12
v2: Move decode_batch() out, just to be safe (Jason).
v3: Do it all after assembling validation_list (Ken).
Cc: mesa-stable
Fixes: 89a34cb845 ("iris: switch to explicit busy tracking")
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14964>
(cherry picked from commit 3532c374de)
The `IMAGE_UNDER_TEST` variable set in `.b2c-test` got broken with
the merge of 7d474c1 (ci: Move most stuff out of root .gitlab-ci.yml).
During the shuffling, the `MESA_BASE_TAG` and `MESA_IMAGE_TAG`
variables were dropped, leading to `IMAGE_UNDER_TEST` being an
nonexistent container.
To make this issue less likely to happen in the future, this patch
drops the code duplication that led to `IMAGE_UNDER_TEST` to be
the same as `MESA_IMAGE` and instead re-uses .use-debian/x86_test-vk
to generate `MESA_IMAGE`, which we then use verbatim in
`IMAGE_UNDER_TEST`.
The renaming of `MESA_IMAGE` to `IMAGE_UNDER_TEST` is there to make the
distinction clear between the image run by gitlab-runner (what is
usually called `MESA_IMAGE` but we instead hardcode to valve-infra's
trigger container), and the image we are running on the test machines.
Fixes: 7d474c1 (ci: Move most stuff out of root .gitlab-ci.yml)
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Charlie Turner <cturner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15555>
(cherry picked from commit c672464844)
src/gallium/auxiliary/tgsi/tgsi_scan.c:287: scan_src_operand: Assertion `info->sampler_targets[index] == target' failed.
assert was being triggered by
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit
using the stencil fallback with zink.
Fixes: f05dfddeb1 ("u_blitter: fix stencil blit fallback for crocus.")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16069>
(cherry picked from commit 4b7ba3869b)
When setting the dst framebuffer width/height, it might be silly
to constrain this beyond the dst resource, but at least constrain
it correctly to take account of x/y offsets.
This fixes some uses of this as a fallback for zink with
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_stencil_blit
Fixes: b4c07a8a87 ("gallium/util: allow scaling blits for stencil-fallback")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16069>
(cherry picked from commit dbc264f504)
this was an attempt to minimize the number of xfb barriers being emitted,
but really xfb barriers need to always be emitted in order for xfb to work
cc: mesa-stable
fixes (nv):
KHR-GL46.texture_view.reference_counting
KHR-GL46.transform_feedback_overflow_query_ARB.multiple-streams-multiple-buffers-per-stream
KHR-GL46.transform_feedback_overflow_query_ARB.multiple-streams-one-buffer-per-stream
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16065>
(cherry picked from commit e509598470)
Conflicts:
src/gallium/drivers/zink/ci/zink-nv-fails.txt
a read barrier is needed for resume, yes, but the counter buffer
is always being written to, so write access must always be set
cc: mesa-stable
fixes (nv):
KHR-GL46.transform_feedback.draw_xfb_test
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16065>
(cherry picked from commit fc5edf9b68)
Conflicts:
src/gallium/drivers/zink/ci/zink-nv-fails.txt
this was well-documented, but ultimately wrong: the synchronization
being used was for binding streamout buffers (not counter buffers) as
vertex buffers, which was already handled just fine in the normal
vertex buffer binding
drawing from streamout ONLY uses the counter buffer, which means
the counter buffer needs to be synchronized for reading
cc: mesa-stable
fixes (nv):
KHR-GL46.transform_feedback.draw_xfb_feedbackk_test
KHR-GL46.transform_feedback.draw_xfb_instanced_test
KHR-GL46.transform_feedback.draw_xfb_stream_instanced_test
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16065>
(cherry picked from commit a056cbc691)
Conflicts:
src/gallium/drivers/zink/ci/zink-nv-fails.txt
VkObjectType and VkDebugReportObjectTypeEXT have the same enum values.
Why the Vulkan WG thought this was a good idea, beats me. But it's what
we have to live with now.
Anyway, instead of having a statement that implicitly casts two
different values from the former to the latter, let's fully resolve the
type as the former, and cast the value when using it instead.
Fixes: 41318a5819 ("vulkan: Use vk_object_base::type for debug_report")
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15547>
(cherry picked from commit b27a2ba4fc)
the pipe cap is used for gating wideline support, so this will always
be 1.0 when not supported
furthermore, the previous code wasn't accurately checking line width
for tess shaders, breaking tests
cc: mesa-stable
fixes (nv):
KHR-GL46.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15960>
(cherry picked from commit d8b66fcbf9)
if a rendertarget-specified image can't be a rendertarget or a blit dst
then it can't be used for the designated functionality and must be rejected
cc: mesa-stable
fixes hangs on various nv driver versions:
dEQP-GLES2.functional.texture.mipmap.2d.generate.rgba5551_fastest
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15960>
(cherry picked from commit 37ac8647fc)
This is used to determine the geometry shader info on GFX9, and it
looks like it was broken for topologies that use adjacency.
This is also used to remove PSIZ from shaders that don't need it.
Found by inspection.
fossils-db (Polaris10):
Totals from 140 (0.10% of 135960) affected shaders:
SGPRs: 10448 -> 9696 (-7.20%)
VGPRs: 4376 -> 4264 (-2.56%)
CodeSize: 164316 -> 161028 (-2.00%)
Instrs: 26449 -> 25767 (-2.58%)
Latency: 184448 -> 180468 (-2.16%)
InvThroughput: 80772 -> 79092 (-2.08%)
VClause: 337 -> 328 (-2.67%); split: -2.97%, +0.30%
SClause: 859 -> 813 (-5.36%); split: -5.70%, +0.35%
Copies: 1027 -> 790 (-23.08%)
PreSGPRs: 2751 -> 2331 (-15.27%)
PreVGPRs: 3887 -> 3836 (-1.31%)
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15948>
(cherry picked from commit ed7d831525)
When we put NIR in the compiler stack for r300, indirect addressing broke
for gallium nine. DX's array indirects round the float value, so the DX
shader gets mapped to a TGSI "ARR ADDR[0] src.x" instruction. Translating
that to NIR maps to r0[f2i32(fround(src.x))]. While we might hope that in
translation back using nir-to-tgsi after optimization we would recognize
the construct and emit ARR again, that's going to be error prone (think
"what if src.x is in a NIR register?") so we need a fallback plan. r300
will be able to handle this lowering, so get it in place first to fix the
regression.
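The rounding behaviour that has to survive the lowering can be illustrated in Python. This is a model only: `fround`, `f2i32`, and `indirect_load` are stand-ins named after the expression above, and Python's built-in `round()` is assumed to be an adequate substitute for the shader rounding op.

```python
def fround(x):
    # Stand-in for the NIR rounding op; Python's round() is an assumption.
    return float(round(x))


def f2i32(x):
    # Float-to-int conversion truncates toward zero.
    return int(x)


def indirect_load(regs, src_x):
    """Model of r0[f2i32(fround(src.x))]: round first, then convert."""
    return regs[f2i32(fround(src_x))]
```

Skipping the rounding step would change results: for `src_x = 1.6` the rounded index is 2, while plain truncation would select element 1.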
Fixes: #6297
Fixes: 7d2ea9b0ed ("r300: Request NIR shaders from mesa/st and use NIR-to-TGSI.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15870>
(cherry picked from commit 6947016b46)
"""Exception raised when the Mesa CI script finds something in the logs that
is known to cause the LAVA job to eventually fail"""
pass