This fixes several tests when the initially uploaded buffer
from CPU was being ignored because vc4_texture_subdata was not
marking the resource as written/initialized.
The usage flags management available at vc4_resource_transfer_map
is generalized into vc4_map_usage_prep and reused at
vc4_resource_transfer_map. This makes vc4 implementation more similar
to v3d.
This fixes 7 text in the following subgroups:
-dEQP-GLES2.functional.fbo.render.texsubimage.*
-dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.*
-spec@arb_clear_texture@arb_clear_texture-*
Cc: mesa-stable
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25297>
(cherry picked from commit cb96dab5c8)
Keep most cases using the stack as it's cheaper, but fall back to the
heap when the size gets too big.
This should fix a stack overflow reported by @rhezashan for a case
where we had lots of iris_screens.
Credits to Matt Turner and José Roberto de Souza for their work on
this issue, which led us to find its root cause.
Cc: mesa-stable
Reported-by: rheza shandikri (@rhezashan in gitlab)
Credits-to: José Roberto de Souza <jose.souza@intel.com>
Credits-to: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25236>
(cherry picked from commit 3cec15dd14)
I don't see why a p_logical_end is expected or required. It might not be
present in some situations, which causes an assertion failure:
s2: %19646:s[0-1] = p_reload %19701:v[8], 11
s2: %0:exec, s1: %8817:scc = s_andn2_b64 %19646:s[0-1], %0:exec
s2: %8818:s[20-21] = p_cbranch_z %0:exec BB1116, BB1114
No fossil-db changes (gfx1100).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25244>
(cherry picked from commit 6dd751b3b9)
This is when the copies actually happen, not at the branch.
fossil-db (gfx1100):
Totals from 1 (0.00% of 79332) affected shaders:
Instrs: 424 -> 423 (-0.24%)
CodeSize: 2172 -> 2168 (-0.18%)
Latency: 2899 -> 2896 (-0.10%)
Copies: 24 -> 23 (-4.17%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25244>
(cherry picked from commit 8d5bd3ca48)
The AMD Vulkan driver uses LLVM by default, but it is possible to build
the driver without the LLVM dependency. In this case, we must explicitly
disable LLVM support, or else meson will die after failing to find LLVM.
The Android build system already knows when to link libLLVM, so forward
that information to meson.
Cc: mesa-stable
Acked-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Change-Id: I7489d3811625b390aaaf2e84e666b4a8d98328b0
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24007>
(cherry picked from commit ec32619cb0)
problem: max_poc means the number of bits used in poc lsb
in slice header, and it should not be related to GOP
size. When large GOP size used, it could generate
corrupted video, as the POC could not be correctly
decoded.
solution: use fixed value of max_poc (16) for now.
Cc: mesa-stable
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25214>
(cherry picked from commit fb0f51bc64)
in a stream like:
* set fb state (A)
* flush
* set fb state (B)
* draw -> driver query
* flush
the "driver query" should return the tc info corresponding to the most
recent fb state (B). previously this would increment to C because
the flag for incrementing at the start of a batch was set
Fixes: 07017aa137 ("util/tc: implement renderpass tracking")
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25206>
(cherry picked from commit 9399165bd4)
The availability data would be written to a different location in
the user provided buffer depending on whether the query for a given
index was available. Fix this by using fixed indicies when writing
the query and availability data.
Fixes conformance failures seen in the
dEQP-VK.query_pool.occlusion_query.get_reset_* test group when
implementing VK_EXT_host_query_reset.
Fixes: 279c7c6d5a ("pvr: Implement vkGetQueryPoolResults API.")
Signed-off-by: Vlad Schiller <vlad-radu.schiller@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25116>
(cherry picked from commit ca9734c223)
For some specific texture sizes, notably some texture sizes with width
4096, block stride calculation could end up calculating stride 256 which
is an invalid value.
In those specific cases, this could cause rendering artifacts or
application/driver crashes.
Cc: mesa-stable
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25084>
(cherry picked from commit cb1c88d41f)
to avoid splitting renderpasses, this subdata optimization handles the usual
driver dance of staging buffer -> gpu copy
if the pbo stride doesn't match the image format's stride, however, then
a direct copy will yield broken pixels and the image will misrender. to avoid this,
detect stride mismatch and translate the single subdata call into a sequence
of non-overlapping subdata calls that the driver can magically figure out
while continuing to not split renderpasses
fixes#9589
cc: mesa-stable
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24849>
(cherry picked from commit 51ad269198)
These patterns are broken in the following scenario:
%1 = f2fmp %0
%2 = fddx %1
%3 = ... // non quad uniform
if %3 {
%4 = f2f32 %2
...
}
Which would turn into
%3 = ...
if %3 {
%4 = fddx %0
...
}
Yet another example that shows why derivative instructions should be
be intrinsics, not alu.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25014>
(cherry picked from commit 136a698251)
Issue: For texture with multiple planes, the planes will point to the
same BO with the total size, so current vcn dt_size is incorrect.
(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[0]))
...
buf = 0x5555558daa30,
gpu_address = 0xffff800101000000,
bo_size = 0xa2000,
...
}
(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[1]))
...
buf = 0x5555558daa30,
gpu_address = 0xffff800101000000,
bo_size = 0xa2000,
...
}
This is because: in function static struct si_texture *si_texture_create_object(),
if (plane0) {
/* The buffer is shared with the first plane. */
resource->bo_size = plane0->buffer.bo_size;
...
radeon_bo_reference(sscreen->ws, &resource->buf, plane0->buffer.buf);
resource->gpu_address = plane0->buffer.gpu_address;
}
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9728
Cc: mesa-stable
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25013>
(cherry picked from commit 7876a2f685)
Make sure that both per-vertex and per-primitive attribute
ring stores are finished before position or primitive export
instructions are executed.
This is necessary because we need to ensure that mesh shader
waves work correctly when they have either vertex-only or
primitive-only waves.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit 93b4f200de)
Cleanup the code that generates the two channels of the
primitive export instruction, and move storing the built-in
per-primitive outputs out to match how vertex attributes work.
Prepares the mesh shader lowering for a workaround that
affect export instructions.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit 0721784b78)
This is a HW bug workaround for some (all?) GFX11 chips.
On these chips, rasterization can start before the attribute ring
stores are finished, which can cause issues.
As a workaround, wait for attribute ring stores to finish
before doing the position export.
Mesh shaders will be taken care of in another commit.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24574>
(cherry picked from commit edd51655f0)
It's not disallowed by spec to load instance-related data in case of a
miss where no instance was ever visited. Such loads make no sense, so we
can return garbage, but it mustn't hang the GPU. Initialize the instance
addresses to the TLAS base to make sure we always have valid memory to load from.
Partially fixes GPU hangs in RTX Remix games.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24971>
(cherry picked from commit 728f6c0b70)
There is no restriction for query per sample positions from the
interpolator when in non-per-sample dispatch mode. But apparently
that's not giving us the expected values for fragment shaders compiled
without per-sample dispatch knowledge (graphics pipeline libraries).
So when per-sample dispatch is dynamic and we're doing at_sample
interpolation, turn the interpolation back into at_offset at runtime
when we detect that the fragment shader is not run per sample.
Fixes a bunch of dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d8dfd153c5 ("intel/fs: Make per-sample and coarse dispatch tri-state")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>
(cherry picked from commit 68027bd38e)
Huge thanks to Tapani Pälli for debugging this issue, figuring out
what was going wrong, proposing fixes, and walking me through where
things were going off the rails.
BLORP always disables tessellation and geometry shaders. Our handling
tried to look at ice->shaders.uncompiled[] to determine whether the next
draw needed those shaders. If not, we can leave BLORP's residual state
that disabled those stages in place, and skip looking at it.
Unfortunately, predicting the future is a bit fraught, in part due to
the uncompiled[] and prog[] arrays being slightly out of sync at times.
Consider the following case:
1. Draw with tessellation shaders in place
=> uncompiled[TES] and prog[TES] will both point at valid shaders.
2. Gallium calls pipe->bind_tes_state(NULL).
=> This makes uncompiled[TES] point at NULL, and flags
IRIS_STAGE_DIRTY_UNCOMPILED_TES.
Because iris_update_compiled_shaders() hasn't happened yet,
uncompiled[TES] is NULL but prog[TES] has the stale TES from
the previous draw still.
3. BLORP operations happen
=> BLORP sees uncompiled[TES] == NULL and decides that tessellation
is off for the upcoming draws. So it skips flagging tess state.
4. Gallium calls pipe->bind_tes_state(shader from step #1).
=> uncompiled[TES] points at the original shader.
IRIS_STAGE_DIRTY_UNCOMPILED_TES gets flagged again.
5. Draw again
=> This calls iris_update_compiled_shaders(), which sees that
a TES is bound, and calls iris_update_compiled_tes(). But
because the same shader was bound as before, the program it
comes up with is identical to the one already bound at
ice->shaders.prog[TES]. So, it thinks it doesn't have to
flag any tessellation state dirty because it was already
set up for the last draw.
This random unbind and rebind between draws leads to a situation
where, at step #3, BLORP thinks it can skip flagging tessellation
state (nothing is bound), and at step #5, normal state handling
thinks it can skip flagging tessellation state (nothing changed
since last time). So nobody does, and things break.
This unbind appears to be happening when st_release_variants()
decides it wants to free some shaders. Then a rebind happens to
put back the actual shader for the draw. So, it's not theoretical.
To fix this, we change BLORP to look at ice->shaders.prog[] rather
than uncompiled[]. This is equivalent to thinking about the previous
draw, rather than the next. If the last draw had tessellation off,
then BLORP's disabling was a no-op, and the GPU is still in the same
state as the previous draw. This is more reliable than predicting
the future.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8308
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9678
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24880>
(cherry picked from commit d693027a00)
The msm driver reserves the actual DMABUF size in the memory map, while
TU can request smaller memory chunk to be allocated. This potentially
can lead to a situation when next allocation IOVA will be in the middle
of the address space which is reserved for the DMABUF. Pass the
`real_size' to TU allocator instead, so that kernel and userspace have
the same picture of memory allocations.
Cc: mesa-stable
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24861>
(cherry picked from commit 2fdcc00b01)
The vulkan spec says all conversions are correctly rounded, so if the input
is larger than the largest fp16 value, we need to return MAX_FLOAT/inf
instead of cutting off the msbs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24826>
(cherry picked from commit 6d949e18fd)
The blob replicates both the value mask as well as the stencil reference
of the back-facing stencil to the front-facing stencil. This fixes the
remaining failures in the following dEQPs:
dEQP-GLES2.functional.fbo.render.*_stencil_index8
Fixes: c8ccd63911 ("etnaviv: Fix depth stencil ops on GC880/GC2000")
Signed-off-by: Marek Vasut <marex@denx.de>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4867>
(cherry picked from commit ef4cb2431d)
For z surfaces, flags.texture should be based on
RADEON_SURF_TC_COMPATIBLE_HTILE alone. Otherwise, addrlib could pick a
_X/_T swizzle mode for a MSAA depth texture, which is said to be broken:
When _X/_T swizzle mode was used for MSAA depth texture, TC will get zplane
equation from wrong address within memory range a tile covered and use the
garbage data for compressed Z reading which finally leads to corruption.
Fixes: de0885cdb8 ("amd/surface: add RADEON_SURF_NO_TEXTURE flag")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24767>
(cherry picked from commit e74c3dbb70)
It uses a poll function that waits for a second hoping for another thread
to catch up, which is not a reliable way to do synchronization. The test
has been spuriously failing merges on a regular basis recently.
This is issue #9222, which I'm leaving open until the author can fix the test.
Fixes: 3b69b67545 ("util/fossilize_db: add runtime RO foz db loading via FOZ_DBS_DYNAMIC_LIST")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24755>
(cherry picked from commit 4dfd306454)
It's unnecessary because earlier parts of the pass will ensure that a
mov of undef is turned into an undef. It's also wrong because
nir_op_mov has different semantics from nir_op_vecN when it comes to how
sources map to destination components.
Fixes: 5f26c21e62 ("nir: Expand opt_undef to handle undef channels in a store intrinsic")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24704>
(cherry picked from commit 408929289a)
These work in some circumstances (dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_float_16_to_64.scalar9_tessc),
but I'm not sure if they work in all, blending certainly doesn't work and
this probably wasn't intended in the first place.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 01bd012edd ("amd: fix 64-bit integer color image clears")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24400>
(cherry picked from commit 405f3bf990)
This reverts commit 6b494745be.
The logic is not entirely correct: the comparison is between two
static-analysis estimates of a dynamic system with variables that aren't
captured by the shader source, so using ">" will always have greater potential
to cause regressions whenever the performance difference between the two builds
is something not captured by the static model, no matter how much the model is
improved.
Reference: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9262
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24615>
(cherry picked from commit d142c845d0)
Adding GEM handles to the global list is necessary to allow
maintaining a single reference count for handles that are shared
between multiple buffer objects.
Since exported handles can end up being shared with other buffer
objects, as in the case that drmPrimeHandleToFD() and gbm_bo_import()
are called externally to Mesa, they too must be added to the global
list.
Unfortunately, doing this properly requires a new libdrm API. Use
the best possible option for now.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9552
Signed-off-by: Dor Askayo <dor.askayo@gmail.com>
Acked-by: Karol Herbst <git@karolherbst.de>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24648>
(cherry picked from commit daa1f789b5)
The buffer data is not directly accessible to application and it's
internally used to only store VACodedBufferSegment struct.
Ignore the size requested by application and instead allocate
sizeof(VACodedBufferSegment). Use calloc to zero out the struct.
This can save significant amount of memory, for example FFmpeg
will request up to tens of MB for single buffer.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6462
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24410>
(cherry picked from commit 7bcbfae87c)
The presentation timing extension is used for doing WaitForPresent
properly, but we accidentally bind it after an early return intended to
stop us from binding dmabuf when software rendering.
Remove the early return.
cc: mesa-stable
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24588>
(cherry picked from commit 5ba5bcf2b6)
vkQueuePresentKHR might return VK_SUBOPTIMAL_KHR which is not VK_SUCCESS
but presentation succeeded anyway. We should capture a trace even if
VK_SUBOPTIMAL_KHR is returned.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24052>
(cherry picked from commit b8edd19358)
as in the producer case, big io needs to reserve the appropriate number
of slots
fixes:
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-float-index-rd-after-barrier,Fail
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-float-index-wr-before-barrier,Fail
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec2-index-rd-after-barrier,Fail
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec2-index-wr-before-barrier,Fail
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec3-index-rd-after-barrier,Fail
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec3-index-wr-before-barrier,Fail
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec4-index-rd-after-barrier,Fail
spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec4-index-wr-before-barrier,Fail
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24568>
(cherry picked from commit ee6ba2bb57)
in this scenario, sample counting must happen before a2c, as a2c may eliminate
coverage if alpha is zero, leading to a sample count of zero
dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_alpha_to_coverage_samples_4_maintenance5
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24589>
(cherry picked from commit ce09458917)
We don't support any ASTC formats yet, but the textureCompressionASTC_LDR
feature was incorrectly set to true. Fix this by setting it to false and
don't advertise ASTC support for texture compression.
Fixes dEQP-VK.api.info.format_properties.compressed_formats
Fixes: 8991e646 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24448>
(cherry picked from commit c5a6e88c4e)
this otherwise underflows the array and provides a (probably huge) garbage
value for the binding id, which then causes the driver to massively overallocate
both the layout and set/pool/buffer
the main result of this is that on radv any simple test that should be near-instant
takes 2-3 seconds to execute, which somehow nobody noticed
Fixes: e3b746e3a3 ("zink: use GPL to handle (simple) separate shader objects")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24501>
(cherry picked from commit 652e87bc5d)
When we switch the channels by re-creating vec4 values we have to
take into account that the source values may be used in an ALU op,
and with that we have to take read-port limitations into account.
Fixes: 18a8d148d8
r600/sfn: Cleanup copy-prop into vec4 source values
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24519>
(cherry picked from commit 807c0d6bb7)
This fixes some CACHE_ERROR caused by proper multi-threading support. The
bug is a bit older though, just never triggered because there was only one
push buffer to begin with.
Without this change the compute initialization stayed unpushed in the
screen push buffer causing random issues.
Fixes: ff72440b40 ("nv50: implement a basic compute support")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24496>
(cherry picked from commit a9a30a7e09)
The GL_EXT_multisampled_render_to_texture spec explicitly allow reading
from these FBOs.
"Similarly, for ReadPixels:
'An INVALID_OPERATION error is generated if the value of READ_-
FRAMEBUFFER_BINDING (see section 9) is non-zero, the read framebuffer is
framebuffer complete, and the value of SAMPLE_BUFFERS for the read
framebuffer is one.'
These errors do not apply to textures and renderbuffers that have
associated multisample data specified by the mechanisms described in
this extension, i.e., the above operations are allowed even when
SAMPLE_BUFFERS is non-zero for renderbuffers created via Renderbuffer-
StorageMultisampleEXT or textures attached via FramebufferTexture2D-
MultisampleEXT."
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10747>
(cherry picked from commit d7b9da2673)
For EXT_multisampled_render_to_texture, we store the number of samples
in Attachment->NumSamples instead of Renderbuffer->NumSamples. This
meant that the previous code ignored that the framebuffer was
multisampled. Because of this, pipe_rasterizer_state::multisample is set
incorrectly, leading to visual artifacts on drivers that support MS-RTT
extension, such as panfrost.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10747>
(cherry picked from commit 4d1a07c1a0)
For TXQ we know make sure that we at least add one source. If the nir
instruction however didn't had any sources, we inserted a fake 0 source
ending up with two 0s for TXQ.
It's unclear to me if we have other ops where this would be necessary.
Fixes: 85a31fa1fc ("nv50/ir/nir: fix txq emission on MS textures")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Acked-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24373>
(cherry picked from commit 8d7f682bdb)
In GL and a lot of Vulkan if we end up with either a lod or an ms index.
Sadly in Vulkan we can end up with both and have to choose properly. For
TXQ we have to emit a zero LOD. For TXF we have to emit the ms index.
Fixes: bb032d8b62 ("nv50/ir/nir: implement nir_instr_type_tex")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24343>
(cherry picked from commit 85a31fa1fc)
nir_const_value_for_int asserts signed bounds on the input, but we pass in an
unsigned value that would be out-of-bounds for 32-bit channels, causing the
assert to fail for 32-bit channel formats.
Fixes dEQP-VK.pipeline.monolithic.logic_op.r32_uint.* on AGXV (and probably
PanVK).
Fixes: dbd0615e7a ("nir/lower_blend: Avoid useless iand with logic ops")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24252>
(cherry picked from commit 9c0740211d)
ISL's state-machine of CCS_D describes full resolves as leaving the aux
buffer in the pass-through state. Hardware doesn't behave this way on
gfx8 however. On that platform, full resolves transition the aux buffer
to the resolved state. This was verified by dumping the CCS before and
after a full resolve on BDW (gfx7 is simply assumed to behave the same).
Ambiguate after resolving to match driver expectations.
Prevents iris from failing piglit's fcc-write-after-clear on BDW with a
future patch which relies on fast-clear encodings being removed after a
resolve. The avoided failure is:
Testing implicit read of partial block UNORM -> SNORM
Probe color at (0,1,0)
Expected: 1.000000 1.000000 1.000000 1.000000
Observed: 0.000000 0.000000 0.000000 0.000000
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>
(cherry picked from commit 1d12b29b3f)
In the case of:
halt
// succs: b9
if %618 {
block b3:// preds:
break
// succs: b6
} else {
block b4: // preds: , succs: b5
}
block b5: // preds: b4
32 %556 = iadd %617, %2 (0x1)
opt_constant_if() doesn't work because stitch_blocks() can't join blocks if the
before ends in a jump and the after isn't empty.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24235>
(cherry picked from commit 21f0aca948)
Same cause as for other R8G8 formats - msaa resolve via
blit event causes gpu fault.
Fixes:
dEQP-VK.api.image_clearing.*.clear_color_attachment.*.r8g8_srgb_*
Fixes: 029919f3c8
("tu: allow using resolve engine for SRGB MSAA resolves")
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24277>
(cherry picked from commit eeb1fd90fc)
Before this change, anv_get_image_format_features2 reported support for
ASTC formats with any modifier (even those not supported by anv). But,
we didn't intend to support that compressed image format with modifiers.
With this change, the format feature function reports no support for
modifiers on ASTC-formatted images.
This prevents the next patch from causing assertion failures due to
unsupported modifiers.
Fixes: 355f318843 ("anv: Allow transfer-only linear ASTC images")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24120>
(cherry picked from commit e50af52e3d)
Even on Valhall, vertex_id is zero-based in a transform feedback program. Lower
that for transform feedback programs properly since it wouldn't happen
automatically on Valhall. Fixes assertion fails.
Fixes: 91ffd10351 ("pan/bi: Lower gl_VertexID in NIR")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24198>
(cherry picked from commit 64ff2b3ed6)
Without SP_FS_CTRL_REG0.LODPIXMASK quad ops don't get values from
helper invocations, but from the current one.
Fixes:
dEQP-VK.glsl.derivate.dfdxsubgroup.*
dEQP-VK.glsl.derivate.dfdysubgroup.*
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24211>
(cherry picked from commit a0d426370d)
That's the "real" name of the field.
It enables ALL helper invocations in a quad, which is necessary for
fine derivatives and quad subgroup ops.
While PIXLODENABLE by itself enables only 3 out 4 fragments in a quad.
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24211>
(cherry picked from commit 696f37f5c3)
Some Wayland compositors, notably Exo, do not always release buffers
fast enough, and not in sync with their frame callbacks, to guarantee
that a free buffer is available the next time a client calls
`eglSwapBuffers()`.
This currently leads to a crash in `dri2_wl_swrast_get_backbuffer_data()`
with the swrast backend. To avoid this, simply block until the
compositor releases a buffer eventually.
While arguably compositors should release buffers they don't need any
more for the next frame, this can be quite complex depending on
the architecture - notably multi-process/IPC in case of Exo.
cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24091>
(cherry picked from commit 74451ed3f0)
It happened because glCallList was restoring varying_vp_inputs, which
caused every glCallList to process the state change again.
This loosely reverts commit 3a294ff01f
"mesa: move the _mesa_set_varying_vp_inputs call to where the state changes".
Fixes: 3a294ff01f - "mesa: move the _mesa_set_varying_vp_inputs call to where the state changes"
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24165>
(cherry picked from commit c97961a855)
nir_opt_mov and nir_op_vecN are only the same if the mov is only a
single component. Otherwise the vec loop will try to access src[c]
where c > 0 which breaks for nir_op_mov. It's uncommon but scalar
back-ends can see vector movs so we need to handle this correctly.
Fixes: 6513c675ad ("nv50/ir/nir: implement nir_alu_instr handling")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24167>
(cherry picked from commit 259ba104f7)
utrace submits can either have a batch or not.
When there is a batch, the utrace vk_sync is signaled by the utrace
batch (because utrace does a timestamp buffer copy using its own
batch). When there is no batch, the utrace vk_sync should be signaled
by the application batch (no timestamp copy required, utrace can read
the timestamps when the application batch has completed).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: fdea48df5e ("anv: Implement Xe version of anv_queue_exec_locked() and queue_exec_trace()")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24085>
(cherry picked from commit a85b84ba1e)
submit_count is used to disambiguate a batch_id based on the generation
id of a given batch: this value is incremented once on submit and once on
reset such that the diff of the values is > 1 any time the batch does not
represent the fence it was last submitted with
in the case of a batch's first use, however, this value was being incorrectly
incremented such that the first submit would cause disambiguation checks
to erroneously determine that the batch had already completed, breaking synchronization
fixes#9313
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24016>
(cherry picked from commit 3c520892b1)
This issue is mainly a consequence of a call to util_blitter_clear()
with unnecessary blitter states, these states are never freed.
This change is inspired from radeonsi and r600.
Note: PAN_SAVE_FRAGMENT_STATE is added and always enabled
at this stage.
For instance, this issue is triggered on Mali-T720 with
"piglit/bin/fcc-read-after-clear sample tex -auto -fbo", "piglit/bin/cubemap -auto"
and "piglit/bin/fbo-srgb -auto" or on Mali-T820 with "piglit/bin/longprim -auto -fbo"
and "piglit/bin/ext_render_snorm-render -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22522>
(cherry picked from commit 689f38b2b4)
Kopper requires that any depth/stencil images are created through winsys which
was not taken into account by the WGL frontend causing it to hit an assert:
'Assertion failed: ctx->fb_state.zsbuf->texture->bind & PIPE_BIND_DISPLAY_TARGET'
fixes#9256
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24055>
(cherry picked from commit d3662ba461)
Instead of manually indexing a single-dimensional array as 2-dimensional
(and using the wrong stride for the outer array) just actually make it
a 2-dimensional array.
Fixes: 7edae456 ("d3d12: Track up to 16 contexts worth of batch references locally in bos")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24041>
(cherry picked from commit a6740ee7a4)
Most users of nir_foreach_function actually want the nir_function_impl, not the
nir_function, and want to skip empty functions (though some graphics-specific
passes sometimes fail to do that part). Add a nir_foreach_function_impl macro
to make that case more ergonomic.
nir_foreach_function_impl(impl, shader) {
...
foo(impl)
}
is equivalent to:
nir_foreach_function(func, shader) {
if (func->impl) {
...
foo(func->impl);
}
}
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23807>
(cherry picked from commit 19daa9283c)
Indeed, util_blitter_clear() requires a call to
util_blitter_save_fragment_constant_buffer_slot(),
but most other blitter functions do not.
For instance, this issue is triggered with:
"piglit/bin/object-namespace-pollution glDrawPixels buffer -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 03bc7503d4 ("radeonsi: save the fs constant buffer to the util blitter context")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23856>
(cherry picked from commit 80ccc3f822)
There is no mention in spec about subtract one of the number of
threads, also Iris and blorp code don't subtract.
Alchemist PRMs: Volume 2a: Command Reference: Instructions: CFE_STATE: Maximum Number of Threads:
Normally set to the maximum number of threads: (# EUs) * (# threads/EU)
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23973>
(cherry picked from commit c142736f52)
If a then/else block ends in a jump, the phi nodes do not necessarily
have to reference the always taken branch because they are dead code.
Avoid crashing in this case by only rewriting phis, if the block does
not end in a jump.
cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23150>
(cherry picked from commit e379b9ad8c)
1. Always calculate when asked. This is the sort of optimization that just
introduces bugs. Like one I hit when shuffling register indices around with
the register access changes.
2. Ask before using in RA.
3. Account for precoloured blend inputs.
Small shader-db hit, didn't investigate too much.
total instructions in shared programs: 1518017 -> 1518168 (<.01%)
instructions in affected programs: 2895 -> 3046 (5.22%)
helped: 0
HURT: 24
Instructions are HURT.
total bundles in shared programs: 646756 -> 646782 (<.01%)
bundles in affected programs: 1119 -> 1145 (2.32%)
helped: 1
HURT: 19
Bundles are HURT.
total quadwords in shared programs: 1133694 -> 1133728 (<.01%)
quadwords in affected programs: 1736 -> 1770 (1.96%)
helped: 0
HURT: 20
Quadwords are HURT.
total registers in shared programs: 90596 -> 90612 (0.02%)
registers in affected programs: 108 -> 124 (14.81%)
helped: 0
HURT: 16
Registers are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
(cherry picked from commit 23010acc10)
this fixes the refcount for the separate shader program to not have a leaked ref
and then fixes the owned program to have the expected number of refs
this happened to work some of the time before because there was an arbitrary unref
in replace_separable_prog(), but this shouldn't have been necessary
Fixes: e3b746e3a3 ("zink: use GPL to handle (simple) separate shader objects")
fixes#9274
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23888>
(cherry picked from commit 4e38061643)
I've recently learned that this is necessary to get
correct results from divergence analysis.
No Fossil DB stats changes on GFX10.3.
Note, when backporting this patch to stable, make sure
the call to nir_convert_to_lcssa is before nir_divergence_analysis,
which may be located in a different place in the stable branch.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23946>
(cherry picked from commit ddeabcc19b)
On Alchemist, the FF_MODE2 documentation says that we must set the
FF_MODE2 timer values for GS and HS to 224. The hardware performance
tuning guide also recommends setting the TDS timer to 4.
On Tigerlake, i915 applies workarounds to set the GS timer to 224
(failing to do so can cause HS/DS unit hangs), and the TDS timer to 4
(for performance). It doesn't currently apply a HS timer there, and
I'm not sure if it's strictly necessary, but given that Alchemist
needed it, and the other two settings matched, let's assume that it
ought to match as well.
Unfortunately, there has been a bug in the i915 workarounds
infrastructure for non-masked context registers where writing one
field of the register zeroes out all the others. So, I believe the
Tigerlake TDS timer value of 4 isn't being applied correctly there,
though the register is also not readable on that platform which
makes it hard to verify. So, this may also speed up tessellation.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9233
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23839>
(cherry picked from commit 1b3669a1ed)
This enables L3 partial write merging for a number of cases that seem
to be getting accidentally disabled by the kernel, which was causing a
serious performance bottleneck on DG2 and MTL platforms. The
"Compressible Partial Write Merge Enable", "Coherent Partial Write
Merge Enable" and "Cross-Tile Partial Write Merge Enable" bits in
L3SQCREG5 were expected to be enabled by default (and confusingly,
they even read off as enabled if you ran 'intel_reg read 0xb158' on an
idle system), but they are getting clobbered during 3D context
initialization by an i915 workaround.
Enabling L3 partial write merging of compressible surfaces in
particular seems to increase rendering fillrate by over 3x in some
cases (e.g. the
"VulkanFillRate/FillRateGPU/resolution:1[0-3]/format:*/blend:0"
fillrate-bound microbenchmarks). Significant improvements can also be
reproduced in most real-world workloads we've tested so far,
e.g. Counter Strike GO improves by ~11%, Shadow Of the Tomb Raider
improves by ~5.5%, and AztecRuins-VK improves by ~6.5% on DG2-512 --
Thanks a lot to Caleb Callaway for these figures. No regressions have
been observed so far.
Even though this patch might strike as surprisingly simple for such a
large payoff, it's the result of Felix DeGrood and I trying to
root-cause the rendering performance gap of DG2 on Linux vs Windows on
and off during the last year, and some of the OA statistics captured
by Felix early this month were greatly helpful for me to connect the
last few dots, so Felix deserves a big chunk of the credit for this
work.
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783>
(cherry picked from commit 427fee3507)
- Fix null deref. VkPipelineLayoutCreateInfo::pSetLayouts is allowed to
contain VK_NULL_HANDLE.
- The loop 'break' was misplaced.
Fixes crash in
dEQP-VK.pipeline.pipeline_library.graphics_library.fast.0_00_11_11 after
VK_EXT_graphics_pipeline_library is enabled in a later patch.
Fixes: 91966f2eff ("venus: extend lifetime of push descriptor set layout")
Signed-off-by: Lina Versace <linyaa@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Dawn Han <dawnhan@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23810>
(cherry picked from commit 98c8d7b7cf)
Venus sync feedback design relies on concurrent host device resource
access. To avoid device flush overwriting host writes, we must
suballocate the slots with a minimum size of the buffer alignment.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23633>
(cherry picked from commit 2f729ff6aa)
The compile error when compiled with "-Dglx=xlib -D shared-glapi=disabled":
check_table.cpp:1133:37: error: ‘struct _glapi_table’ has no member named ‘DrawArraysInstancedARB’; did you mean ‘DrawArraysInstanced’?
1133 | { "glDrawArraysInstancedARB", _O(DrawArraysInstancedARB) },
Fixes: 5679ef99b8 ("glapi: remove EXT and ARB suffixes from Draw functions")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23793>
(cherry picked from commit 7af2c45947)
Previously the set of dynamic offsets were being reused per each
binding. That's now fixed, by using an offset to determine where
each binding's dynamic offsets reside.
Tests fixed:
dEQP-VK.binding_model.descriptor_copy.{compute,graphics}
.{uniform,storage}_buffer_dynamic_0
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Fixes: aa791961a8 ("pvr: Add support for dynamic buffers descriptors")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23587>
(cherry picked from commit b902fb2e14)
The driver can merge subpasses within a render pass into a single
hw render. While doing so it makes the assumption that the subpasses
in an hw render will all be submitted in a single job.
On vkCmdPipelineBarrier() the driver was previously incorrectly
inserting an event sub-cmd on a merged subpass, breaking that
assumption leading to incorrect values for input attachments.
Signed-off-by: Soroush Kashani <soroush.kashani@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Fixes: 6d672e0336 ("pvr: Add initial vkCmdPipelineBarrier skeleton.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23693>
(cherry picked from commit 4071b8e7f3)
Vulkan allows empty descriptor sets to be created. When we setup
the descriptor set addresses table we fill in the address of the
`bo` for each valid/currently bound desc set. For empty desc sets
there is no `bo` which was causing a seg fault. Now skip them,
leaving their address set to `~0`.
Reported-by: Simon Perretta <simon.perretta@imgtec.com>
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Fixes: ce67f5ac94 ("pvr: Write descriptor set addrs table dev addr into shareds")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23692>
(cherry picked from commit 822dc384b1)
In the following sequence :
- write buffer B with a shader
- barrier on buffer from shader-write to transfer
- vkCmdCopyQueryPoolResults to buffer B
The barrier should take care of ordering things between the shader
writes and vkCmdCopyQueryPoolResults.
The problem is that vkCmdCopyQueryPoolResults runs on the command
streamer and that is not coherent or synchronized in the same way as
shaders.
This change marks the barrier has potentially containing pending
buffer writes for queries so that we can insert the necessary flush
for vkCmdCopyQueryPoolResults later.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9013
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23675>
(cherry picked from commit cab8495625)
there are only 2 ubos that can be emitted, except the emitted ubos
can start at an offset based on the first-used ubo, which means this
has to support the full range of ubo indices
fixes oob access in game Beyond All Reason
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23729>
(cherry picked from commit 9803174942)
Indeed, the vertex state was restored using a specific
condition at the util_blitter_restore_vertex_states()
level. This change ensures that the condition is the
same when the vertex state is saved.
The function util_blitter_clear_buffer() is only called
by the r600 driver on pre-evergreen gpus.
This issue is triggered on a rv770 gpu with "piglit/bin/fbo-1d -auto -fbo"
or "piglit/bin/draw_buffers_gles2 -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 5f566faa46 ("radeonsi: don't save and restore vertex buffers and elements for u_blitter")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23721>
(cherry picked from commit 23c003b88c)
For 32bpc formats, the ICL+ sampler fetches the raw clear color dwords
used for rendering instead of the converted pixel dwords typically used
for sampling. The CLEAR_COLOR struct page documents this for 128bpp
formats, but not for 32bpp and 64bpp formats.
In blorp_copy, map R11G11B10_FLOAT to R8G8B8A8_UINT instead of R32_UINT.
This will cause the sampler to fetch the clear color pixel, allowing
drivers to keep clear color support enabled during copies.
If iris is forced to convert blits to copies, this patch fixes the
following test on gfx12:
dEQP-GLES3.functional.fbo.color.repeated_clear.blit.rbo.r11f_g11f_b10f
At the moment, both iris and anv won't hit this issue outside of
blorp_copy. This is due to the read/write access restrictions they
currently place on texture views that reinterpret the surface format.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8964
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23604>
(cherry picked from commit f0b6b57c77)
Instead of on Android. Which allows an end user to turn off expat
without breaking or disabling Intel support. I've additionally
refactored to separate expat and xmlconfig a bit more in the root
meson.build
This does make expat a hard dependency for building Intel tools, despite
the fact that only aubinator actually requires it. This simplifies the
build for the common case, and in the event that someone wants to build
the Intel tools and doesn't have libexpat, they can fall back to the
meson wrap for expat instead.
fixes: 75276deebc
closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8791
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23605>
(cherry picked from commit ce07aabab1)
This change fixes a buffer overflow by implementing the
special swizzles. This behavior is already available with
evergreen_convert_border_color().
For instance, this issue is triggered on a cayman gpu with
"piglit/bin/texwrap bordercolor -auto -fbo" or "piglit/bin/max-samplers -auto -fbo":
==5610==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000012d20 at pc 0x7fb798cb876f bp 0x7ffd78670460 sp 0x7ffd78670458
READ of size 4 at 0x603000012d20 thread T0
#0 0x7fb798cb876e in cayman_convert_border_color ../src/gallium/drivers/r600/evergreen_state.c:2444
#1 0x7fb798cb876e in evergreen_emit_sampler_states ../src/gallium/drivers/r600/evergreen_state.c:2539
#2 0x7fb7989e6cb2 in r600_emit_atom ../src/gallium/drivers/r600/r600_pipe.h:655
#3 0x7fb7989e6cb2 in r600_draw_vbo ../src/gallium/drivers/r600/r600_state_common.c:2333
#4 0x7fb7985082c7 in u_vbuf_draw_vbo ../src/gallium/auxiliary/util/u_vbuf.c:1497
#5 0x7fb796ef2eda in cso_draw_vbo ../src/gallium/auxiliary/cso_cache/cso_context.h:262
#6 0x7fb796ef2eda in st_draw_gallium_multimode ../src/mesa/state_tracker/st_draw.c:170
#7 0x7fb7970d9cfd in vbo_exec_vtx_flush ../src/mesa/vbo/vbo_exec_draw.c:341
#8 0x7fb7970d32d7 in vbo_exec_FlushVertices_internal ../src/mesa/vbo/vbo_exec_api.c:693
#9 0x7fb7970d32d7 in vbo_exec_FlushVertices ../src/mesa/vbo/vbo_exec_api.c:1193
#10 0x7fb7975f237c in enable_texture ../src/mesa/main/enable.c:337
Fixes: 923d635357 ("r600: fix some border color swizzles on CAYMAN")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23435>
(cherry picked from commit 4284705733)
Indeed, this function was not processing the linked
allocated list.
For instance, this issue is triggered with "piglit/bin/hiz-depth-read-fbo-d24-s0 -auto":
Indirect leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f6795638987 in calloc (/usr/lib64/libasan.so.6+0xb1987)
#1 0x7f678bac13b9 in nouveau_heap_alloc ../src/gallium/drivers/nouveau/nouveau_heap.c:64
#2 0x7f678bb6c7e4 in nv50_program_upload_code ../src/gallium/drivers/nouveau/nv50/nv50_program.c:490
#3 0x7f678bb83b92 in nv50_vertprog_validate ../src/gallium/drivers/nouveau/nv50/nv50_shader_state.c:161
#4 0x7f678bba3000 in nv50_state_validate ../src/gallium/drivers/nouveau/nv50/nv50_state_validate.c:552
#5 0x7f678bba3c4d in nv50_state_validate_3d ../src/gallium/drivers/nouveau/nv50/nv50_state_validate.c:575
#6 0x7f678b9e3e92 in nv50_blit_3d ../src/gallium/drivers/nouveau/nv50/nv50_surface.c:1444
#7 0x7f678b9e3e92 in nv50_blit ../src/gallium/drivers/nouveau/nv50/nv50_surface.c:1832
#8 0x7f678a0b378a in blit_to_staging ../src/mesa/state_tracker/st_cb_readpixels.c:337
#9 0x7f678a0b7358 in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:516
#10 0x7f6789f82005 in read_pixels ../src/mesa/main/readpix.c:1178
#11 0x7f6789f82005 in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1195
#12 0x7f6789f82ac0 in _mesa_ReadPixels ../src/mesa/main/readpix.c:1210
...
SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
Fixes: 67635a0a71 ("nouveau: get rid of tabs")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23592>
(cherry picked from commit 1980934d0d)
split ANV_PIPE_RENDER_TARGET_BUFFER_WRITES into separate CS_STALL,
RT_FLUSH & TILE_FLUSH flags in order to have finer control over cache
coherency.
Tigerlake CS has it's own cache fetching directly from the memory controller,
so we need to do a tile flush to ensure the query data is visible.
This fixes test_resolve_non_issued_query_data in vkd3d on TGL.
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Fixes: 3c4c18341a ("anv: narrow flushing of the render target to buffer writes")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23500>
(cherry picked from commit d0e0ba897f)
There was a race between the worker thread and flush, which could lead to
the last event flushed getting its status set to CL_SUCCESS before any
other event.
Just wait on all flushed events in order to solve this.
The current queue/event implementation isn't the best and we want to
rework it, so this is good enough for now.
Fixes: 47a80d7ff4 ("rusticl/event: proper eventing support")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23578>
(cherry picked from commit da4b27452b)
When the RS uses the pixel pipes it seems to destroy/invalidate any
content sitting in the color and depth caches from a previous draw.
Always flush the color and depth cache before using the RS to make
sure that any cache content written by the PE is properly flushed
to memory.
Fixes spec@!opengl 1.0@gl-1.0-drawpixels-depth-test and probably a
few others that are suffering from corruption of PE writes.
CC: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23530>
(cherry picked from commit b1cd5780d6)
conditional render is only supposed to be enabled during renderpasses,
and this ends up doing mismatched start/stop in and out of renderpasses
affects:
GTF-GL46.gtf30.GL3Tests.conditional_render*
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23511>
(cherry picked from commit 6aa9e95021)
These functions are used by gl[Get]TexImage, which imposes no
alignment restructions on the void *pixels parameter.
This fixes an unaligned access in GTK's "gtk:gdk / memorytexture" unit
test on SPARC, which causes the test to fail.
Fixes: 45ae4434b5 ("util: Use bitshift arithmetic to unpack pixels.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23482>
(cherry picked from commit 19092576ce)
So the shuffleup tests did a shuffle up with const 5,
we'd use invocation id (0..8) shuffle it down by 5,
get (-5..3), then call llvmshufflevector with that
which is totally illegal.
There might be a nicer way to fix this, but I can't see
it straight away, just bail on the fast path.
Fixes:
dEQP-VK.subgroups.shuffle.compute.subgroupshuffleup*
Cc: mesa-stable
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23484>
(cherry picked from commit c20df7e22e)
Instead of incrementing the seqno, just directly invalidate any existing
tex cache entries when we update a sampler view.
No reason not to just directly clear stale entries, and avoiding to re-
assign the seqno will simplify the next patch.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23470>
(cherry picked from commit 3f00f4ac30)
The previous implementation was copying the data using the
aligned length (size_dw). The aligned length could overflow
the original buffer size.
For instance, this issue is triggered with "piglit/bin/draw-batch -auto -fbo":
==5736==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff139c77e8 at pc 0x7f25b350a9a0 bp 0x7fff139c6cb0 sp 0x7fff139c6460
READ of size 8 at 0x7fff139c77e8 thread T0
#0 0x7f25b350a99f in __interceptor_memcpy (/usr/lib64/libasan.so.6+0x3c99f)
#1 0x7f25a8fcdf24 in radeon_emit_array ../src/gallium/include/winsys/radeon_winsys.h:760
#2 0x7f25a8fcdf24 in r600_draw_vbo ../src/gallium/drivers/r600/r600_state_common.c:2448
#3 0x7f25a8ae7ba1 in u_vbuf_draw_vbo ../src/gallium/auxiliary/util/u_vbuf.c:1791
#4 0x7f25a7bc18ca in _mesa_validated_drawrangeelements ../src/mesa/main/draw.c:1696
#5 0x7f25a7bc7e53 in _mesa_DrawElements ../src/mesa/main/draw.c:1824
Fixes: 0cf5d1f226 ("gallium: remove PIPE_CAP_INFO_START_WITH_USER_INDICES and fix all drivers")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23436>
(cherry picked from commit 340311dac9)
if a DS3 pipeline enabling dynamic samples is not bound when samples
are set dynamically, then such a pipeline is later bound, min samples
would have been incorrectly set to 1
instead, flag the update for later and do it just before draw
cc: mesa-stable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23368>
(cherry picked from commit cc9e958053)
In commit 284f0c9a57 I refactored the
handling of the data source to just call a helper rather than special
casing opcodes with 0 or 2 sources. Unfortunately, I also dropped the
"else return 1", creating a fallthrough for all sources other than
SURFACE_LOGICAL_SRC_ADDRESS and SURFACE_LOGICAL_SRC_DATA.
The case below happened to return the correct value for all cases except
SURFACE_LOGICAL_SRC_SURFACE, which has been returning 2 instead of 1
since that commit.
Restore the else case. Thanks to Marcin Ślusarz for catching this.
Fixes: 284f0c9a57 ("intel/compiler: Add an lsc_op_num_data_values() helper")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23347>
(cherry picked from commit 2d9a3bb093)
Otherwise this can cause optimizations to fight resulting in infinite
optimization loops with opt_algebraic, constant_folding, and copy_prop.
Fixes: 368be872 ("nir/algebraic: shrink 64-bit bitwise operations with 0/-1 constant half")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23192>
(cherry picked from commit 6c62eaf22d)
For instance, this is triggered with "piglit/bin/ext_direct_state_access-named-program -auto -fbo":
==5695==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x606000050031 at pc 0x7f78dfca8d46 bp 0x7ffd9043b4a0 sp 0x7ffd9043ac50
READ of size 50 at 0x606000050031 thread T0
#0 0x7f78dfca8d45 (/usr/lib64/libasan.so.6+0x3fd45)
#1 0x7f78d450b18f in set_program_string ../src/mesa/main/arbprogram.c:385
#2 0x7f78d3fdbd3e in execute_list ../src/mesa/main/dlist.c:13025
#3 0x7f78d40c2564 in _mesa_CallList ../src/mesa/main/dlist.c:13451
#4 0x7f78d42f380a in _mesa_unmarshal_CallList ../src/mesa/main/glthread_list.c:43
#5 0x7f78d38e85c5 in glthread_unmarshal_batch ../src/mesa/main/glthread.c:122
#6 0x7f78d38ea20d in _mesa_glthread_finish ../src/mesa/main/glthread.c:382
#7 0x7f78d38ea20d in _mesa_glthread_finish ../src/mesa/main/glthread.c:347
#8 0x7f78d3d73f69 in _mesa_marshal_IsProgramARB src/mapi/glapi/gen/marshal_generated2.c:4256
Fixes: 0b196b40a3 ("mesa: don't compute the same SHA1 twice in glShaderSource")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23295>
(cherry picked from commit 44b960a645)
As Matt Turner pointed out, the commit here fixed breaks in Iris and
ANV in kernel versions without support for DRM_I915_QUERY_ENGINE_INFO.
As compute engines are only present in gfx12 and newer, and support
for DRM_I915_QUERY_ENGINE_INFO was added before any gfx12 platform,
we can check for gfx version before trying to get engine info.
For ANV, this is done by checking if engine_info is not NULL, like in
other places in the ANV source code.
Fixes: a364f23a6c ("intel: Make gen12 URB space reservation dependent on compute engine presence")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9099
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23257>
(cherry picked from commit 42f707e459)
This issue is happening on radeonsi. The reference was allocated
via _mesa_get_bufferobj_reference() with setup_arrays().
The same reference was never freed.
For instance, this issue is triggered on radeonsi with
"piglit/bin/gl-1.0-rendermode-feedback -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: ff8c2a1748 ("mesa/bufferobj: rename bufferobj functions to be more consistent.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22921>
(cherry picked from commit dc07e0d3fe)
When in an msaa feedback loop and when the image does not have tc-compat
cmask, we have to decompress and expand fmask. This can happen on gfx9
when sample count > 2 or when RADV_DEBUG=notccompatcmask is specified.
Fixes: a38de4c011 ("radv: disable tc_compatible_cmask on GFX9 in some cases")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23331>
(cherry picked from commit 6cb5185916)
==10199== Conditional jump or move depends on uninitialised value(s)
==10199== at 0xA107B13: radv_resume_queries (radv_meta.c:93)
==10199== by 0xA108097: radv_meta_restore (radv_meta.c:225)
==10199== Uninitialised value was created by a stack allocation
==10199== at 0xA1145B2: fill_buffer_shader (radv_meta_buffer.c:171)
saved_state is never memset, so the value should be inited.
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23327>
(cherry picked from commit 54ceec8d9e)
The GS and the FS needs to agree on the driver_location. But we just
used the num_outputs variable for the GS instead of matching the logic
from lower_aaline_instr in nir_draw_helpers.c.
This does that, but cleans up our copy slightly to avoid computing the
needless location, as well as using unsigned values.
This used to *mostly* work before, but only because we were lucky and
not too much crazy stuff went on with the inputs / outputs in
smooth-line cases.
Fixes: edecb66b01 ("nir: avoid generating conflicting output variables")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23316>
(cherry picked from commit ffc77d5262)
global load/store (or A64 messages) need the NIR bound checking which
is enabled by "robust" behavior even when robust behavior is disabled.
Many thanks to Christopher Snowhill for pointing out the pushed
constant related issue with the initial version of this patch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
(cherry picked from commit efcda1c530)
This seems to be the best compromise I can come up with so far.
I can't figure out to get the tier2 programming to match between
264 and 265, maybe they are just programmed different here, good
old firmware.
Fixes: 1693c03a39 ("radv/video: add initial h264 decoder for VCN")
Reviewed-by: Lynne <dev@lynne.ee>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23227>
(cherry picked from commit b5963fc1f0)
radv_destroy_shader_upload_queue waits for a semaphore, which will in turn
call query_reset_status on hw_ctx that will fail due to being already
destroyed.
Fix radv/amdgpu: amdgpu_cs_query_reset_state2 failed. (-9) spam in the logs
with RADV_PERFTEST=dmashaders.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23276>
(cherry picked from commit 978d80fbe2)
The existing guardband region calculation was mixing up x/y_min with
x/y_max in cmd_buffer_emit_viewport(), causing the calculated viewport
area to always be an empty region. Luckily intel_calculate_guardband_size()
returns a non-empty but bogus guardband region in that case, so this
doesn't seem to have led to conformance regressions, but the
off-center guardbands could potentially impact performance in
geometry-heavy rendering.
Fixes: 893fa30afe ("anv: Include scissors in viewport calculations")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23174>
(cherry picked from commit 9c26a6b3bb)
We don't need to add an offset in the buffer, because we submit
the offset where the data was written to to the host. The
correction of this offset is also not needed and results in draw
errors.
Fixes: 0cf5d1f226
gallium: remove PIPE_CAP_INFO_START_WITH_USER_INDICES and fix all drivers
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23196>
(cherry picked from commit d4fc359748)
Fix issue by handling the OpString instructions when walking through
the preamble for validation.
The gl_spirv_validation() creates a vtn_builder() and walks the
instructions looking for a subset of the information. However
our current way to walk the instructions will also perform tracking
of OpLine/OpNoLine, that may make references to OpString instructions
that were being previously ignored by gl_spirv_validation().
This would cause the parsing to fail.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9004
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22973>
(cherry picked from commit 1b31d528b9)
If two threads deserialize the raw object at the same time, the
refcount could be more than 1 temporarily.
This can be reproduced with Granite during the multi-threaded pipeline
cache pre-warm on startup, and also with Dota2.
Fixes: cbab396f54 ("vulkan/pipeline_cache: replace raw data objects on cache insertion of real objects")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22853>
(cherry picked from commit 8126e0287d)
Indeed, the locally allocated "stimg" reference was not freed
on a specific code path.
For instance, this issue is triggered on radeonsi or r600 with:
"piglit/bin/egl-ext_egl_image_storage -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 6a3f5c6512 ("mesa: simplify st_egl_image binding process for texture storage")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23165>
(cherry picked from commit 83cd7d23a2)
When the fragment shader reads the VRS builtin, VRS flat shading
shouldn't be enabled, otherwise the value might not be what the FS
expects.
Fixes dEQP-VK.fragment_shading_rate.renderpass2.monolithic.multipass.*
on RDNA2 (VRS flat shading isn't yet enabled on RDNA3).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23187>
(cherry picked from commit b439bd5a58)
With Anv/Zink, the piglit test :
arb_shader_storage_buffer_object-max-ssbo-size -auto -fbo fsexceed
is failing validation after copy propagation :
load_payload(8) vgrf15:F, vgrf1+0.12<0>:F, vgrf1+0.0<0>:F, vgrf1+0.4<0>:F, vgrf1+0.8<0>:F, vgrf1+0.12<0>:F
../src/intel/compiler/brw_fs_validate.cpp:191: A <= B failed
A = inst->src[i].offset / REG_SIZE + regs_read(inst, i) = 2
B = alloc.sizes[inst->src[i].nr] = 1
In most cases it works because src[0] would be at offset 0 and so
reading a full reg passes validation, but Anv/Zink started emitting
slightly different code adding an offset maybe the size read 2 GRFs.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23126>
(cherry picked from commit 21c7b55f6f)
has_work controls whether a flush can be deferred, i.e., when unset
a flush may be deferred
since a promoted cmd must still be flushed to take effect, ensure this
is always set when promoted cmds are pending
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
(cherry picked from commit 0f510040dc)
when needs_present_readback is set, reordering is disabled without hitting
the path that would normally disable promotion for the resource, so this
needs to be changed manually to avoid layout desync on the swapchain
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
(cherry picked from commit 3c010319bb)
in a scenario where an ordered read op occurs for an image,
successive read-only barriers SHOULD be able to be promoted
...but they can't, because there isn't yet a mechanism for handling layout
transitions between the unordered cmdbuf and the ordered cmdbuf,
meaning that promoting e.g., a SHADER_READ_ONLY barrier after a TRANSFER_SRC
barrier will leave the image with the wrong layout for the transfer op:
TRANSFER_SRC(unordered) -> COPY(ordered) -> SHADER_READ_ONLY(unordered)
becomes
TRANSFER_SRC(unordered) -> SHADER_READ_ONLY(unordered) -> COPY(ordered)
ideally I'll get around to figuring this out at some point
affects:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.texture2d_array_to_renderbuffer
Fixes: bf0af0f8ed ("zink: move all barrier-related functions to c++")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
(cherry picked from commit 9c8b6754b0)
in the case where a draw is triggered after a flush, zink_update_descriptor_refs
will be called to set batch tracking for descriptors. this function also
handles refs for fb attachments, and everything is usually fine there
the problem with this approach is that tracking is no longer set on view
objects at renderpass begin, which makes them susceptible to early deletion
if a rp isn't started from a draw call
instead, apply batch tracking to fb attachment resources on renderpass
begin if the BATCH_CHANGED flag is set (need to rename this at some point)
in order to guarantee that the resource (object) lifetime will match the
cmdbuf runtime [since imageviews are now only freed upon batch completion]
fixes#9059
Fixes: f6bbd7875a ("zink: remove batch tracking/usage from view types"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23132>
(cherry picked from commit 62961b172f)
Yuzu is running into a segfault because it writes the push descriptor
twice with 2 different layouts, but without a draw/dispatch in
between.
First vkCmdPushDescriptorSetKHR() writes descriptor 0 & 1 with a
uniform buffer. We toggle the 2 first bits of
anv_descriptor_set::generate_surface_states.
Second vkCmdPushDescriptorSetKHR() writes descriptor 0 with uniform
buffer and descriptor 1 with an image view. The first bit of
anv_descriptor_set::generate_surface_states stays, but the second bit
was already set before and it should now be off.
When we finally flush the push descriptor, we try to generate a
surface state for descriptor 1, but there is no valid buffer view for
it, we access an invalid pointer and segfault.
This fix resets the anv_descriptor_set::generate_surface_states when
the descriptor layout changes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b49b18f0b7 ("anv: reduce BT emissions & surface state writes with push descriptors")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23156>
(cherry picked from commit cab7ba00e2)
This shouldn't have been enabled at all. Depth-stencil formats were
accidentally disabled but not depth-only or stencil-only formats.
This doesn't seem allowed by DX12 and both AMD/NVIDIA don't enable it.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23122>
(cherry picked from commit dda7400c0b)
Super sampling on a 4K screen could hit this. 16k seems pretty big
but this image is only created on RDNA2 and on-demand if VRS attachments
are used without depth-stencil attachments, which should be rare
enough to care.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23105>
(cherry picked from commit 3adc9b6722)
We only support 32-bit versions of ufind_msb, find_lsb, and bit_count,
so we need to lower them via nir_lower_int64.
Previously, we were failing to do so on platforms older than Icelake
and let those operations fall through to nir_lower_bit_size, which
used a callback to determine it should lower them for bit_size != 32.
However, that pass only emulates small bit-size operations by promoting
them to supported, larger bit-sizes (i.e. 16-bit using 32-bit). It
doesn't support emulating larger operations (i.e. 64-bit using 32-bit).
So nir_lower_bit_size would just u2u32 the 64-bit source, causing us to
flat ignore half of the bits.
Commit 78a195f252 (intel/compiler: Postpone most int64 lowering to
brw_postprocess_nir) provoked this bug on Icelake and later as well,
by moving the nir_lower_int64 handling for ufind_msb until late in
compilation, allowing it to reach nir_lower_bit_size which broke it.
To fix this, we always set int64 lowering for these opcodes, and also
correct the nir_lower_bit_size callback to ignore 64-bit operations.
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23123>
(cherry picked from commit a2d384a5c0)
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT is also set in all memory types of
integrated GPUs.
This flag means that memory will be allocated in the most efficient
place for the GPU to access, which is true in integrated GPUs.
However, this was causing ANV_BO_ALLOC_WRITE_COMBINE to be set in
integrated GPUs in the block right below when allocating in the non-cached memory type.
But the comment only talks about lmem, so to still keep the write
combine behavior for iGPUs it was used VkMemoryPropertyFlags in mmap_calc_flags().
Additionally, this was causing anv_bo.has_implicit_ccs to always be
set, which could change the expected behavior of
anv_BindImageMemory2() in MTL.
Fixes: fbd32a04da ("anv: add a third memory type for LLC configuration") added a new heap
Fixes: 582bf4d9f7 ("anv: flag BO for write combine when CPU visible and potentially in lmem")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22483>
(cherry picked from commit a6c5746b37)
On RDNA1&2, the driver needs to support both NGG and legacy for
primitives generated query because we can't know that before starting
queries.
To get the query pool results, we check the availability bit wrote by
the SAMPLE_STREAMOUTSTATS packet but the GDS copy was emitted after,
which means the availability bit might be TRUE before the GDS copy is
actually done.
Fix this by emitting the GDS copy before to ensure the availability is
TRUE for both results.
This fixes recent updates in
dEQP-VK.transform_feedback.primitives_generated_query.* because the
tests no longer wait for the fence.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23080>
(cherry picked from commit 9ba41ed70a)
Commit c65bde7b1e introduced a regression where under certain
circumstances `front` may be NULL, thus leading to a crash. It's not
currently known what exactly causes `front` to become NULL, nor we can
revert the offending commit, because there had been too many unrelated
changes that now depend on this commit.
So until someone comes up with a proper fix, let's add a workaround so
instead of crashing we just return from the function early.
This commit was tested with the bug `8982` and helps with the crash
with no other noticeable problems.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8982
Fixes: c65bde7b1e ("frontend/dri: inline __DRIdrawable in dri_drawable, make __DRIdrawable opaque")
Cc: mesa-stable
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23093>
(cherry picked from commit 275cf62e20)
Then we can guarantee the settings correct, otherwise the 'screen->info.have_EXT_extended_dynamic_state3 = false' and 'screen->info.have_EXT_vertex_input_dynamic_state = false'
will be enable, but actually we should disable it when 'have_EXT_extended_dynamic_state2 = false'.
Fixes: d5cf6f7d2f ("zink: disable dynamic state exts if the previous ones aren't present")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23046>
(cherry picked from commit 47f0801949)
This change is inspired from iris_destroy_context().
For instance, this issue is triggered with
"piglit/bin/glsl-1.50-gs-max-output -scan 1 20 -auto -fbo":
Direct leak of 320 byte(s) in 2 object(s) allocated from:
#0 0x7f34fc769987 in calloc (/usr/lib64/libasan.so.6+0xb1987)
#1 0x7f34f4fa168a in bo_calloc ../src/gallium/drivers/crocus/crocus_bufmgr.c:288
#2 0x7f34f4fa168a in alloc_fresh_bo ../src/gallium/drivers/crocus/crocus_bufmgr.c:350
#3 0x7f34f4fa168a in bo_alloc_internal ../src/gallium/drivers/crocus/crocus_bufmgr.c:419
#4 0x7f34f4fe50a9 in crocus_get_scratch_space ../src/gallium/drivers/crocus/crocus_program.c:2678
#5 0x7f34f55e8954 in crocus_upload_dirty_render_state ../src/gallium/drivers/crocus/crocus_state.c:6871
#6 0x7f34f55e8954 in crocus_upload_render_state ../src/gallium/drivers/crocus/crocus_state.c:7812
#7 0x7f34f5d9f680 in crocus_simple_draw_vbo ../src/gallium/drivers/crocus/crocus_draw.c:332
#8 0x7f34f5d9f680 in crocus_draw_vbo ../src/gallium/drivers/crocus/crocus_draw.c:438
#9 0x7f34f1d2eeba in tc_call_draw_single ../src/gallium/auxiliary/util/u_threaded_context.c:3735
#10 0x7f34f1d12e03 in batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:394
#11 0x7f34f1d12e03 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:445
#12 0x7f34f1d22c9a in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:680
#13 0x7f34f1d238f8 in tc_texture_map ../src/gallium/auxiliary/util/u_threaded_context.c:2754
#14 0x7f34f120b9d9 in pipe_texture_map_3d ../src/gallium/auxiliary/util/u_inlines.h:579
#15 0x7f34f120b9d9 in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:530
#16 0x7f34f10d7355 in read_pixels ../src/mesa/main/readpix.c:1178
#17 0x7f34f10d7355 in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1195
#18 0x7f34f10d7e10 in _mesa_ReadPixels ../src/mesa/main/readpix.c:1210
Fixes: f3630548f1 ("f3630548f1da crocus: initial gallium driver for Intel gfx 4-7")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23019>
(cherry picked from commit 6ee0bba3ae)
while swapchains themselves are protected against early deletion
during presentation, there is nothing protecting them from
deletion while they are rendering if a swapchain updates
while rendering but before presentation
to address this, add batch usage to swapchains which can be
checked during pruning to ensure a rendering swapchain isn't
pruned
Fixes: dc8c9d2056 ("zink: prune old swapchains on present")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22962>
(cherry picked from commit 47d9eaa0f1)
some resources may not be destroyed immediately and may instead be
queued for deletion onto the current batch state, so ensure that the
current state is the last one to be destroyed so that all deferred resources
are also destroyed
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23033>
(cherry picked from commit 58532057c5)
The helper was creating input locations for some builtin bariables.
This caused validation errors in zink because those builtins can't be
used as input.
Fixes: e2220ee55e ("zink: filled quad emulation gs generation function")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22871>
(cherry picked from commit 71107b6dc8)
The helper was creating input locations for some builtin bariables.
This caused validation errors in zink because those builtins can't be
used as input.
Fixes: d0342e28b3 ("nir: Add helper to create passthrough GS shader")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22871>
(cherry picked from commit 83692bfe30)
With the shader cache enabled, intel_clc attempts to write to ~/.cache.
Many distributions' build systems limit file-system access, and will
kill the process thus causing the build to fail.
Fixes: 639665053f ("anv/grl: Build OpenCL kernels")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22968>
(cherry picked from commit 435a607909)
The 'struct drm_amdgpu_cs_chunk_fence' is processed as
'struct drm_amdgpu_cs_chunk_data' which is a union.
This change ensures the proper alignment for this structure
to be processed as 'struct drm_amdgpu_cs_chunk_data'.
The presence of __u64 as one member of
'struct drm_amdgpu_cs_chunk_data' makes the
whole structure expected to be 64-bit aligned.
This is a minor issue detected by the gcc sanitizer (ubsan), for instance at the libdrm library:
../amdgpu/amdgpu_cs.c:937:26: runtime error: member access within misaligned address 0x63100001484c for type 'struct drm_amdgpu_cs_chunk_data', which requires 8 byte alignment
0x63100001484c: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
Fixes: ae7e4d7619 ("amd: rename ring_type --> amd_ip_type and match the kernel enum values")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22920>
(cherry picked from commit acdd6a2a6c)
if a sampler is never used (no derefs) then its binding will never be
applied here, leaving it with binding=0. this will clobber the real binding=0
sampler in driver backends, leading to errors, so try to iterate using
the same criteria as above and apply bindings in the same way
fixes#8974
cc: mesa-stable
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22902>
(cherry picked from commit ccbfcf3933)
the inner conditional here didn't include uncached readback, meaning
that many (most?) buffers allocated with uncached memory (i.e., BAR) were
being read back directly instead of using staging resources to be faster
at some point this inner conditional should be reevaluated to determine
whether it still does anything, but this is not that time
fixes, among other things, loading in DOOM2016 on some GPUs
Fixes: 52f27cda05 ("zink: allow direct memory mapping for any COHERENT+CACHED buffer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22907>
(cherry picked from commit 24350064ca)
in the case where a cmdbuf was submitted with write access and the subsequent
batch promotes an op to unordered, it's important for associated barriers
to happen-before those ops to guarantee synchronization
the fixes tag is wrong on this, but it's all in the same release
Fixes: bf0af0f8ed ("zink: move all barrier-related functions to c++")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22907>
(cherry picked from commit 6452849b11)
Otherwise accesses to non-0 views of input attachments may be considered
out-of-bounds and return 0. This should've been removed when enabling
multiview for GMEM, not sure how it was missed.
Fixes: def56b531c ("tu: Support GMEM with layered rendering and multiview")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20304>
(cherry picked from commit f6902bf425)
We go to initialize the disk cache before we've compiled any shaders so
agx_compiler_debug is 0 at this point. Don't try to read it, instead go through
sa safe getter that will do the right thing.
Fixes: 5e9538c12e ("agx: isolate compiler debug flags")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
(cherry picked from commit e9b471d1b3)
With the following test :
dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.no_out_of_bounds_load
There is a :
shader_start:
... <- no control flow
g0 = some_alu
g1 = fbl
g2 = broadcast g3, g1
g4 = get_buffer_size g2
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
eliminate_find_live_channel will remove the fbl/broadcast because it
assumes lane0 is active at get_buffer_size :
shader_start:
... <- no control flow
g0 = some_alu
g4 = get_buffer_size g0
... <- no control flow
halt <- on some lanes
g5 = send <surface>, g4
But then the instruction scheduler will move the get_buffer_size after
the halt :
shader_start:
... <- no control flow
halt <- on some lanes
g0 = some_alu
g4 = get_buffer_size g0
g5 = send <surface>, g4
get_buffer_size pulls the surface index from lane0 in g0 which could
have been turned off by the halt and we end up accessing an invalid
surface handle.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20765>
(cherry picked from commit 9471ffa70a)
Fixes:
```
[829/1646] Compiling C object src/panfrost/vulkan/libpanvk_v6.a.p/panvk_vX_meta_clear.c.o
In function 'panvk_meta_clear_zs_img',
inlined from 'panvk_v6_CmdClearDepthStencilImage' at ../src/panfrost/vulkan/panvk_vX_meta_clear.c:457:7:
../src/panfrost/vulkan/panvk_vX_meta_clear.c:415:26: warning: storing the address of local variable 'view' in '((struct pan_fb_info *)((char *)commandBuffer + 144))[23].zs.view.zs' [-Wdangling-pointer=]
415 | fbinfo->zs.view.zs = &view;
| ~~~~~~~~~~~~~~~~~~~^~~~~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c: In function 'panvk_v6_CmdClearDepthStencilImage':
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'view' declared here
393 | struct pan_image_view view = {
| ^~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'commandBuffer' declared here
[844/1646] Compiling C object src/panfrost/vulkan/libpanvk_v7.a.p/panvk_vX_meta_clear.c.o
In function 'panvk_meta_clear_zs_img',
inlined from 'panvk_v7_CmdClearDepthStencilImage' at ../src/panfrost/vulkan/panvk_vX_meta_clear.c:457:7:
../src/panfrost/vulkan/panvk_vX_meta_clear.c:415:26: warning: storing the address of local variable 'view' in '((struct pan_fb_info *)((char *)commandBuffer + 144))[23].zs.view.zs' [-Wdangling-pointer=]
415 | fbinfo->zs.view.zs = &view;
| ~~~~~~~~~~~~~~~~~~~^~~~~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c: In function 'panvk_v7_CmdClearDepthStencilImage':
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'view' declared here
393 | struct pan_image_view view = {
| ^~~~
../src/panfrost/vulkan/panvk_vX_meta_clear.c:393:26: note: 'commandBuffer' declared here
```
Cc: mesa-stable
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22829>
(cherry picked from commit 2b4ce498ee)
../src/microsoft/vulkan/dzn_image.c: In function ‘dzn_GetImageMemoryRequirements2’:
../src/microsoft/vulkan/dzn_image.c:918:91: error: passing argument 6 of ‘dzn_ID3D12Device12_GetResourceAllocationInfo3’ from incompatible pointer type [-Werror=incompatible-pointer-types]
918 | &image->castable_format_count, &image->castable_formats,
| ^~~~~~~~~~~~~~~~~~~~~~~~
| |
| DXGI_FORMAT **
In file included from ../src/microsoft/vulkan/dzn_private.h:67,
from ../src/microsoft/vulkan/dzn_image.c:24:
../src/microsoft/vulkan/dzn_abi_helper.h:64:107: note: expected ‘const DXGI_FORMAT * const*’ but argument is of type ‘DXGI_FORMAT **’
64 | const UINT *num_castable_formats, const DXGI_FORMAT *const *castable_formats,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
ninja: build stopped: subcommand failed.
Fixes: 71dbb3120a ("dzn: Use GetResourceAllocationInfo3 for castable formats")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22877>
(cherry picked from commit cb4e4fc5de)
These are unneeded. Events can't be used to indefinitely stall work
like you can with a semaphore or timeline semaphore. The signals
still need to happen so that execution will modify the state that
can be polled from the CPU though.
Fixes dEQP-VK.synchronization.basic.event.single_submit_multi_command_buffer
Fixes: 04fa6c71 ("dzn: Batch command lists together")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22842>
(cherry picked from commit be34257197)
with tc, query results can be fetched async, and it's impossible to
sync tc in this scenario. to avoid needing to sync when a sync is not
possible, sync ahead of time in all cases
Fixes: 7c96e98975 ("zink: always start/stop/resume queries inside renderpasses")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22827>
(cherry picked from commit 2e3ce614b9)
UE4's Vulkan backend uses vkCmdWriteTimestamp with TOP_OF_PIPE
to measure how long a workload took in the GPU Benchmark. This is wrong
and writes the timestamp before the workload is actually finished,
making it seem like the GPU is much faster than it actually is.
This caused subsequent benchmark passes to contain way too big workloads,
which caused soft hangs on slower GPUs.
Fixes GPU hangs with Splitgate during automatic settings configuration.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22823>
(cherry picked from commit 0b251d4362)
This change makes usage bits verification closer to the Vulkan spec.
i.e. VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT does not always require all formats
to support all the requested usage bits.
Also, VK_IMAGE_CREATE_EXTENDED_USAGE_BIT, when combined with
VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT can relax the requirements for the
usage supported by the original image format.
v2: Removed strict verification of the format_list_info formats usage
per chadversary's suggestion. Other minor style/comments tweaks.
v3: Added checking of all compatible formats when
VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT and VK_IMAGE_CREATE_EXTENDED_USAGE_BIT
are specified, but no list of possible formats was given.
v4: Add VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT handling.
Cc: 22.2 <mesa-stable>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6031
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17182>
(cherry picked from commit 697ed61e7c)
This value depends on the per-sample value which can be unknown at
compile time with graphics pipeline libraries. So we need to have this
dynamic has well and pick the right value when generating the
3DSTATE_PS/3DSTATE_WM packet.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d8dfd153c5 ("intel/fs: Make per-sample and coarse dispatch tri-state")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22728>
(cherry picked from commit 5489033fa8)
Indeed, these references are not freed.
For instance, this issue is triggered on an evergreen card with
"piglit/bin/shader_runner tests/spec/arb_shader_atomic_counter_ops/execution/all_touch_test.shader_test -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 06993e4ee3 ("r600: add support for hw atomic counters. (v3)")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22798>
(cherry picked from commit 4ca8be82d5)
Indeed, the objects are not freed when the function returns NULL.
"psurf->texture = tex;" is redundant with
"pipe_resource_reference(&psurf->texture, tex);".
For instance, this issue is triggered with
"piglit/bin/ext_texture_array-compressed teximage pbo -fbo -auto"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22799>
(cherry picked from commit d615dfca40)
We track fences in a global list and have a per context "current" fence
which we randomly attach things to. If we take such a fence and emit it
without also creating a new fence for future tasks we can get out of sync
leading to random failures.
Some of our queries could trigger such cases and even though this issues
appears to be triggered by the MT rework, I'm convinced that this was only
made more visible by those fixes and we had this bug lurking for quite a
while.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7429
Fixes: df0a4d02f2 ("nvc0: make state handling race free")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Acked-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22722>
(cherry picked from commit 37c6c5c624)
It might happen that a raw data object (from pipeline cache creation)
was never looked up, and thus never deserialized, before it gets
inserted again into the cache. In this case, the deserialized object
got replaced by the raw data object.
Instead, replace the raw data object with the real object in the cache.
Fixes: 8b13ee75ba ('vulkan: Fall back to raw data objects when deserializing if ops == NULL')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22735>
(cherry picked from commit cbab396f54)
The Vulkan spec says:
"If the depth clamping state is changed dynamically, and the pipeline
was not created with VK_DYNAMIC_STATE_DEPTH_CLIP_ENABLE_EXT enabled,
then depth clipping is enabled when depth clamping is disabled and
vice versa"
Fixes: e48c0fbd8f ("radv: add support for dynamic depth clamp enable")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22777>
(cherry picked from commit e25e4c81de)
Commit f9a074dd55 ("dri2/android: Bypass throttling") dropped
unnecessary throtting in the SwapBuffers() path for android. But
unfortunately MSAA resolve got tangled up in the throttle reason
flag. So add a new flag that indicates "no throttingling, but yes
please do MSAA resolve".
Fixes: f9a074dd55 ("dri2/android: Bypass throttling")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22719>
(cherry picked from commit 08ffa8e0d2)
Indeed, the hardcoded framebuffer cleanup doesn't handle "resolve".
For instance, this issue is triggered with "piglit/bin/glx-copy-sub-buffer -samples=2 -auto"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: f5bde99cbd ("gallium: plumb resolve attachments through from frontends -> pipe_framebuffer_state")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22554>
(cherry picked from commit c39a2d67ea)
0: KIL -none.1111
Negate is not allowed for texturing opcodes, so the incorrect swizzle
was detected, however later optimization, where we try to rewrite incorrect
swizzles from constant (immediate) registers by adding a new ones with
correct order was interfering and not handling this correctly, so we
ended with
CONST[0] = { -1.0000 -1.0000 -1.0000 -1.0000 }
0: KIL const[0].xyz-w;
Even if it would get the swizzle right, texturing opcodes can't read from
constant registers, so just skip it and let this be handled by a later
part which inserts an extra mov instead.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Fixes: a8e1e5b5c2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22704>
(cherry picked from commit db6c3cd13d)
Instead of forcing vertex buffer stride to be 4 byte aligned only,
DX10 actually allows the stride to be non 4-byte aligned but the
alignment of an element must be the nearest power of 2 greater or equal to the
width of the element's format, or 4, whichever is less. So the requirement is
better met with PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY which if set to
TRUE, the sum of vertex element offset + vertex buffer offset + vertex buffer
stride must be aligned to the vertex attributes component size.
Note: PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY cannot be set
with other alignment-requiring CAPs, so we have to return 0 for all the
other alignement CAPs.
This avoids some unnecessary software vertex translate fallback.
cc: mesa-stable
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22689>
(cherry picked from commit c661f38342)
Indeed, the function nir_to_tgsi() returns an ureg_get_tokens() allocated
object which is assigned locally. The ureg_get_tokens() allocated object
should be freed.
For instance, this issue is triggered with a llvm enabled lima,
"piglit/bin/gl-1.0-rendermode-feedback -auto -fbo":
Direct leak of 512 byte(s) in 1 object(s) allocated from:
#0 0x7faeaa4500 in __interceptor_realloc (/usr/lib64/libasan.so.6+0xa4500)
#1 0x7fa4a88f1c in tokens_expand ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:239
#2 0x7fa4a88f1c in get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:262
#3 0x7fa4a900f4 in copy_instructions ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2079
#4 0x7fa4a900f4 in ureg_finalize ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2129
#5 0x7fa4a91dfc in ureg_get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2206
#6 0x7fa4b20a2c in nir_to_tgsi_options ../src/gallium/auxiliary/nir/nir_to_tgsi.c:4011
#7 0x7fa4a0c914 in draw_create_vertex_shader ../src/gallium/auxiliary/draw/draw_vs.c:77
Fixes: b5e782f5f4 ("aux/draw: use nir_to_tgsi for draw shader in llvm path")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21924>
(cherry picked from commit 6a8e6716ac)
We can now no longer rely on certain dirty bits to re-trigger draw time
resource tracking. We need to use the new fd_dirty*_resource() APIs.
Fixes `org.skia.skqp.SkQPRunner#gles_recordopts` on android 9.
Fixes: 0a62a874fc ("freedreno: Re-work dirty-resource tracking")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22683>
(cherry picked from commit d437e389e0)
Previously it was assumed that between the and the variable there was
only one deref.
To handle all cases a new function is introduced that recreates a chain
of derefs.
Fixes: 5a4083349f ("zink: add provoking vertex mode lowering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22678>
(cherry picked from commit 39770c6503)
The first attempt at the sprintf would have consumed part of va, so if
we're going to recurse on overflow to try again in a new allocation then
we have to do our work on a copy.
This was a common failure mode for MESA_GLSL=source, where it would just print:
Mesa: info: GLSL source for fragment shader 1:
Mesa: info: (null)
Fixes: 7a18a1712a ("util/log: improve logger_file newline handling")
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22618>
(cherry picked from commit 7f99cbf25e)
The VK pipeline cache passes a NULL bytes with a nonzero size to a
NULL-data blob to set up the size of the blob. In this case, we don't
actually execute the memcpy, so the non-existent "bytes" doesn't need to
have defined contents. Avoids a valgrind warning:
==972858== Unaddressable byte(s) found during client check request
==972858== at 0x147F4166: blob_write_bytes (blob.c:165)
==972858== by 0x147F4166: blob_write_bytes (blob.c:158)
==972858== by 0x14695FFF: vk_pipeline_cache_object_serialize (vk_pipeline_cache.c:240)
[...]
==972858== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Fixes: 591da98779 ("vulkan: Add a common VkPipelineCache implementation")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22617>
(cherry picked from commit ae2784b832)
Before testing the waddr for SFU we should first validate this
is indeed a valid (not NOP) magic write. Use the helper we have for
this which gets this right.
total instructions in shared programs: 12898957 -> 12850958 (-0.37%)
instructions in affected programs: 4328937 -> 4280938 (-1.11%)
helped: 19974
HURT: 439
Instructions are helped.
total max-temps in shared programs: 2211503 -> 2210893 (-0.03%)
max-temps in affected programs: 12924 -> 12314 (-4.72%)
helped: 509
HURT: 20
Max-temps are helped.
total sfu-stalls in shared programs: 22233 -> 21975 (-1.16%)
sfu-stalls in affected programs: 722 -> 464 (-35.73%)
helped: 297
HURT: 54
Sfu-stalls are helped.
total inst-and-stalls in shared programs: 12921190 -> 12872933 (-0.37%)
inst-and-stalls in affected programs: 4337977 -> 4289720 (-1.11%)
helped: 20015
HURT: 404
Inst-and-stalls are helped.
total nops in shared programs: 333743 -> 305911 (-8.34%)
nops in affected programs: 86902 -> 59070 (-32.03%)
helped: 14545
HURT: 76
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22593>
(cherry picked from commit 18a3a0d915)
Previously, whenever a vertex was emitted immediately after emitting a
primitive, that vertex would not use the attributes that where assigned
last because the position variable got set.
Now the temporary attributes array is treated as a ring buffer and
whenever the position is set to 0 it's previous value is used as an
offset when accessing it. This way when a new primitive is created the
attributes at index 0 correspond to the last attributes written.
Fixes: 5a4083349f ("zink: add provoking vertex mode lowering")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22599>
(cherry picked from commit 89077b866c)
otherwise the pool is freed before the query and zink will
give the vulkan driver NULL query pool which can make it crash.
this was seen when running the following cases with
primitivesGeneratedQueryWithRasterizerDiscard and color write
features disabled:
dEQP-GL45.functional.tessellation.invariance.outer_triangle_set.triangles_fractional_odd_spacing
dEQP-GL45.functional.tessellation.invariance.outer_triangle_set.triangles_fractional_even_spacing
dEQP-GL45.functional.tessellation.invariance.outer_triangle_set.quads_equal_spacing
dEQP-GL45.functional.tessellation.invariance.outer_triangle_set.quads_fractional_odd_spacing
dEQP-GL45.functional.tessellation.invariance.outer_triangle_set.quads_fractional_even_spacing
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_equal_spacing_ccw
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_equal_spacing_ccw_point_mode
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_equal_spacing_cw
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_equal_spacing_cw_point_mode
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_fractional_odd_spacing_ccw
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_fractional_odd_spacing_ccw_point_mode
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_fractional_odd_spacing_cw
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_fractional_odd_spacing_cw_point_mode
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_fractional_even_spacing_ccw
dEQP-GL45.functional.tessellation.invariance.tess_coord_component_range.triangles_fractional_even_spacing_ccw_point_mode
Fixes: e5d517f362 ("zink: rework query pool overflow")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22575>
(cherry picked from commit 3291050cc1)
Initially this was just adding a missing popd, but actually there's no
reason to pushd into the build dir, so let's just pass the build dir as
arguments to cmake & ninja instead.
`--arch x64` was also dropped as it only applies to Windows builds,
which this script doesn't support anyway.
Fixes: 512f1c160a ("ci/zink: Add coverage using the vulkan validation layer on lvp.")
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22488>
(cherry picked from commit 3c64f3dcbc)
This is really noticeable for games that resolve a bunch of occlusion
queries (in this case 4096) because it seems that emitting 4096
WAIT_REG_MEM packets can stall more than expected. Fixes this by
waiting for queries in the resolve query shader.
This improves performance of an unreleased game by +~10% (71->78 FPS).
RADV should now be really close to Windows performance for that title.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22630>
Previously, continue preambles and postambles were added directly to
the CS array which means all BOs were correctly added to the BO list,
and this has been broken recently. IB BOs need to be added to the list.
When a BO isn't added to the list as part of a submission, it might
randomly VM faults.
This fixes VM faults and random GPU hangs on NAVI21 in Mesa CI.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8849
Fixes: 41a9bced31 ("radv: Fill continue preambles and postambles properly.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22625>
(cherry picked from commit 84d8ea6e2b)
For MTL (verx10 == 125), float64 is supported, but int64 is not.
Therefore we need to lower cluster broadcast using 32-bit int ops.
For gfx12.5+ platforms that support int64, the register regions
used by cluster broadcast aren't supported by the 64-bit pipeline.
On MTL, dEQP-VK.subgroups.clustered.*_double* and
dEQP-VK.subgroups.clustered.*_dvec* were failing to validate the
compiled shader in debug mode, and reportedly gpu-hanging in release
mode.
With this change dEQP-VK.subgroups.clustered.*_double* passed all 48
tests and dEQP-VK.subgroups.clustered.*_dvec* passed all 140 tests on
MTL.
Rework:
* Move from generator to brw_fs_lower_regioning.cpp. (Suggested by
Francisco)
* Apply to verx10 >= 125.. (Suggested by Francisco)
Cc: 23.1 <mesa-stable>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> (v1)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22569>
(cherry picked from commit fcb72ffd0c)
So far we were only considering the number of vertices to draw to
compute the offset in a stream output buffer.
But this is not correct, as it depends on the primitive type too. For
instance, with 4 vertices, if we use a triangle strip primitive, then 2
triangles are generated from those 4 vertices, so 6 vertices will be
captured.
This fixes spec@!opengl es
3.0@gles-3.0-transform-feedback-uniform-buffer-object.
CC: 23.1
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22607>
(cherry picked from commit a86d18a8c4)
It appears to be possible that IDLE is observed before COMPLETE.
In this case, an application may access present_id in subsequent
QueuePresentKHR and race against the fence worker reading present_id.
Solve this by adding a separate signal_present_id that is used when
completing to avoid the race.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 32f7ff2c20)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22637>
The buffer max_index value in translate_generic struct is relevant for
indexed draw only. So do not clamp the element index in generic_run() as it
is called for non-indexed draw only.
This patch passes index_size to the common generic_run_one function
so index clamping is only performed when a non-zero index_size is specified.
This fixes a text selection bug with kitty terminal emulator running on ARM
when it falls back to the generic translate path for unsigned byte vertex
array.
cc: mesa-stable
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22568>
(cherry picked from commit 13e885842a)
if an attachment other than the msrtss blit attachment has clears pending,
unbinding the other attachment will trigger a clear flush, which will then
recurse into the msrtss blit that's being triggered
instead, save/restore these clears around the msrtss blit since they
can be executed during the normal renderpass
(cherry picked from commit 82add9f2e9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22598>
with the new zsbuf elimination handling, the fb state calculated in
u_blitter's fb restore may be incorrect if the zsbuf has indeed been
eliminated, so ensure the right fb is stored to be reapplied so that
misrenders will be avoided
fixes some crashes/misrenders in webgl
(cherry picked from commit ec0860b401)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22598>
To implement some of the features of the layer, we need to enable some
of the feature bits at device/command_buffer creation. To do so, we
need to edit some of the structures coming from the application. Most
of those are const so we need to clone them before edition.
This change disables some of the layer features if we run into a
situation where one of the structure we need to clone is unknown such
that we can't make a copy of it (since we don't know its size).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7677
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19897>
(cherry picked from commit b30a75a195)
Indeed, the reference was overwritten.
For instance, this issue is triggered with:
"piglit/bin/shader_runner tests/spec/arb_shader_image_load_store/execution/write-to-rendered-image.shader_test -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: a6b3792843 ("r600: add core pieces of image support.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22394>
(cherry picked from commit 4f42d3b843)
This is the analogous of
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9490 but for
r600.
Discoloration of NV12 video frames was observed in Chrome/ChromeOS and
the problem was tracked down to the fact that Mesa was following the
PIPE_FORMAT_R8_G8B8_420_UNORM/lower_yuv_external() path. The symptom is
that (for an unknown reason) the YUV-to-RGB conversion is using the
value of Y as the value of Y, U, and V. So, for example, if the input
value is YUV = (50, 120, 130), then what actually gets converted to RGB
is YUV = (50, 50, 50).
Considering that PIPE_FORMAT_R8_G8B8_420_UNORM was introduced for
freedreno
(https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6693) and it
is already being reported as unsupported for radeonsi, it's reasonable
to assume that GPUs targeted by r600 don't support this path either.
Note: I tested this patch with an AMD Palm device which follows the
evergreen_is_format_supported() path. I did not have access to a device
to test the r600_is_format_supported() path.
v2: Changed >= 2 to > 1.
Fixes: 826a10255f ("st/mesa: Add NV12 lowering to PIPE_FORMAT_R8_G8B8_420_UNORM")
Tested-by: Andres Calderon Jaramillo <andrescj@chromium.org>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22511>
(cherry picked from commit 4405e8a9e1)
previously there was a fallback path here (broken by f6d3a5755f)
which would attempt to demote BAR allocations to other heaps on failure
to avoid oom
this was great, but it's not the most robust solution, which is to iterate
all the memory types matching the given heap and try them in addition to having
a demotion fallback
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22567>
This fixes some CTS compiler tests where they relied on the cl_kernel
object to be released in time so it can recompile a program without
throwing CL_INVALID_OPERATION due to still having active kernel objects.
Fixes: 47a80d7ff4 ("rusticl/event: proper eventing support")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22510>
(cherry picked from commit 60cfe15d79)
For instance, this issue is triggered with: "piglit/bin/useprogram-flushverts-2 -auto -fbo" or
"piglit/bin/primitive-restart-draw-mode line_loop -auto"
while setting GALLIUM_REFCNT_LOG=refcnt.log.
Fixes: 27dcb46629 ("gallium: add take_ownership param into set_vertex_buffers to eliminate atomics")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22395>
(cherry picked from commit 28cb33fada)
With the on_demand shaders feature, meta pipelines are only created
when they are used, otherwise they are NULL. Though, inside secondary
cmdbuffers, the graphics pipeline might be also NULL. In this specific
case, radv_is_{dcc,fmask}_decompress_pipeline() would return
TRUE because these pipelines are NULL too...
This fixes flakes with tests that use secondary cmdbuffers with
TC-compat images.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22440>
(cherry picked from commit 0b4e7491f3)
This feature is supported, but enabling it accidentally enables
OpenGL 3.0 & 3.1, which are not supported.
The feature is enabled in main and discussion to fix the issue is
underway, but we're disabling it in releases until we find a better
solution.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22260>
2023-04-13 23:16:11 +01:00
6657 changed files with 445206 additions and 927832 deletions
# Bring artifacts back from the NFS dir to the build dir where gitlab-runner
# will look for them.
cp -Rp /nfs/results/. results/
if[ -f "${STRUCTURED_LOG_FILE}"];then
cp -p ${STRUCTURED_LOG_FILE} results/
echo"Structured log file is available at https://${CI_PROJECT_ROOT_NAMESPACE}.pages.freedesktop.org/-/${CI_PROJECT_NAME}/-/jobs/${CI_JOB_ID}/artifacts/results/${STRUCTURED_LOG_FILE}"
# Ephemeral packages (installed for this script and removed again at the end)
EPHEMERAL=(
libssl-dev
)
STABLE_EPHEMERAL="libssl-dev"
apt-get -y install "${EPHEMERAL[@]}"
apt-get -y install "$STABLE_EPHEMERAL"
. .gitlab-ci/container/build-mold.sh
apt-get purge -y "${EPHEMERAL[@]}"
apt-get purge -y "$STABLE_EPHEMERAL"
. .gitlab-ci/container/cross_build.sh
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.