The workaround address is used as a source for push constants when
there's no resource available, that address must be 32B aligned.
This fixes invalid address being used for buffers in
3DSTATE_CONSTANT_* packets.
Now that intel_debug_write_identifiers() already add the padding,
there's no need to include extra "+ 8" to the offset.
Thanks to Xiaoming Wang that contributed to find and fix this issue.
Fixes: 2a4c361b06 ("iris: add identifier BO")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21479>
(cherry picked from commit a4a0417263)
When establishing a texture transfer map, if there is any pending changes on the
texture, instead of trying direct map with DONTBLOCK first, just
use the upload buffer path.
Fixes piglit tests gen-teximages, arb_copy_images-formats
Cc: mesa-stable
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
(cherry picked from commit 1b9b060f0e)
When an EGLImage is created from a 2D texture and used for texture sharing,
the texture surface might not have been created with the SHARED bind flag.
To allow these surfaces for sharing, this patch sets the USAGE SHARED bit
for surfaces that can be potentially used for sharing even when the SHARED
bind flag is not originally set. Instead of unconditionally enabling the
SHARED bind flag for all surfaces and unnecessarily bypass the surface cache
optimization, this patch only enables the USAGE SHARED bit for surfaces
that also have the RENDER TARGET bind flag.
When the surface handle is inquired and if the surface is currently
marked as cachable, we will need to unset the cachable bit so
the surface handle will not be recycled again.
This patch fixes an assertion in svga_resource_get_handle() when the
EGL_MESA_image_dma_buf_export extension is used in webgl benchamrk running
from firefox in Fedora 37.
Cc: mesa-stable
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
(cherry picked from commit 75b7296fc3)
This change is wrong for two reasons:
- it hangs most of the time maybe, because changing PSTATE when the
application is running is broken somehow
- it increases the time between triggering and generating the capture
considerably, because there is a delay for changing PSTATE
This restores previous logic where PSTATE is set to profile_peak at
logical device creation. Though, it also re-introduces an issue with
multiple logical devices (kernel returns -EBUSY) but this will be
fixed in the next commit.
This fixes GPU hangs when trying to record RGP captures on my NAVI21.
Note that profile_peak is only required for some RDNA2 chips (including
VanGogh).
Cc: mesa-stable
This reverts commit 923a864d94.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21222>
(cherry picked from commit 663877e894)
We loop, effectively, over two stacks: ready and to_do and finish only
when both are empty. In the case where ready is empty, we pull one off
of to_do, add a copy to a temporary, and push it onto the ready stack.
Previously, we assumed that we would never get to the temporary copy
case if to_do has exactly one entry because that would imply that there
was only one copy left which means there can't possibly be a cycle to
break. This was true until c7fc44f9eb ("nir/from_ssa: Respect and
populate divergence information") which changed things such that
temporary copies sometimes get added in the case where a convergent
value is copied both to convergent and divergent destinations.
This patch adjusts our loop iteration to always attempt to clear the
ready stack before checking if there's anything left on the to_do stack.
I also added an assert to make the exit condition more clear.
Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 4e09d37f3b)
There is an optimization in the parallel copy algorithm where, after a
copy has been performed, we can treat the destination as the new source
for future copies of the same source. In particular, consider the
following parallel copy: A -> B, C -> A, A -> C. In this case, after we
have done the A -> B copy, we can make note that the value in A is now
in B and emit the sequence: A -> B, C -> A, B -> C. This allows us to
resolve the swap cycle between A anc C without allocating a temporary
register because we know B is also a copy of A.
When one of the registers involved is convergent and the other is
divergent, this optimization is problematic because, while convergent to
divergent copies are fine, we can't re-use the divergent copy in later
copies if any of those copies are to a convergent variable. We could,
but it would require a read_first_invocation which would get messy. In
In c7fc44f9eb ("nir/from_ssa: Respect and populate divergence
information"), we attempted to deal with this by limiting the rename
optimization to the case where the divergence matched.
The problem is that we did the re-name part whenever the divergence
matched but only marked it as ready if the thing being copied was a
destination. (We actually left two instances of loc[a] = b, one which
always happened and one which only happened if we also wanted to flag
the source as being ready to use as a destination.) While this
technically doesn't cause any problems, it may result in more inter-mov
dependencies which hurts instruction scheduling. For example, if we had
the parallel copy A -> B, A -> C, A -> D, we now end up emitting the
sequence A -> B, B -> C, C -> D which has many more data hazards between
instructions caused by the constant shuffling.
This commit restores the original logic in which we only perform the
rename optimization if the rename would free up a register we will later
use as a destination. This isn't entirely optimal as it still doesn't
prove that there is a cycle involved first, but it should lead to a
reduction in unnecessary dependencies.
No shader-db changes on SKL or DG2
Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
(cherry picked from commit 5afba073c6)
base_mip_width is used in si_compute_copy_image when the
SI_IMAGE_ACCESS_BLOCK_FORMAT_AS_UINT flag is used.
width = tex->surface.u.gfx9.base_mip_width;
This will be incorrect if we don't adjust it. For instance,
with a 260x256 image, surf_pitch and base_mip_width are
320 before surf_pitch is updated to be 192.
Both need to match, or computing the width from base_mip_width
leads to incorrect result.
Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21253>
(cherry picked from commit affa8a9fb2)