Compare commits

45 Commits

Author SHA1 Message Date
Dylan Baker
df2977f871 Bump version for 20.2.2 release 2020-11-06 15:40:35 -08:00
Dylan Baker
92a401d4a3 docs: add release notes for 20.2.2 2020-11-06 15:40:06 -08:00
Lionel Landwerlin
f17e6fcb7a blorp: allow blits with floating point source layers
The current blorp API only allows source layers for 3D images to be
integers. That is causing problems with the Vulkan API where we need
to be able to use a 3D layer that could be in between 2 layers.

This change allows a floating point value to be passed for blits and
internally sets up the input parameters to pass floating point values
to kernels.

v2: Use tex op to determine what types the coordinates are (Jason)
    Drop setting params->z (Lionel)

v3: Fix nir_texop_txf_ms_mcs op not considered as having integer coords (Lionel)

v4: Fix incorrect test on nir_texop_txf_ms_mcs (Ivan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3458
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
(cherry picked from commit 87934f02f9)
2020-11-02 07:48:29 -08:00
Lionel Landwerlin
35b93d6f8a anv: fix source/destination layers for 3D blits
When blitting from source depth range [0-3] into destination depth
range [0-2], we'll have to use a source layer that is in between 2
layers of the 3D source image.

Other than having an incorrect formula, we're also using integers, which
prevents us from using the right source layer.

v2: Drop + 0.5 on application offsets

v3: Reuse num_layers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3458
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
(cherry picked from commit ea32691257)
2020-11-02 07:48:26 -08:00
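To illustrate why an integer source layer is not enough for the [0-3] to [0-2] case described above, the sketch below maps the centre of each destination slice back into the source depth range. The function names and the centre-sampling convention are assumptions made for this example; this is not the formula anv or blorp actually uses.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: map the centre of destination slice `dst_slice`
 * (0-based, out of `dst_depth` slices) to a coordinate inside a source
 * range of `src_depth` slices that starts at `src_start`. */
float
blit_src_z(float src_start, float src_depth, float dst_depth, unsigned dst_slice)
{
   assert(dst_depth > 0.0f);
   float scale = src_depth / dst_depth;
   return src_start + ((float)dst_slice + 0.5f) * scale;
}

/* Print the source coordinate chosen for every destination slice. */
void
blit_src_z_demo(void)
{
   /* 3 source slices squeezed into 2 destination slices. */
   for (unsigned z = 0; z < 2; z++)
      printf("dst slice %u -> src z %.2f\n", z, blit_src_z(0.0f, 3.0f, 2.0f, z));
}
```

With three source slices and two destination slices, both results (0.75 and 2.25) fall between slice centres, which is the situation an integer layer index cannot express.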
Marek Olšák
a7e169380d winsys/amdgpu: remove incorrect assertion check against max_check_space_size
Fixes: 114a899cc8 "winsys/amdgpu: cs_check_space sets the minimum IB size for future IBs"

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7056>
(cherry picked from commit 095ee8f867)
2020-11-02 07:48:26 -08:00
Bas Nieuwenhuizen
52ef9c22a0 radv: Fix variable name collision.
idx was aliased, and eb104e949e started
using the outer var in the inner scope ...

Fixes: eb104e949e
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3701
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7388>
(cherry picked from commit 8943c80c9b)
2020-11-02 07:48:26 -08:00
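The kind of collision described above can be reproduced in a few lines. This is a simplified sketch with made-up function names, not the radv code: an inner `idx` shadows the outer one, so code in the inner scope that means "the set index" silently reads the wrong variable. Renaming the outer variable removes the ambiguity.

```c
#include <assert.h>

/* Buggy shape: the inner `idx` hides the outer `idx`. */
unsigned
sum_shadowed(unsigned first_set, unsigned count)
{
   unsigned sum = 0;
   for (unsigned i = 0; i < count; i++) {
      unsigned idx = i + first_set;   /* outer: set index */
      for (unsigned j = 0; j < 2; j++) {
         unsigned idx = j;            /* inner: shadows the outer idx */
         sum += idx;                  /* intended the set index, gets j */
      }
      (void)idx;
   }
   return sum;
}

/* Fixed shape: distinct names, so the inner scope can use both. */
unsigned
sum_renamed(unsigned first_set, unsigned count)
{
   unsigned sum = 0;
   for (unsigned i = 0; i < count; i++) {
      unsigned set_idx = i + first_set;
      for (unsigned j = 0; j < 2; j++) {
         unsigned idx = j;
         (void)idx;
         sum += set_idx;              /* unambiguous */
      }
   }
   return sum;
}

void
shadow_demo(void)
{
   assert(sum_shadowed(1, 2) == 2);   /* (0 + 1) per set, two sets */
   assert(sum_renamed(1, 2) == 6);    /* 1 + 1 + 2 + 2 */
}
```

Compilers can flag this pattern with `-Wshadow`, which is not part of `-Wall` in GCC or Clang.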
Michael Tretter
e8b625bc96 etnaviv: free tgsi tokens when shader state is deleted
The tokens are allocated using tgsi_dup_tokens when the shader state is
created, so we need to free them explicitly when deleting the shader state.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7367>
(cherry picked from commit 98db7c4841)
2020-11-02 07:48:26 -08:00
Lucas Stach
25bc222815 etnaviv: blt: properly program surface TS offset for clears
We clear the wrong TS region for != level 0 surfaces or TS buffers
with an internal offset.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7367>
(cherry picked from commit 3ba753d9f5)
2020-11-02 07:48:26 -08:00
Lucas Stach
fd3c49bb78 etnaviv: drm: fix BO refcount race
There is a race where the BO refcount might drop to 0 before the
dmabuf/name import paths had a chance to grab a reference for a
BO found in the handle_table. The easiest solution is to keep the
refcount stable as long as the table_lock is held.

While a more involved scheme of rechecking the refcount before
actually destroying the BO might also work, the bo_del path isn't
called very often, so optimizing away a single mutex_lock would be
over-engineering; go for the easy solution.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7367>
(cherry picked from commit 866bb22d6b)
2020-11-02 07:48:26 -08:00
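The "easy solution" from the message above can be sketched as follows. The types, the one-slot table, and the function names are hypothetical stand-ins rather than etnaviv's real data structures; the point is only that the decrement and the table removal happen under the same lock the import path takes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for a BO and a one-slot handle table. */
struct bo {
   int refcnt;
   bool destroyed;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct bo *table_slot;

/* Import path: look the BO up and take a reference while the lock is held. */
struct bo *
bo_lookup_ref(void)
{
   pthread_mutex_lock(&table_lock);
   struct bo *bo = table_slot;
   if (bo)
      bo->refcnt++;
   pthread_mutex_unlock(&table_lock);
   return bo;
}

/* Release path: decrement and table removal share the lock, so a lookup
 * can never find a BO whose count has already dropped to zero. */
void
bo_unref(struct bo *bo)
{
   pthread_mutex_lock(&table_lock);
   if (--bo->refcnt == 0) {
      if (table_slot == bo)
         table_slot = NULL;
      bo->destroyed = true; /* a real driver would free the buffer here */
   }
   pthread_mutex_unlock(&table_lock);
}

void
bo_refcount_demo(void)
{
   struct bo bo = { .refcnt = 1, .destroyed = false };
   table_slot = &bo;

   struct bo *imported = bo_lookup_ref();
   assert(imported == &bo && bo.refcnt == 2);

   bo_unref(&bo);
   assert(!bo.destroyed && table_slot == &bo);

   bo_unref(imported);
   assert(bo.destroyed && table_slot == NULL);
}
```

The cost is one extra mutex acquisition on the release path, which the commit message argues is acceptable because that path is rarely taken.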
Lionel Landwerlin
833d68899a intel/dev: Bump Max EU per subslice/dualsubslice
This isn't a problem right now because the previous max would give the
same result when aligned to a byte (8bits).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7288>
(cherry picked from commit b03c86a71f)
2020-11-02 07:48:26 -08:00
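The remark about byte alignment can be checked with a small calculation. The numbers 10 and 16 below are illustrative maxima chosen for this sketch, not values taken from the commit, and `mask_bytes` is a hypothetical helper.

```c
#include <assert.h>

/* Bytes needed to hold a bitmask with `bits` entries, rounded up. */
unsigned
mask_bytes(unsigned bits)
{
   return (bits + 7) / 8;
}

void
mask_bytes_demo(void)
{
   assert(mask_bytes(10) == 2); /* a smaller maximum ... */
   assert(mask_bytes(16) == 2); /* ... and a larger one round to the same size */
   assert(mask_bytes(17) == 3); /* the size only changes past a byte boundary */
}
```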
Dave Airlie
f401af6f18 gallivm: zero init the temporary register storage.
Due to flow control we can end up with random values in here having
side effects.

This fixes a crash in gtk4-demo.

Fixes: 44a6b0107b ("gallivm: add nir->llvm translation (v2)")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7327>
(cherry picked from commit f7d1460418)
2020-11-02 07:48:26 -08:00
Marcin Ślusarz
3e2245c454 intel/tools: fix invalid type in argument to printf
$2 is exp2, exp2 is defined to be llint and llint is defined to be
unsigned long long int.

Fixes error reported by Coverity:
CID 1451141: Invalid type in argument to printf format specifier (PRINTF_ARGS)

Fixes: 70308a5a8a ("intel/tools: New i965 instruction assembler tool")

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7351>
(cherry picked from commit e96f33cd30)
2020-11-02 07:48:26 -08:00
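A minimal sketch of the class of fix described above: the conversion specifier has to match the argument type, so an `unsigned long long int` is printed with `%llu`. The `llint` typedef mirrors the commit message; `format_llint` is a hypothetical helper, not code from the assembler tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef unsigned long long int llint; /* as described in the commit message */

/* Format an llint using the specifier that matches its type. */
int
format_llint(char *buf, size_t len, llint value)
{
   return snprintf(buf, len, "%llu", value);
}

void
format_llint_demo(void)
{
   char buf[32];
   int n = format_llint(buf, sizeof(buf), 18446744073709551615ULL);
   assert(n == 20);
   assert(strcmp(buf, "18446744073709551615") == 0);
}
```

Passing the same value to a narrower specifier such as `%d` is undefined behaviour, which is what the Coverity PRINTF_ARGS checker reports.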
Bas Nieuwenhuizen
b540aaa0a8 radv: Do not access set layout during vkCmdBindDescriptorSets.
The spec says:

"
VkDescriptorSetLayout objects may be accessed by commands that operate on descriptor sets allocated using that layout
"

So our behavior is valid here, but this is a temporary workaround for an issue with Baldur's Gate 3.

CC: mesa-stable
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3607
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7207>
(cherry picked from commit eb104e949e)
2020-11-02 07:48:26 -08:00
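One way to avoid touching the set layout at bind time is to cache the needed values in the pipeline layout when it is created. The sketch below uses trimmed-down, hypothetical structs and a made-up `pipeline_layout_init` function; only the field names `dynamic_offset_start` and `dynamic_offset_count` follow the diff further down.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SETS 4
#define MAX_BINDINGS 4

/* Hypothetical, trimmed-down descriptor set layout. */
struct set_layout {
   uint32_t binding_count;
   struct {
      uint32_t array_size;
      uint32_t dynamic_offset_count;
   } binding[MAX_BINDINGS];
};

/* Pipeline layout that caches, per set, what bind time needs. */
struct pipeline_layout {
   uint32_t num_sets;
   uint32_t dynamic_offset_count;
   struct {
      uint16_t dynamic_offset_start;
      uint16_t dynamic_offset_count;
   } set[MAX_SETS];
};

/* Walk every set layout once, at pipeline-layout creation time. */
void
pipeline_layout_init(struct pipeline_layout *layout,
                     const struct set_layout *const *sets, uint32_t num_sets)
{
   assert(num_sets <= MAX_SETS);
   layout->num_sets = num_sets;
   layout->dynamic_offset_count = 0;

   for (uint32_t s = 0; s < num_sets; s++) {
      uint32_t count = 0;
      for (uint32_t b = 0; b < sets[s]->binding_count; b++)
         count += sets[s]->binding[b].array_size *
                  sets[s]->binding[b].dynamic_offset_count;

      layout->set[s].dynamic_offset_start = (uint16_t)layout->dynamic_offset_count;
      layout->set[s].dynamic_offset_count = (uint16_t)count;
      layout->dynamic_offset_count += count;
   }
}

void
pipeline_layout_demo(void)
{
   struct set_layout a = { .binding_count = 2, .binding = { { 2, 1 }, { 1, 0 } } };
   struct set_layout b = { .binding_count = 1, .binding = { { 3, 1 } } };
   const struct set_layout *sets[] = { &a, &b };
   struct pipeline_layout layout;

   pipeline_layout_init(&layout, sets, 2);
   assert(layout.set[0].dynamic_offset_start == 0);
   assert(layout.set[0].dynamic_offset_count == 2);
   assert(layout.set[1].dynamic_offset_start == 2);
   assert(layout.set[1].dynamic_offset_count == 3);
   assert(layout.dynamic_offset_count == 5);
}
```

After initialization, a bind call can loop over `layout.set[i].dynamic_offset_count` without dereferencing the set layout, which may no longer be alive in an application that leans on this behaviour.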
Bas Nieuwenhuizen
7ccf2cf839 radv: Fix 1D compressed mipmaps on GFX9.
Partial rollback as GFX9 really requires height = 1 to work.

The two substantial parts of the fix remaining:

1) Deal with views with multiple levels.
2) Limit the expansion to the base mip pitch/height. On GFX9 this
   is exactly equal to the surf_pitch that was used before. I've
   done some investigation to make sure that on GFX10 this always
   results in the right physical layout.

Remaining stupid question is how the actual extents for bounds
checking never end up too low when the size gets clamped, but
this change and the previous change don't change that ...

Fixes: 1fb3e1fb70 "radv: Fix mipmap extent adjustment on GFX9+."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7245>
(cherry picked from commit 29999e6b9d)
2020-11-02 07:48:26 -08:00
Rhys Perry
562e89ff0d aco: ignore the ACO-inserted continue in create_continue_phis()
Otherwise, for loops without continue_or_break, create_continue_phis()
always returns an undef operand.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 638cbc21a1 ("aco: handle when ACO adds new continue edges")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2848
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7148>
(cherry picked from commit 26e53e3afa)
2020-11-02 07:48:26 -08:00
Rhys Perry
1ddbe3aa11 aco: update phi_map in add_subdword_operand()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 56345b8c61 ("aco: allow reading/writing upper halves/bytes when possible")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
(cherry picked from commit d4503a9020)
2020-11-02 07:48:26 -08:00
Dylan Baker
63754d2b77 .pick_status.json: Update to 8077f3f4c4 2020-11-02 07:48:26 -08:00
Rob Clark
bfa8ac8c67 freedreno: Disallow tiled if SHARED and not QCOM_COMPRESSED
If the user is not aware of modifiers, and wants to allocate a shared
resource, we shouldn't leave them with tiled.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3678
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7308>
(cherry picked from commit 67238f95b5)
2020-11-02 07:48:26 -08:00
Tapani Pälli
5afe855fde iris: fix the order of src and dst for fence memcpy
This fixes random failures with "deqp-egl --deqp-case=*multithread*":
   iris: Failed to submit batchbuffer: No such file or directory

Fixes: 6b1a56b908 ("iris: Drop stale syncobj references in fence_server_sync")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7289>
(cherry picked from commit cb6ce4a265)
2020-11-02 07:48:26 -08:00
Michel Dänzer
cacca2aaa1 loader/dri3: Allocate up to 4 back buffers for page flips
With swap interval 0, i.e. sync-to-vblank disabled.

This can be necessary for unthrottled drawing with Xwayland:

1) One buffer can be scanned out
2) One buffer can be pending in the kernel for a page flip
3) One buffer can be pending in the Wayland compositor

Therefore, with 3 buffers, the frame-rate could be capped much lower
than the throughput the GPU is capable of, in the worst case at the
Wayland compositor refresh rate.

(The native Wayland EGL backend always uses up to 4 buffers)

Leave the maximum number of buffers at 3 for swap interval != 0, it's
sufficient in that case to always be able to queue one frame ahead of
time.

https://gitlab.gnome.org/GNOME/mutter/-/issues/1455
https://gitlab.gnome.org/GNOME/mutter/-/issues/1462

Cc: mesa-stable
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7033>
(cherry picked from commit 31e9de9c8a)
2020-11-02 07:48:26 -08:00
Michel Dänzer
fa73f34f79 loader/dri3: Keep current number of back buffers if frame was skipped
We'd previously take the copy path. If we were actually flipping (in
which case skipped frames are more likely to occur), we'd ping-pong
between a smaller and larger number of back buffers, and frame-rate
could vary / take a dip due to the buffer management overhead.

While I'm not sure this is actually possible to hit at this point, it
definitely will be with the next change.

Cc: mesa-stable
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7033>
(cherry picked from commit 16a7cc4d44)
2020-11-02 07:48:26 -08:00
Michel Dänzer
86a542b0ab loader/dri3: Only allocate additional buffers if needed
Previously, we would always allocate 3 buffers for page flipping. But 2
buffers can suffice for clients which always wait for buffer swaps to
complete before starting a new frame.

Therefore, keep track of the maximum number of buffers separately from
the current number, and only bump the latter if both current buffers are
busy.

Cc: mesa-stable
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7033>
(cherry picked from commit 60585fc4e3)
2020-11-02 07:48:26 -08:00
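The buffer policy described across the three loader/dri3 commits above can be sketched as a small state machine. The struct, the function names, and the starting count of two buffers are assumptions for illustration; only the limits (4 with swap interval 0, otherwise 3) come from the commit messages.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BACK 4

struct back_buffers {
   bool busy[MAX_BACK];
   int cur_num; /* buffers currently in rotation */
   int max_num; /* upper bound for the current swap mode */
};

/* Upper bound when page flipping: 4 with swap interval 0, else 3. */
int
max_back_for_interval(int swap_interval)
{
   return swap_interval == 0 ? 4 : 3;
}

/* Return the index of an idle buffer. The rotation only grows when every
 * current buffer is busy; returns -1 once the maximum is reached. */
int
find_back(struct back_buffers *b)
{
   for (int i = 0; i < b->cur_num; i++) {
      if (!b->busy[i])
         return i;
   }
   if (b->cur_num < b->max_num)
      return b->cur_num++;
   return -1;
}

void
back_buffers_demo(void)
{
   struct back_buffers b = { .cur_num = 2, .max_num = max_back_for_interval(0) };

   for (int expected = 0; expected < 4; expected++) {
      int id = find_back(&b);
      assert(id == expected);
      b.busy[id] = true;
   }
   assert(b.cur_num == 4);
   assert(find_back(&b) == -1); /* all four busy, cannot grow further */

   b.busy[1] = false;           /* a buffer was released */
   assert(find_back(&b) == 1);
}
```

A client that always waits for its swap to complete never sees both initial buffers busy, so it stays at two buffers.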
Samuel Pitoiset
e12a581a3a aco: fix determining if LOD is zero for nir_texop_txf/nir_texop_txs
txf/txs expect LOD to be a 32-bit unsigned integer while other
texture operations expect a float.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3668
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7256>
(cherry picked from commit 4e2fe34aa9)
2020-10-26 10:34:25 -07:00
Tapani Pälli
2f09ff3fc8 gallivm/nir: handle nir_op_flt in lp_build_nir_llvm
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3663
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7248>
(cherry picked from commit c83d6ffa32)
2020-10-26 10:34:24 -07:00
Dylan Baker
4f0bee6775 .pick_status.json: Update to b92eadb29c 2020-10-26 10:34:22 -07:00
Ryan Neph
17e7be7c79 virgl: Fixes portal2 binary name in tweak config
Portal 2 on virgl w/ GLES host requires bgraswz and emubgra tweaks.
Application binary name matching mismatch caused tweaks to default
to a disabled state.

Signed-off-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Fixes: 9760a7ed91 ("virgl: apply bgra dest swizzle and add Portal 2")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7246>
(cherry picked from commit b2c737cf57)
2020-10-21 16:16:34 -07:00
Thong Thai
b9693aa851 frontends/va/postproc: Un-break field flag
Fixes an issue where deinterlaced videos would play at half the
framerate, since only one field was repeated, instead of using both
fields. Reverts a change I made previously which broke this.

Fixes: 78786a219e ("frontends/va: Fix deinterlace bottom field first flag")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3621
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7194>
(cherry picked from commit 354e375c9c)
2020-10-21 16:16:32 -07:00
Timothy Arceri
a1f8f5e241 glsl: relax rule on varying matching for shaders older than 4.00
Please see the new code comment for full justification.

Fixes: 18004c338f ("glsl: fail when a shader's input var has not an equivalent out var in previous")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3648

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7184>
(cherry picked from commit c54c42321e)
2020-10-21 16:16:30 -07:00
Dylan Baker
bfa702dd0b .pick_status.json: Update to 025050bae7 2020-10-21 16:16:29 -07:00
Marcin Ślusarz
0bc9a81733 vulkan/wsi: fix possible random stalls in wsi_display_wait_for_event
pthread_cond_broadcast man page says this:
"The pthread_cond_broadcast() or pthread_cond_signal() functions may
 be called by a thread whether or not it currently owns the mutex that
 threads calling pthread_cond_wait() or pthread_cond_timedwait() have
 associated with the condition variable during their waits; however,
 if predictable scheduling behavior is required, then that mutex shall
 be locked by the thread calling pthread_cond_broadcast() or
 pthread_cond_signal()."

Found by reading the code.
Compile tested only.

Fixes: da997ebec9 ("vulkan: Add KHR_display extension using DRM [v10]")

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7197>
(cherry picked from commit 4408131142)
2020-10-20 10:02:11 -07:00
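The pattern recommended by the quoted man page text is to change the predicate and broadcast while holding the mutex that the waiters use. The sketch below uses a hypothetical `event_state` struct and function names; it is not the wsi_display code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct event_state {
   pthread_mutex_t lock;
   pthread_cond_t cond;
   bool event_seen;
};

void
event_state_init(struct event_state *s)
{
   pthread_mutex_init(&s->lock, NULL);
   pthread_cond_init(&s->cond, NULL);
   s->event_seen = false;
}

/* Producer: update the predicate and wake waiters with the mutex held. */
void
event_post(struct event_state *s)
{
   pthread_mutex_lock(&s->lock);
   s->event_seen = true;
   pthread_cond_broadcast(&s->cond);
   pthread_mutex_unlock(&s->lock);
}

/* Consumer: re-check the predicate in a loop under the same mutex. */
void
event_wait(struct event_state *s)
{
   pthread_mutex_lock(&s->lock);
   while (!s->event_seen)
      pthread_cond_wait(&s->cond, &s->lock);
   pthread_mutex_unlock(&s->lock);
}

void
event_demo(void)
{
   struct event_state s;
   event_state_init(&s);
   event_post(&s);
   event_wait(&s); /* returns immediately: the event was already posted */
   assert(s.event_seen);
}
```

Because the waiter tests the predicate under the same lock the producer holds while setting it, a wakeup cannot slip in between the test and the call to `pthread_cond_wait`.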
Dylan Baker
cdcf4542bd .pick_status.json: Update to d0f8fe5909 2020-10-20 10:02:08 -07:00
Nanley Chery
1219eace69 isl: Fix the aux-map encoding for D24_UNORM_X8
Bspec: 53911 now defines the encoding for this format.

Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7198>
(cherry picked from commit 3c87ac1f60)
2020-10-19 09:53:35 -07:00
Marek Olšák
07e1df3cd1 Revert "radeonsi/gfx10: disable vertex grouping"
This reverts commit 42f921387b.

It causes GPU hangs on gfx10.3.

Fixes: a23802bcb9 - ac,radeonsi: start adding support for gfx10.3

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7172>
(cherry picked from commit 6810e6e4d0)
2020-10-19 09:53:34 -07:00
Nanley Chery
ae6b7d1e3e intel/isl: Drop redundant unpack of unorm channels
Fixes: 09ced65420 ("intel/isl: Add format conversion code")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7168>
(cherry picked from commit 5e27e04322)
2020-10-19 09:53:32 -07:00
Nanley Chery
c6396afbac st/mesa: Add missing sentinels in format_map[]
Cc: mesa-stable
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7169>
(cherry picked from commit cf11ebfbc2)
2020-10-19 09:53:27 -07:00
Dylan Baker
a4e81dd7a2 .pick_status.json: Update to 3c87ac1f60 2020-10-19 09:53:24 -07:00
Dylan Baker
00cde89303 .pick_status.json: Update to 7c5129985b 2020-10-16 09:40:02 -07:00
Rhys Perry
6f4d937c1f aco: add missing SCC clobber in get_buffer_size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: fcd6d83245 ("aco: fix imageSize()/textureSize() with large buffers on GFX8")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7162>
(cherry picked from commit fdb65b8b23)
2020-10-15 17:17:57 -07:00
Jose Maria Casanova Crespo
a6f351a666 vc4: Enable nir_lower_io for uniforms
Although the driver isn't expected to receive nir_var_uniform types
from GLSL, this currently happens for one of the internal driver shaders.

In vc4_get_yuv_fs in vc4_blit.c there is a "stride" nir_var_uniform
variable that needs to be lowered so the shader can be compiled.

This regression affected several piglit tests under
spec/ext_image_dma_buf_import and at least the MythTV application.

Fixes: 96d99f2ecc ("vc4: Only call nir_lower_io on shader_in/out")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3536
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Piotr Oniszczuk <piotr.oniszczuk@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7160>
(cherry picked from commit d91cb31a2a)
2020-10-15 17:17:56 -07:00
Jose Maria Casanova Crespo
3f1601a1e3 vc4: Add missing load_ubo set_align in yuv_blit fs.
Fixes: e78a7a1825 ("nir: Assert memory loads are aligned")
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Piotr Oniszczuk <piotr.oniszczuk@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7160>
(cherry picked from commit 4cfdd425b6)
2020-10-15 17:17:55 -07:00
Tony Wasserka
ca34f519ec aco/isel: Always export position data from VS/NGG
AMD ISA docs explicitly require this for VS, and this likely extends to
NGG too.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3615
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>
(cherry picked from commit bf51b11c04)
2020-10-15 17:17:54 -07:00
Rhys Perry
fa22dff663 nir/opt_load_store_vectorize: don't vectorize stores across demote
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: ce9205c03b ("nir: add a load/store vectorization pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7163>
(cherry picked from commit f8e971f511)
2020-10-15 17:17:52 -07:00
Dylan Baker
a3b0904eb9 .pick_status.json: Update to aea74eac3d 2020-10-15 17:17:50 -07:00
Dylan Baker
9b7dfc0a61 .pick_status.json: Update to f29c81f863 2020-10-14 10:41:09 -07:00
Dylan Baker
f0498ea8f5 docs: add SHA256 sums for 20.2.1 2020-10-14 10:33:42 -07:00
39 changed files with 6267 additions and 99 deletions

View File

@@ -1 +1 @@
-20.2.1
+20.2.2

View File

@@ -19,7 +19,7 @@ SHA256 checksum
 ::
-TBD.
+d1a46d9a3f291bc0e0374600bdcb59844fa3eafaa50398e472a36fc65fd0244a mesa-20.2.1.tar.xz
 New features

docs/relnotes/20.2.2.rst Normal file
View File

@@ -0,0 +1,147 @@
Mesa 20.2.2 Release Notes / 2020-11-06
======================================
Mesa 20.2.2 is a bug fix release which fixes bugs found since the 20.2.1 release.
Mesa 20.2.2 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 20.2.2 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA256 checksum
---------------
::
TBD.
New features
------------
- None
Bug fixes
---------
- anv: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.3d* failures
- radv/aco: Vertex explosion on RPCS3
- Gnome 3.38 with Xwayland has screen corruption for X11 apps.
- RADV: Death Stranding glitchy sky rendering
- Crash in glDrawArrays on Intel iris
- deinterlace_vaapi=rate=field does not double output's actual frame rate on AMD
- Steam game Haydee leans on implementation-dependent behavior
- vc4 in 20.2-rc has regression causing app to crash
- [RADV/ACO] Star Citizen Lighting/Shadow Issue
Changes
-------
Bas Nieuwenhuizen (3):
- radv: Fix 1D compressed mipmaps on GFX9.
- radv: Do not access set layout during vkCmdBindDescriptorSets.
- radv: Fix variable name collision.
Dave Airlie (1):
- gallivm: zero init the temporary register storage.
Dylan Baker (9):
- docs: add SHA256 sums for 20.2.1
- .pick_status.json: Update to f29c81f863c9879a6a87724cbdae1e1818f3f6b4
- .pick_status.json: Update to aea74eac3d7706ed8d870504b163356e3f104a4c
- .pick_status.json: Update to 7c5129985bcac75053823a31674e8a1e2629230c
- .pick_status.json: Update to 3c87ac1f60875b5bbd4facca22fc426ee747997a
- .pick_status.json: Update to d0f8fe5909107aa342f62813ced9ce535ed6da32
- .pick_status.json: Update to 025050bae73d0598d788e3c307328670a3bf51c1
- .pick_status.json: Update to b92eadb29cc8ef09096d9196434d49e35a3eccaf
- .pick_status.json: Update to 8077f3f4c4a3d8007caa30eed93fed1c6bbf3c5a
Jose Maria Casanova Crespo (2):
- vc4: Add missing load_ubo set_align in yuv_blit fs.
- vc4: Enable nir_lower_io for uniforms
Lionel Landwerlin (3):
- intel/dev: Bump Max EU per subslice/dualsubslice
- anv: fix source/destination layers for 3D blits
- blorp: allow blits with floating point source layers
Lucas Stach (2):
- etnaviv: drm: fix BO refcount race
- etnaviv: blt: properly program surface TS offset for clears
Marcin Ślusarz (2):
- vulkan/wsi: fix possible random stalls in wsi_display_wait_for_event
- intel/tools: fix invalid type in argument to printf
Marek Olšák (2):
- Revert "radeonsi/gfx10: disable vertex grouping"
- winsys/amdgpu: remove incorrect assertion check against max_check_space_size
Michael Tretter (1):
- etnaviv: free tgsi tokens when shader state is deleted
Michel Dänzer (3):
- loader/dri3: Only allocate additional buffers if needed
- loader/dri3: Keep current number of back buffers if frame was skipped
- loader/dri3: Allocate up to 4 back buffers for page flips
Nanley Chery (3):
- st/mesa: Add missing sentinels in format_map[]
- intel/isl: Drop redundant unpack of unorm channels
- isl: Fix the aux-map encoding for D24_UNORM_X8
Rhys Perry (4):
- nir/opt_load_store_vectorize: don't vectorize stores across demote
- aco: add missing SCC clobber in get_buffer_size
- aco: update phi_map in add_subdword_operand()
- aco: ignore the ACO-inserted continue in create_continue_phis()
Rob Clark (1):
- freedreno: Disallow tiled if SHARED and not QCOM_COMPRESSED
Ryan Neph (1):
- virgl: Fixes portal2 binary name in tweak config
Samuel Pitoiset (1):
- aco: fix determining if LOD is zero for nir_texop_txf/nir_texop_txs
Tapani Pälli (2):
- gallivm/nir: handle nir_op_flt in lp_build_nir_llvm
- iris: fix the order of src and dst for fence memcpy
Thong Thai (1):
- frontends/va/postproc: Un-break field flag
Timothy Arceri (1):
- glsl: relax rule on varying matching for shaders older than 4.00
Tony Wasserka (1):
- aco/isel: Always export position data from VS/NGG

View File

@@ -6194,7 +6194,7 @@ void get_buffer_size(isel_context *ctx, Temp desc, Temp dst, bool in_elements)
 Temp size = emit_extract_vector(ctx, desc, 2, s1);
 Temp size_div3 = bld.vop3(aco_opcode::v_mul_hi_u32, bld.def(v1), bld.copy(bld.def(v1), Operand(0xaaaaaaabu)), size);
-size_div3 = bld.sop2(aco_opcode::s_lshr_b32, bld.def(s1), bld.as_uniform(size_div3), Operand(1u));
+size_div3 = bld.sop2(aco_opcode::s_lshr_b32, bld.def(s1), bld.def(s1, scc), bld.as_uniform(size_div3), Operand(1u));
 Temp stride = emit_extract_vector(ctx, desc, 1, s1);
 stride = bld.sop2(aco_opcode::s_bfe_u32, bld.def(s1), bld.def(s1, scc), stride, Operand((5u << 16) | 16u));
@@ -8514,9 +8514,7 @@ void visit_tex(isel_context *ctx, nir_tex_instr *instr)
 has_bias = true;
 break;
 case nir_tex_src_lod: {
-nir_const_value *val = nir_src_as_const_value(instr->src[i].src);
-if (val && val->f32 <= 0.0) {
+if (nir_src_is_const(instr->src[i].src) && nir_src_as_uint(instr->src[i].src) == 0) {
 level_zero = true;
 } else {
 lod = get_ssa_temp(ctx, instr->src[i].src.ssa);
@@ -9433,7 +9431,7 @@ static Operand create_continue_phis(isel_context *ctx, unsigned first, unsigned
 continue;
 }
-if (block.kind & block_kind_continue) {
+if ((block.kind & block_kind_continue) && block.index != last) {
 vals[idx - first] = header_phi->operands[next_pred];
 next_pred++;
 continue;
@@ -10083,6 +10081,11 @@ static void create_vs_exports(isel_context *ctx)
 ctx->outputs.temps[VARYING_SLOT_LAYER * 4u] = as_vgpr(ctx, get_arg(ctx, ctx->args->ac.view_index));
 }
+/* Hardware requires position data to always be exported, even if the
+ * application did not write gl_Position.
+ */
+ctx->outputs.mask[VARYING_SLOT_POS] = 0xf;
 /* the order these position exports are created is important */
 int next_pos = 0;
 bool exported_pos = export_vs_varying(ctx, VARYING_SLOT_POS, true, &next_pos);

View File

@@ -38,8 +38,10 @@
 namespace aco {
 namespace {
+struct ra_ctx;
 unsigned get_subdword_operand_stride(chip_class chip, const aco_ptr<Instruction>& instr, unsigned idx, RegClass rc);
-void add_subdword_operand(chip_class chip, aco_ptr<Instruction>& instr, unsigned idx, unsigned byte, RegClass rc);
+void add_subdword_operand(ra_ctx& ctx, aco_ptr<Instruction>& instr, unsigned idx, unsigned byte, RegClass rc);
 std::pair<unsigned, unsigned> get_subdword_definition_info(Program *program, const aco_ptr<Instruction>& instr, RegClass rc);
 void add_subdword_definition(Program *program, aco_ptr<Instruction>& instr, unsigned idx, PhysReg reg, bool is_partial);
@@ -352,8 +354,22 @@ unsigned get_subdword_operand_stride(chip_class chip, const aco_ptr<Instruction>
 return 4;
 }
-void add_subdword_operand(chip_class chip, aco_ptr<Instruction>& instr, unsigned idx, unsigned byte, RegClass rc)
+void update_phi_map(ra_ctx& ctx, Instruction *old, Instruction *instr)
+{
+for (Operand& op : instr->operands) {
+if (!op.isTemp())
+continue;
+std::unordered_map<unsigned, phi_info>::iterator phi = ctx.phi_map.find(op.tempId());
+if (phi != ctx.phi_map.end()) {
+phi->second.uses.erase(old);
+phi->second.uses.emplace(instr);
+}
+}
+}
+void add_subdword_operand(ra_ctx& ctx, aco_ptr<Instruction>& instr, unsigned idx, unsigned byte, RegClass rc)
 {
+chip_class chip = ctx.program->chip_class;
 if (instr->format == Format::PSEUDO || byte == 0)
 return;
@@ -376,7 +392,9 @@ void add_subdword_operand(chip_class chip, aco_ptr<Instruction>& instr, unsigned
 }
 return;
 } else if (can_use_SDWA(chip, instr)) {
-convert_to_SDWA(chip, instr);
+aco_ptr<Instruction> tmp = convert_to_SDWA(chip, instr);
+if (tmp)
+update_phi_map(ctx, tmp.get(), instr.get());
 return;
 } else if (rc.bytes() == 2 && can_use_opsel(chip, instr->opcode, idx, byte / 2)) {
 VOP3A_instruction *vop3 = static_cast<VOP3A_instruction *>(instr.get());
@@ -2233,7 +2251,7 @@ void register_allocation(Program *program, std::vector<TempSet>& live_out_per_bl
 if (op.isTemp() && op.isFirstKill() && op.isLateKill())
 register_file.clear(op);
 if (op.isTemp() && op.physReg().byte() != 0)
-add_subdword_operand(program->chip_class, instr, i, op.physReg().byte(), op.regClass());
+add_subdword_operand(ctx, instr, i, op.physReg().byte(), op.regClass());
 }
 /* emit parallelcopy */
@@ -2366,19 +2384,9 @@ void register_allocation(Program *program, std::vector<TempSet>& live_out_per_bl
 aco_ptr<Instruction> tmp = std::move(instr);
 Format format = asVOP3(tmp->format);
 instr.reset(create_instruction<VOP3A_instruction>(tmp->opcode, format, tmp->operands.size(), tmp->definitions.size()));
-for (unsigned i = 0; i < instr->operands.size(); i++) {
-Operand& operand = tmp->operands[i];
-instr->operands[i] = operand;
-/* keep phi_map up to date */
-if (operand.isTemp()) {
-std::unordered_map<unsigned, phi_info>::iterator phi = ctx.phi_map.find(operand.tempId());
-if (phi != ctx.phi_map.end()) {
-phi->second.uses.erase(tmp.get());
-phi->second.uses.emplace(instr.get());
-}
-}
-}
+std::copy(tmp->operands.begin(), tmp->operands.end(), instr->operands.begin());
 std::copy(tmp->definitions.begin(), tmp->definitions.end(), instr->definitions.begin());
+update_phi_map(ctx, tmp.get(), instr.get());
 }
 instructions.emplace_back(std::move(*it));

View File

@@ -3842,7 +3842,6 @@ radv_bind_descriptor_set(struct radv_cmd_buffer *cmd_buffer,
 radv_set_descriptor_set(cmd_buffer, bind_point, set, idx);
 assert(set);
-assert(!(set->layout->flags & VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR));
 if (!cmd_buffer->device->use_global_bo_list) {
 for (unsigned j = 0; j < set->buffer_count; ++j)
@@ -3873,17 +3872,17 @@ void radv_CmdBindDescriptorSets(
 radv_get_descriptors_state(cmd_buffer, pipelineBindPoint);
 for (unsigned i = 0; i < descriptorSetCount; ++i) {
-unsigned idx = i + firstSet;
+unsigned set_idx = i + firstSet;
 RADV_FROM_HANDLE(radv_descriptor_set, set, pDescriptorSets[i]);
 /* If the set is already bound we only need to update the
 * (potentially changed) dynamic offsets. */
-if (descriptors_state->sets[idx] != set ||
-!(descriptors_state->valid & (1u << idx))) {
-radv_bind_descriptor_set(cmd_buffer, pipelineBindPoint, set, idx);
+if (descriptors_state->sets[set_idx] != set ||
+!(descriptors_state->valid & (1u << set_idx))) {
+radv_bind_descriptor_set(cmd_buffer, pipelineBindPoint, set, set_idx);
 }
-for(unsigned j = 0; j < set->layout->dynamic_offset_count; ++j, ++dyn_idx) {
+for(unsigned j = 0; j < layout->set[set_idx].dynamic_offset_count; ++j, ++dyn_idx) {
 unsigned idx = j + layout->set[i + firstSet].dynamic_offset_start;
 uint32_t *dst = descriptors_state->dynamic_buffers + idx * 4;
 assert(dyn_idx < dynamicOffsetCount);
@@ -3912,8 +3911,7 @@ void radv_CmdBindDescriptorSets(
 }
 }
-cmd_buffer->push_constant_stages |=
-set->layout->dynamic_shader_stages;
+cmd_buffer->push_constant_stages |= layout->set[set_idx].dynamic_offset_stages;
 }
 }
 }

@@ -432,10 +432,16 @@ VkResult radv_CreatePipelineLayout(
layout->set[set].layout = set_layout;
layout->set[set].dynamic_offset_start = dynamic_offset_count;
layout->set[set].dynamic_offset_count = 0;
layout->set[set].dynamic_offset_stages = 0;
for (uint32_t b = 0; b < set_layout->binding_count; b++) {
dynamic_offset_count += set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
dynamic_shader_stages |= set_layout->dynamic_shader_stages;
layout->set[set].dynamic_offset_count +=
set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
layout->set[set].dynamic_offset_stages |= set_layout->dynamic_shader_stages;
}
dynamic_offset_count += layout->set[set].dynamic_offset_count;
dynamic_shader_stages |= layout->set[set].dynamic_offset_stages;
_mesa_sha1_update(&ctx, set_layout, set_layout->layout_size);
}

@@ -89,7 +89,9 @@ struct radv_pipeline_layout {
struct {
struct radv_descriptor_set_layout *layout;
uint32_t size;
uint32_t dynamic_offset_start;
uint16_t dynamic_offset_start;
uint16_t dynamic_offset_count;
VkShaderStageFlags dynamic_offset_stages;
} set[MAX_SETS];
uint32_t num_sets;

@@ -1597,6 +1597,11 @@ radv_image_view_init(struct radv_image_view *iview,
iview->aspect_mask = pCreateInfo->subresourceRange.aspectMask;
iview->multiple_planes = vk_format_get_plane_count(image->vk_format) > 1 && iview->aspect_mask == VK_IMAGE_ASPECT_COLOR_BIT;
iview->base_layer = range->baseArrayLayer;
iview->layer_count = radv_get_layerCount(image, range);
iview->base_mip = range->baseMipLevel;
iview->level_count = radv_get_levelCount(image, range);
iview->vk_format = pCreateInfo->format;
/* If the image has an Android external format, pCreateInfo->format will be
@@ -1652,21 +1657,43 @@ radv_image_view_init(struct radv_image_view *iview,
*
* This means that mip2 will be missing texels.
*
* Fix it by taking the actual extent addrlib assigned to the base mip level.
* Fix this by calculating the base mip's width and height, then convert
* that, and round it back up to get the level 0 size. Clamp the
* converted size between the original values, and the physical extent
* of the base mipmap.
*
* On GFX10 we have to take care to not go over the physical extent
* of the base mipmap as otherwise the GPU computes a different layout.
* Note that the GPU does use the same base-mip dimensions for both a
* block compatible format and the compressed format, so even if we take
* the plain converted dimensions the physical layout is correct.
*/
if (device->physical_device->rad_info.chip_class >= GFX9 &&
vk_format_is_compressed(image->vk_format) &&
!vk_format_is_compressed(iview->vk_format) &&
iview->image->info.levels > 1) {
iview->extent.width = iview->image->planes[0].surface.u.gfx9.base_mip_width;
iview->extent.height = iview->image->planes[0].surface.u.gfx9.base_mip_height;
}
}
vk_format_is_compressed(image->vk_format) &&
!vk_format_is_compressed(iview->vk_format)) {
/* If we have multiple levels in the view we should ideally take the last level,
* but the mip calculation has a max(..., 1) so walking back to the base mip in a
* useful way is hard. */
if (iview->level_count > 1) {
iview->extent.width = iview->image->planes[0].surface.u.gfx9.base_mip_width;
iview->extent.height = iview->image->planes[0].surface.u.gfx9.base_mip_height;
} else {
unsigned lvl_width = radv_minify(image->info.width , range->baseMipLevel);
unsigned lvl_height = radv_minify(image->info.height, range->baseMipLevel);
iview->base_layer = range->baseArrayLayer;
iview->layer_count = radv_get_layerCount(image, range);
iview->base_mip = range->baseMipLevel;
iview->level_count = radv_get_levelCount(image, range);
lvl_width = round_up_u32(lvl_width * view_bw, img_bw);
lvl_height = round_up_u32(lvl_height * view_bh, img_bh);
lvl_width <<= range->baseMipLevel;
lvl_height <<= range->baseMipLevel;
iview->extent.width = CLAMP(lvl_width, iview->extent.width,
iview->image->planes[0].surface.u.gfx9.base_mip_width);
iview->extent.height = CLAMP(lvl_height, iview->extent.height,
iview->image->planes[0].surface.u.gfx9.base_mip_height);
}
}
}
bool disable_compression = extra_create_info ? extra_create_info->disable_compression: false;
for (unsigned i = 0; i < (iview->multiple_planes ? vk_format_get_plane_count(image->vk_format) : 1); ++i) {
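
The single-level branch above can be read as the following arithmetic. This is a sketch under assumptions: all names are hypothetical, `div_round_up` stands in for `round_up_u32` (assumed here to be a rounding-up division), and only the width is shown since the height follows the same steps.

```c
#include <stdint.h>

/* Size of a mip level, never shrinking a non-zero size below 1. */
static uint32_t minify(uint32_t n, uint32_t level)
{
   uint32_t v = n >> level;
   return (n != 0 && v == 0) ? 1 : v;
}

static uint32_t div_round_up(uint32_t v, uint32_t a)
{
   return (v + a - 1) / a;
}

static uint32_t clamp_u32(uint32_t x, uint32_t lo, uint32_t hi)
{
   return x < lo ? lo : (x > hi ? hi : x);
}

/* Level 0 width, in view texels, to program for a view of exactly one level.
 * converted_width is the block-converted level 0 width and physical_width
 * is the base mip width the surface was actually allocated with. */
static uint32_t
level0_width_sketch(uint32_t image_width, uint32_t base_level,
                    uint32_t view_bw, uint32_t img_bw,
                    uint32_t converted_width, uint32_t physical_width)
{
   uint32_t lvl = minify(image_width, base_level); /* texels in the level  */
   lvl = div_round_up(lvl * view_bw, img_bw);      /* convert to view units */
   lvl <<= base_level;                             /* walk back to level 0  */
   return clamp_u32(lvl, converted_width, physical_width);
}
```

As a worked case, take a 10 texel wide image with 4 texel wide blocks viewed at level 1 through an uncompressed format. The converted level 0 width is 3, and minifying 3 once gives 1, yet level 1 holds 5 texels, which is 2 blocks. With a physical base width of 4 the sketch returns 4, and minifying that once gives the required 2.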

@@ -875,10 +875,40 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
/* Check for input vars with unmatched output vars in prev stage
* taking into account that interface blocks could have a matching
* output but with different name, so we ignore them.
*
* Section 4.3.4 (Inputs) of the GLSL 4.10 specification says:
*
* "Only the input variables that are actually read need to be
* written by the previous stage; it is allowed to have
* superfluous declarations of input variables."
*
* However it's not defined anywhere as to how we should handle
* inputs that are not written in the previous stage and it's not
* clear what "actually read" means.
*
* The GLSL 4.20 spec however is much clearer:
*
* "Only the input variables that are statically read need to
* be written by the previous stage; it is allowed to have
* superfluous declarations of input variables."
*
* It also has a table that states it is an error to statically
* read an input that is not defined in the previous stage. While
* it is not an error to not statically write to the output (it
* just needs to be defined to not be an error).
*
* The text in the GLSL 4.20 spec was an attempt to clarify the
* previous spec iterations. However given the difference in spec
* and that some applications seem to depend on not erroring when
* the input is not actually read in control flow we only apply
* this rule to GLSL 4.00 and higher. GLSL 4.00 was chosen as
* a 3.30 shader is the highest version of GLSL we have seen in
* the wild dependent on the less strict interpretation.
*/
assert(!input->data.assigned);
if (input->data.used && !input->get_interface_type() &&
!input->data.explicit_location)
!input->data.explicit_location &&
(prog->data->Version >= (prog->IsES ? 0 : 400)))
linker_error(prog,
"%s shader input `%s' "
"has no matching output in the previous stage\n",

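The version gate added to the condition above reduces to a small predicate. The function name below is hypothetical; the sketch only restates the `prog->data->Version >= (prog->IsES ? 0 : 400)` test, under which ES shaders are always checked and desktop shaders are checked from GLSL 4.00 onward.

```c
#include <stdbool.h>

/* Returns true when an unmatched, statically used input should be treated
 * as a link error under the rule in this hunk. */
static bool
unmatched_input_is_error_sketch(unsigned glsl_version, bool is_es)
{
   /* Any unsigned version is >= 0, so ES shaders always take the strict path. */
   return glsl_version >= (is_es ? 0u : 400u);
}
```

A desktop 3.30 shader therefore keeps the lenient behaviour, while a 4.00 or 4.20 shader gets the error.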
@@ -1219,6 +1219,11 @@ handle_barrier(struct vectorize_ctx *ctx, bool *progress, nir_function_impl *imp
case nir_intrinsic_discard:
modes = nir_var_all;
break;
case nir_intrinsic_demote_if:
case nir_intrinsic_demote:
acquire = false;
modes = nir_var_all;
break;
case nir_intrinsic_memory_barrier_buffer:
modes = nir_var_mem_ssbo | nir_var_mem_global;
break;

@@ -355,6 +355,8 @@ nir_schedule_intrinsic_deps(nir_deps_state *state,
case nir_intrinsic_discard:
case nir_intrinsic_discard_if:
case nir_intrinsic_demote:
case nir_intrinsic_demote_if:
/* We are adding two dependencies:
*
* * An individual one that we could use to add a read_dep while handling

@@ -257,11 +257,15 @@ void etna_bo_del(struct etna_bo *bo)
struct etna_device *dev = bo->dev;
if (!p_atomic_dec_zero(&bo->refcnt))
return;
pthread_mutex_lock(&etna_drm_table_lock);
/* Must test under table lock to avoid racing with the from_dmabuf/name
* paths, which rely on the BO refcount to be stable over the lookup, so
* they can grab a reference when the BO is found in the hash.
*/
if (!p_atomic_dec_zero(&bo->refcnt))
goto out;
if (bo->reuse && (etna_bo_cache_free(&dev->bo_cache, bo) == 0))
goto out;
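
The locking order this hunk establishes can be sketched as follows. The types and functions are hypothetical and much simpler than etnaviv's: a plain counter guarded entirely by the lock replaces the atomic refcount, and a flag replaces the hash table. The point illustrated is that the final decrement and the lookup both run under the same table lock, so a lookup can never hand out an object whose count has already reached zero.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct bo_sketch {
   int refcnt;
   bool in_table;
};

static pthread_mutex_t table_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Lookup path: takes a reference only while holding the table lock. */
static struct bo_sketch *lookup_and_ref(struct bo_sketch *candidate)
{
   struct bo_sketch *found = NULL;
   pthread_mutex_lock(&table_lock_sketch);
   if (candidate->in_table) {
      candidate->refcnt++;
      found = candidate;
   }
   pthread_mutex_unlock(&table_lock_sketch);
   return found;
}

/* Release path: decrement and table removal happen under the same lock.
 * Returns true when the last reference was dropped. */
static bool unref(struct bo_sketch *bo)
{
   bool destroyed = false;
   pthread_mutex_lock(&table_lock_sketch);
   if (--bo->refcnt == 0) {
      bo->in_table = false;
      destroyed = true;
   }
   pthread_mutex_unlock(&table_lock_sketch);
   return destroyed;
}
```

With the decrement outside the lock, as in the removed lines, another thread could find the object in the table between the decrement reaching zero and the removal.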

@@ -555,6 +555,7 @@ static LLVMValueRef do_alu_action(struct lp_build_nir_context *bld_base,
case nir_op_flog2:
result = lp_build_log2_safe(&bld_base->base, src[0]);
break;
case nir_op_flt:
case nir_op_flt32:
result = fcmp32(bld_base, PIPE_FUNC_LESS, src_bit_size[0], src);
break;
@@ -1975,8 +1976,8 @@ bool lp_build_nir_llvm(
nir_foreach_register(reg, &func->impl->registers) {
LLVMTypeRef type = get_register_type(bld_base, reg);
LLVMValueRef reg_alloc = lp_build_alloca_undef(bld_base->base.gallivm,
type, "reg");
LLVMValueRef reg_alloc = lp_build_alloca(bld_base->base.gallivm,
type, "reg");
_mesa_hash_table_insert(bld_base->regs, reg, reg_alloc);
}
nir_index_ssa_defs(func->impl);

@@ -229,7 +229,7 @@ etna_blit_clear_color_blt(struct pipe_context *pctx, struct pipe_surface *dst,
if (surf->surf.ts_size) {
clr.dest.use_ts = 1;
clr.dest.ts_addr.bo = res->ts_bo;
clr.dest.ts_addr.offset = 0;
clr.dest.ts_addr.offset = surf->level->ts_offset;
clr.dest.ts_addr.flags = ETNA_RELOC_WRITE;
clr.dest.ts_clear_value[0] = new_clear_value;
clr.dest.ts_clear_value[1] = new_clear_value >> 32;
@@ -308,7 +308,7 @@ etna_blit_clear_zs_blt(struct pipe_context *pctx, struct pipe_surface *dst,
if (surf->surf.ts_size) {
clr.dest.use_ts = 1;
clr.dest.ts_addr.bo = res->ts_bo;
clr.dest.ts_addr.offset = 0;
clr.dest.ts_addr.offset = surf->level->ts_offset;
clr.dest.ts_addr.flags = ETNA_RELOC_WRITE;
clr.dest.ts_clear_value[0] = surf->level->clear_value;
clr.dest.ts_clear_value[1] = surf->level->clear_value;

@@ -445,6 +445,7 @@ etna_delete_shader_state(struct pipe_context *pctx, void *ss)
etna_destroy_shader(t);
}
tgsi_free_tokens(shader->tokens);
ralloc_free(shader->nir);
FREE(shader);
}

@@ -933,8 +933,12 @@ fd_resource_create_with_modifiers(struct pipe_screen *pscreen,
* should.)
*/
bool allow_ubwc = drm_find_modifier(DRM_FORMAT_MOD_INVALID, modifiers, count);
if (tmpl->bind & PIPE_BIND_SHARED)
if (tmpl->bind & PIPE_BIND_SHARED) {
allow_ubwc = drm_find_modifier(DRM_FORMAT_MOD_QCOM_COMPRESSED, modifiers, count);
if (!allow_ubwc) {
linear = true;
}
}
allow_ubwc &= !(fd_mesa_debug & FD_DBG_NOUBWC);

@@ -154,7 +154,7 @@ clear_stale_syncobjs(struct iris_batch *batch)
if (syncobj != nth_syncobj) {
*syncobj = *nth_syncobj;
memcpy(nth_fence, fence, sizeof(*fence));
memcpy(fence, nth_fence, sizeof(*fence));
}
}
}
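
The corrected `memcpy` direction matches the usual unordered removal pattern, where the last live entry overwrites the entry being dropped. A minimal sketch, with a hypothetical `fence_sketch` type and helper rather than the iris structures:

```c
#include <string.h>

struct fence_sketch {
   unsigned handle;
   unsigned flags;
};

/* Removes entry i from an unordered array by moving the last entry into
 * its slot. Returns the new element count. */
static unsigned
remove_unordered_sketch(struct fence_sketch *fences, unsigned count, unsigned i)
{
   struct fence_sketch *victim = &fences[i];
   struct fence_sketch *last = &fences[count - 1];

   if (victim != last)
      memcpy(victim, last, sizeof(*victim)); /* destination first, then source */

   return count - 1;
}
```

With the arguments reversed, the stale entry would be copied over the live last entry, which is then cut off when the count shrinks.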

@@ -668,23 +668,25 @@ static void gfx10_emit_ge_cntl(struct si_context *sctx, unsigned num_patches)
if (sctx->ngg) {
if (sctx->tes_shader.cso) {
ge_cntl = S_03096C_PRIM_GRP_SIZE(num_patches) |
S_03096C_VERT_GRP_SIZE(256) | /* 256 = disable vertex grouping */
S_03096C_VERT_GRP_SIZE(0) |
S_03096C_BREAK_WAVE_AT_EOI(key.u.tess_uses_prim_id);
} else {
ge_cntl = si_get_vs_state(sctx)->ge_cntl;
}
} else {
unsigned primgroup_size;
unsigned vertgroup_size = 256; /* 256 = disable vertex grouping */
;
unsigned vertgroup_size;
if (sctx->tes_shader.cso) {
primgroup_size = num_patches; /* must be a multiple of NUM_PATCHES */
vertgroup_size = 0;
} else if (sctx->gs_shader.cso) {
unsigned vgt_gs_onchip_cntl = sctx->gs_shader.current->ctx_reg.gs.vgt_gs_onchip_cntl;
primgroup_size = G_028A44_GS_PRIMS_PER_SUBGRP(vgt_gs_onchip_cntl);
vertgroup_size = G_028A44_ES_VERTS_PER_SUBGRP(vgt_gs_onchip_cntl);
} else {
primgroup_size = 128; /* recommended without a GS and tess */
vertgroup_size = 0;
}
ge_cntl = S_03096C_PRIM_GRP_SIZE(primgroup_size) | S_03096C_VERT_GRP_SIZE(vertgroup_size) |

@@ -1243,7 +1243,7 @@ static void gfx10_shader_ngg(struct si_screen *sscreen, struct si_shader *shader
S_03096C_VERT_GRP_SIZE(shader->ngg.max_gsprims + 2);
} else {
shader->ge_cntl = S_03096C_PRIM_GRP_SIZE(shader->ngg.max_gsprims) |
S_03096C_VERT_GRP_SIZE(256) | /* 256 = disable vertex grouping */
S_03096C_VERT_GRP_SIZE(shader->ngg.hw_max_esverts) |
S_03096C_BREAK_WAVE_AT_EOI(break_wave_at_eoi);
/* Bug workaround for a possible hang with non-tessellation cases.

@@ -299,6 +299,7 @@ static void *vc4_get_yuv_fs(struct pipe_context *pctx, int cpp)
nir_ssa_dest_init(&load->instr, &load->dest, load->num_components, 32, NULL);
load->src[0] = nir_src_for_ssa(one);
load->src[1] = nir_src_for_ssa(nir_iadd(&b, x_offset, y_offset));
nir_intrinsic_set_align(load, 4, 0);
nir_builder_instr_insert(&b, &load->instr);
nir_store_var(&b, color_out,

@@ -2472,7 +2472,8 @@ vc4_shader_state_create(struct pipe_context *pctx,
if (s->info.stage == MESA_SHADER_VERTEX)
NIR_PASS_V(s, nir_lower_point_size, 1.0f, 0.0f);
NIR_PASS_V(s, nir_lower_io, nir_var_shader_in | nir_var_shader_out,
NIR_PASS_V(s, nir_lower_io,
nir_var_shader_in | nir_var_shader_out | nir_var_uniform,
type_size, (nir_lower_io_options)0);
NIR_PASS_V(s, nir_lower_regs_to_ssa);

@@ -321,7 +321,7 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver *drv, vlVaContext *contex
VAProcFilterParameterBufferDeinterlacing *deint = buf->data;
switch (deint->algorithm) {
case VAProcDeinterlacingBob:
if (deint->flags & VA_DEINTERLACING_BOTTOM_FIELD_FIRST)
if (deint->flags & VA_DEINTERLACING_BOTTOM_FIELD)
deinterlace = VL_COMPOSITOR_BOB_BOTTOM;
else
deinterlace = VL_COMPOSITOR_BOB_TOP;
@@ -333,7 +333,7 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver *drv, vlVaContext *contex
case VAProcDeinterlacingMotionAdaptive:
src = vlVaApplyDeint(drv, context, param, src,
!!(deint->flags & VA_DEINTERLACING_BOTTOM_FIELD_FIRST));
!!(deint->flags & VA_DEINTERLACING_BOTTOM_FIELD));
break;
default:

@@ -810,7 +810,6 @@ static bool amdgpu_get_new_ib(struct amdgpu_winsys *ws, struct amdgpu_cs *cs,
ib_size = ib->big_ib_buffer->size - ib->used_ib_space;
ib->base.current.max_dw = ib_size / 4 - amdgpu_cs_epilog_dws(cs);
assert(ib->base.current.max_dw >= ib->max_check_space_size / 4);
ib->base.gpu_address = info->va_start;
return true;
}
@@ -1178,7 +1177,6 @@ static bool amdgpu_cs_check_space(struct radeon_cmdbuf *rcs, unsigned dw,
ib->base.current.buf = (uint32_t*)(ib->ib_mapped + ib->used_ib_space);
ib->base.current.max_dw = ib->big_ib_buffer->size / 4 - cs_epilog_dw;
assert(ib->base.current.max_dw >= ib->max_check_space_size / 4);
ib->base.gpu_address = va;
amdgpu_cs_add_buffer(&cs->main.base, ib->big_ib_buffer,

@@ -63,7 +63,7 @@ void
brw_blorp_surface_info_init(struct blorp_context *blorp,
struct brw_blorp_surface_info *info,
const struct blorp_surf *surf,
unsigned int level, unsigned int layer,
unsigned int level, float layer,
enum isl_format format, bool is_render_target)
{
memset(info, 0, sizeof(*info));

@@ -133,7 +133,7 @@ enum blorp_filter {
void
blorp_blit(struct blorp_batch *batch,
const struct blorp_surf *src_surf,
unsigned src_level, unsigned src_layer,
unsigned src_level, float src_layer,
enum isl_format src_format, struct isl_swizzle src_swizzle,
const struct blorp_surf *dst_surf,
unsigned dst_level, unsigned dst_layer,

@@ -56,7 +56,7 @@ brw_blorp_blit_vars_init(nir_builder *b, struct brw_blorp_blit_vars *v,
LOAD_INPUT(discard_rect, glsl_vec4_type())
LOAD_INPUT(rect_grid, glsl_vec4_type())
LOAD_INPUT(coord_transform, glsl_vec4_type())
LOAD_INPUT(src_z, glsl_uint_type())
LOAD_INPUT(src_z, glsl_float_type())
LOAD_INPUT(src_offset, glsl_vector_type(GLSL_TYPE_UINT, 2))
LOAD_INPUT(dst_offset, glsl_vector_type(GLSL_TYPE_UINT, 2))
LOAD_INPUT(src_inv_size, glsl_vector_type(GLSL_TYPE_FLOAT, 2))
@@ -154,8 +154,13 @@ blorp_create_nir_tex_instr(nir_builder *b, struct brw_blorp_blit_vars *v,
* more explicit in the future.
*/
assert(pos->num_components >= 2);
pos = nir_vec3(b, nir_channel(b, pos, 0), nir_channel(b, pos, 1),
nir_load_var(b, v->v_src_z));
if (op == nir_texop_txf || op == nir_texop_txf_ms || op == nir_texop_txf_ms_mcs) {
pos = nir_vec3(b, nir_channel(b, pos, 0), nir_channel(b, pos, 1),
nir_f2i32(b, nir_load_var(b, v->v_src_z)));
} else {
pos = nir_vec3(b, nir_channel(b, pos, 0), nir_channel(b, pos, 1),
nir_load_var(b, v->v_src_z));
}
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(pos);
@@ -2319,7 +2324,7 @@ do_blorp_blit(struct blorp_batch *batch,
void
blorp_blit(struct blorp_batch *batch,
const struct blorp_surf *src_surf,
unsigned src_level, unsigned src_layer,
unsigned src_level, float src_layer,
enum isl_format src_format, struct isl_swizzle src_swizzle,
const struct blorp_surf *dst_surf,
unsigned dst_level, unsigned dst_layer,

@@ -61,7 +61,7 @@ struct brw_blorp_surface_info
struct isl_view view;
/* Z offset into a 3-D texture or slice of a 2-D array texture. */
uint32_t z_offset;
float z_offset;
uint32_t tile_x_sa, tile_y_sa;
};
@@ -70,7 +70,7 @@ void
brw_blorp_surface_info_init(struct blorp_context *blorp,
struct brw_blorp_surface_info *info,
const struct blorp_surf *surf,
unsigned int level, unsigned int layer,
unsigned int level, float layer,
enum isl_format format, bool is_render_target);
void
blorp_surf_convert_to_single_slice(const struct isl_device *isl_dev,
@@ -148,7 +148,7 @@ struct brw_blorp_wm_inputs
/* Minimum layer setting works for all the textures types but texture_3d
* for which the setting has no effect. Use the z-coordinate instead.
*/
uint32_t src_z;
float src_z;
/* Pad out to an integral number of registers */
uint32_t pad[1];

@@ -38,7 +38,7 @@ struct drm_i915_query_topology_info;
#define GEN_DEVICE_MAX_SLICES (6) /* Maximum on gen10 */
#define GEN_DEVICE_MAX_SUBSLICES (8) /* Maximum on gen11 */
#define GEN_DEVICE_MAX_EUS_PER_SUBSLICE (10) /* Maximum on Haswell */
#define GEN_DEVICE_MAX_EUS_PER_SUBSLICE (16) /* Maximum on gen12 */
#define GEN_DEVICE_MAX_PIXEL_PIPES (2) /* Maximum on gen11 */
/**

@@ -2969,7 +2969,7 @@ isl_format_get_aux_map_encoding(enum isl_format format)
case ISL_FORMAT_R32_SINT: return 0x12;
case ISL_FORMAT_R32_UINT: return 0x13;
case ISL_FORMAT_R32_FLOAT: return 0x11;
case ISL_FORMAT_R24_UNORM_X8_TYPELESS: return 0x11;
case ISL_FORMAT_R24_UNORM_X8_TYPELESS: return 0x13;
case ISL_FORMAT_B5G6R5_UNORM: return 0xA;
case ISL_FORMAT_B5G6R5_UNORM_SRGB: return 0xA;
case ISL_FORMAT_B5G5R5A1_UNORM: return 0xA;

@@ -1272,7 +1272,6 @@ unpack_channel(union isl_color_value *value,
switch (layout->type) {
case ISL_UNORM:
unpacked.f32 = _mesa_unorm_to_float(packed, layout->bits);
if (colorspace == ISL_COLORSPACE_SRGB) {
if (layout->bits == 8) {
unpacked.f32 = util_format_srgb_8unorm_to_linear_float(packed);

@@ -2185,7 +2185,7 @@ execsize:
| LPAREN exp2 RPAREN
{
if ($2 > 32 || !isPowerofTwo($2))
error(&@2, "Invalid execution size %d\n", $2);
error(&@2, "Invalid execution size %llu\n", $2);
$$ = cvt($2) - 1;
}

@@ -709,12 +709,19 @@ void anv_CmdBlitImage(
}
bool flip_z = flip_coords(&src_start, &src_end, &dst_start, &dst_end);
float src_z_step = (float)(src_end + 1 - src_start) /
(float)(dst_end + 1 - dst_start);
const unsigned num_layers = dst_end - dst_start;
float src_z_step = (float)(src_end - src_start) / (float)num_layers;
/* There is no interpolation to the pixel center during rendering, so
* add the 0.5 offset ourselves here. */
float depth_center_offset = 0;
if (src_image->type == VK_IMAGE_TYPE_3D)
depth_center_offset = 0.5 / num_layers * (src_end - src_start);
if (flip_z) {
src_start = src_end;
src_z_step *= -1;
depth_center_offset *= -1;
}
unsigned src_x0 = pRegions[r].srcOffsets[0].x;
@@ -729,7 +736,6 @@ void anv_CmdBlitImage(
unsigned dst_y1 = pRegions[r].dstOffsets[1].y;
bool flip_y = flip_coords(&src_y0, &src_y1, &dst_y0, &dst_y1);
const unsigned num_layers = dst_end - dst_start;
anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image,
1U << aspect_bit,
dst.aux_usage,
@@ -738,7 +744,7 @@ void anv_CmdBlitImage(
for (unsigned i = 0; i < num_layers; i++) {
unsigned dst_z = dst_start + i;
unsigned src_z = src_start + i * src_z_step;
float src_z = src_start + i * src_z_step + depth_center_offset;
blorp_blit(&batch, &src, src_res->mipLevel, src_z,
src_format.isl_format, src_format.swizzle,
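
The new source layer formula can be checked with the case from the commit message, a source depth range of [0-3] blitted into a destination range of [0-2]. The helper below is a hypothetical sketch that mirrors the arithmetic in this hunk and leaves out the `flip_z` handling.

```c
/* Fractional source layer sampled for destination layer dst_start + i. */
static float
blit_src_z_sketch(unsigned src_start, unsigned src_end,
                  unsigned dst_start, unsigned dst_end,
                  unsigned i, int src_is_3d)
{
   const unsigned num_layers = dst_end - dst_start;
   const float span = (float)(src_end - src_start);
   const float step = span / (float)num_layers;
   /* Only 3D sources get the half-step offset to the layer center. */
   const float center = src_is_3d ? 0.5f / (float)num_layers * span : 0.0f;

   return (float)src_start + (float)i * step + center;
}
```

For the [0-3] to [0-2] case this gives a step of 1.5 and a center offset of 0.75, so the two destination layers sample source depths 0.75 and 2.25. Both values fall between integer layers, which is why `src_z` has to be a float.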

@@ -272,12 +272,45 @@ dri3_fence_await(xcb_connection_t *c, struct loader_dri3_drawable *draw,
}
static void
dri3_update_num_back(struct loader_dri3_drawable *draw)
dri3_update_max_num_back(struct loader_dri3_drawable *draw)
{
if (draw->last_present_mode == XCB_PRESENT_COMPLETE_MODE_FLIP)
draw->num_back = 3;
else
draw->num_back = 2;
switch (draw->last_present_mode) {
case XCB_PRESENT_COMPLETE_MODE_FLIP: {
int new_max;
if (draw->swap_interval == 0)
new_max = 4;
else
new_max = 3;
assert(new_max <= LOADER_DRI3_MAX_BACK);
if (new_max != draw->max_num_back) {
/* On transition from swap interval == 0 to != 0, start with two
* buffers again. Otherwise keep the current number of buffers. Either
* way, more will be allocated if needed.
*/
if (new_max < draw->max_num_back)
draw->cur_num_back = 2;
draw->max_num_back = new_max;
}
break;
}
case XCB_PRESENT_COMPLETE_MODE_SKIP:
break;
default:
/* On transition from flips to copies, start with a single buffer again,
* a second one will be allocated if needed
*/
if (draw->max_num_back != 2)
draw->cur_num_back = 1;
draw->max_num_back = 2;
}
}
void
@@ -395,7 +428,7 @@ loader_dri3_drawable_init(xcb_connection_t *conn,
}
draw->swap_interval = swap_interval;
dri3_update_num_back(draw);
dri3_update_max_num_back(draw);
/* Create a new drawable */
draw->dri_drawable =
@@ -643,6 +676,7 @@ dri3_find_back(struct loader_dri3_drawable *draw)
{
int b;
int num_to_consider;
int max_num;
mtx_lock(&draw->mtx);
/* Increase the likelihood of reusing current buffer */
@@ -651,15 +685,18 @@ dri3_find_back(struct loader_dri3_drawable *draw)
/* Check whether we need to reuse the current back buffer as new back.
* In that case, wait until it's not busy anymore.
*/
num_to_consider = draw->num_back;
if (!loader_dri3_have_image_blit(draw) && draw->cur_blit_source != -1) {
num_to_consider = 1;
max_num = 1;
draw->cur_blit_source = -1;
} else {
num_to_consider = draw->cur_num_back;
max_num = draw->max_num_back;
}
for (;;) {
for (b = 0; b < num_to_consider; b++) {
int id = LOADER_DRI3_BACK_ID((b + draw->cur_back) % draw->num_back);
int id = LOADER_DRI3_BACK_ID((b + draw->cur_back) % draw->cur_num_back);
struct loader_dri3_buffer *buffer = draw->buffers[id];
if (!buffer || !buffer->busy) {
@@ -668,7 +705,10 @@ dri3_find_back(struct loader_dri3_drawable *draw)
return id;
}
}
if (!dri3_wait_for_event_locked(draw, NULL)) {
if (num_to_consider < max_num) {
num_to_consider = ++draw->cur_num_back;
} else if (!dri3_wait_for_event_locked(draw, NULL)) {
mtx_unlock(&draw->mtx);
return -1;
}
@@ -2006,10 +2046,10 @@ loader_dri3_get_buffers(__DRIdrawable *driDrawable,
if (!dri3_update_drawable(draw))
return false;
dri3_update_num_back(draw);
dri3_update_max_num_back(draw);
/* Free no longer needed back buffers */
for (buf_id = draw->num_back; buf_id < LOADER_DRI3_MAX_BACK; buf_id++) {
for (buf_id = draw->cur_num_back; buf_id < LOADER_DRI3_MAX_BACK; buf_id++) {
if (draw->cur_blit_source != buf_id && draw->buffers[buf_id]) {
dri3_free_render_buffer(draw, draw->buffers[buf_id]);
draw->buffers[buf_id] = NULL;
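
The grow-on-demand search in `dri3_find_back` can be sketched without the X11 plumbing. Everything here is a hypothetical simplification: buffers are reduced to busy flags, and the point where the real code waits for an event is modelled by returning -1.

```c
#include <stdbool.h>

#define MAX_BACK_SKETCH 4

struct drawable_sketch {
   bool busy[MAX_BACK_SKETCH];
   int cur_back;
   int cur_num_back; /* buffers currently in rotation */
   int max_num_back; /* upper bound for the present mode */
};

/* Returns the index of an idle back buffer, adding one more buffer to the
 * rotation only when every current one is busy. Returns -1 where the real
 * code would block waiting for a buffer to become idle. */
static int find_back_sketch(struct drawable_sketch *d)
{
   int num_to_consider = d->cur_num_back;

   for (;;) {
      for (int b = 0; b < num_to_consider; b++) {
         int id = (b + d->cur_back) % d->cur_num_back;
         if (!d->busy[id])
            return id;
      }

      if (num_to_consider < d->max_num_back)
         num_to_consider = ++d->cur_num_back;
      else
         return -1;
   }
}
```

Starting from a small `cur_num_back` and growing toward `max_num_back` means a drawable that never runs out of idle buffers never allocates the extra ones.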

@@ -146,7 +146,8 @@ struct loader_dri3_drawable {
struct loader_dri3_buffer *buffers[LOADER_DRI3_NUM_BUFFERS];
int cur_back;
int num_back;
int cur_num_back;
int max_num_back;
int cur_blit_source;
uint32_t *stamp;

@@ -234,19 +234,19 @@ static const struct format_mapping format_map[] = {
DEFAULT_RGB_FORMATS }
},
{
{ GL_RGB4 },
{ GL_RGB4, 0 },
{ PIPE_FORMAT_B4G4R4X4_UNORM, PIPE_FORMAT_B4G4R4A4_UNORM,
PIPE_FORMAT_A4B4G4R4_UNORM,
DEFAULT_RGB_FORMATS }
},
{
{ GL_RGB5 },
{ GL_RGB5, 0 },
{ PIPE_FORMAT_B5G5R5X1_UNORM, PIPE_FORMAT_X1B5G5R5_UNORM,
PIPE_FORMAT_B5G5R5A1_UNORM, PIPE_FORMAT_A1B5G5R5_UNORM,
DEFAULT_RGB_FORMATS }
},
{
{ GL_RGB565 },
{ GL_RGB565, 0 },
{ PIPE_FORMAT_B5G6R5_UNORM, DEFAULT_RGB_FORMATS }
},

@@ -689,7 +689,7 @@ TODO: document the other workarounds.
<option name="gles_emulate_bgra" value="true" />
<option name="gles_apply_bgra_dest_swizzle" value="true"/>
</application>
<application name="Portal 2" executable="hl2_linux">
<application name="Portal 2" executable="portal2_linux">
<option name="gles_emulate_bgra" value="true" />
<option name="gles_apply_bgra_dest_swizzle" value="true"/>
</application>

@@ -1209,8 +1209,8 @@ wsi_display_wait_thread(void *data)
if (ret > 0) {
pthread_mutex_lock(&wsi->wait_mutex);
(void) drmHandleEvent(wsi->fd, &event_context);
pthread_mutex_unlock(&wsi->wait_mutex);
pthread_cond_broadcast(&wsi->wait_cond);
pthread_mutex_unlock(&wsi->wait_mutex);
}
}
return NULL;