Compare commits

64 Commits

Author SHA1 Message Date
Chad Versace
3302281f00 CHROMIUM: i965: Implement EGL_KHR_mutable_render_buffer
Tested with a low-latency handwriting application on Android Nougat on
the Chrome OS Pixelbook (codename Eve) with Kabylake.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: Ia816fa6b0a1158f81e5b63477451bf337c2001aa
2018-05-01 03:16:01 -07:00
Chad Versace
54f07a7ebc CHROMIUM: egl/android: Implement EGL_KHR_mutable_render_buffer
Specifically, implement the extension DRI_MutableRenderBufferLoader.
However, the loader enables EGL_KHR_mutable_render_buffer only if the
DRI driver implements its half of the extension,
DRI_MutableRenderBufferDriver.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I7fe68a5a674d1707b1e7251d900b3affd5dd7660
2018-05-01 03:16:00 -07:00
Chad Versace
bf85c6b160 CHROMIUM: egl/main: Add bits for EGL_KHR_mutable_render_buffer
A follow-up patch enables EGL_KHR_mutable_render_buffer for Android.
This patch is separate from the Android patch because I think it's
easier to review the platform-independent bits separately.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I07470f2862796611b141f69f47f935b97b0e04a1
2018-05-01 03:15:58 -07:00
Chad Versace
272fd36b24 CHROMIUM: dri: Add param driCreateConfigs(mutable_render_buffer)
If set, then the config will have __DRI_ATTRIB_MUTABLE_RENDER_BUFFER,
which translates to EGL_MUTABLE_RENDER_BUFFER_BIT_KHR.

Not used yet.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: Icdf35794f3e9adf31e1f85740b87ce155efe1491
2018-05-01 03:15:56 -07:00
Chad Versace
2aaeab9fdd CHROMIUM: dri: Define DRI_MutableRenderBuffer extensions
Define extensions DRI_MutableRenderBufferDriver and
DRI_MutableRenderBufferLoader. These are the two halves for
EGL_KHR_mutable_render_buffer.

Outside the DRI code there is one additional change.  Add
gl_config::mutableRenderBuffer to match
__DRI_ATTRIB_MUTABLE_RENDER_BUFFER. Neither is used yet.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I4ca03d81e4557380b19c44d8d799a7cc9365d928
2018-05-01 03:15:54 -07:00
Chad Versace
0d7eae5847 CHROMIUM: egl/dri2: In dri2_make_current, return early on failure
This pulls an 'else' block into the function's main body, making the
code easier to follow.

Without this change, the upcoming EGL_KHR_mutable_render_buffer patch
transforms dri2_make_current() into spaghetti.
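
The shape of this refactor can be sketched in isolation. This is only an
illustration: every name below (sketch_ctx, sketch_bind, make_current_nested,
make_current_flat) is hypothetical and stands in for the real code in
dri2_make_current(), which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state touched by a make-current call. */
struct sketch_ctx {
    bool bound;
    int  surfaces_attached;
};

/* Hypothetical driver call: fails when no context is given. */
static bool sketch_bind(struct sketch_ctx *ctx)
{
    return ctx != NULL;
}

/* Before: the success path lives inside an 'else' block. */
bool make_current_nested(struct sketch_ctx *ctx)
{
    bool ok;

    if (!sketch_bind(ctx)) {
        ok = false;
    } else {
        ctx->bound = true;
        ctx->surfaces_attached = 2;
        ok = true;
    }
    return ok;
}

/* After: return early on failure; the success path is the main body. */
bool make_current_flat(struct sketch_ctx *ctx)
{
    if (!sketch_bind(ctx))
        return false;

    /* sketch_bind() succeeded, so ctx cannot be NULL here. */
    assert(ctx != NULL);
    ctx->bound = true;
    ctx->surfaces_attached = 2;
    return true;
}
```

Both variants behave identically; the flat one leaves room to add further
steps to the success path without another level of nesting.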

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I26be2b7a8e78a162dcd867a44f62d6f48b9a8e4d
2018-05-01 03:15:42 -07:00
Chad Versace
3e8d93e1ff CHROMIUM: egl: Drop _EGLContext::WindowRenderBuffer
Replace it with two fields in _EGLSurface, RequestedRenderBuffer and
ActiveRenderBuffer. (_EGLSurface::RequestedRenderBuffer replaces
_EGLSurface::RenderBuffer).

There exist *two* queryable EGL_RENDER_BUFFER states in EGL:
eglQuerySurface(EGL_RENDER_BUFFER) and
eglQueryContext(EGL_RENDER_BUFFER). _EGLContext::WindowRenderBuffer was
related to eglQueryContext but not eglQuerySurface. Post-patch,
RequestedRenderBuffer is related to eglQuerySurface and
ActiveRenderBuffer is related to eglQueryContext.

The implementation of eglQuerySurface(EGL_RENDER_BUFFER) contained
abstruse logic which required comprehending the specification
complexities of how the two EGL_RENDER_BUFFER states interact. Sometimes
it returned _EGLContext::WindowRenderBuffer, sometimes
_EGLSurface::RenderBuffer. Why? The function tried to encode the actual
logic in the EGL spec. When did the function return which variable? Go
study the EGL spec, hope you understand it, then hope Mesa mutated the
EGL_RENDER_BUFFER state in all the correct places. Have fun.

I got a headache from the mental gymnastics.

To simplify eglQuerySurface(EGL_RENDER_BUFFER), and to improve
confidence in its correctness, flatten its indirect logic. For pixmap
and pbuffer surfaces, return a hard-coded literal value, as the spec
suggests. For window surfaces, simply return ActiveRenderBuffer.
Nothing difficult here.
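
A minimal sketch of the flattened query, following the description above.
The enums and struct are hypothetical stand-ins for the EGL tokens and for
_EGLSurface, not Mesa's actual definitions. The literal values assume the
EGL 1.5 rule that a pixmap surface is always single-buffered and a pbuffer
surface is always back-buffered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; real code would use the EGL headers. */
enum sketch_surface_type { SKETCH_WINDOW, SKETCH_PIXMAP, SKETCH_PBUFFER };
enum sketch_render_buffer { SKETCH_BACK_BUFFER, SKETCH_SINGLE_BUFFER };

struct sketch_surface {
    enum sketch_surface_type  type;
    enum sketch_render_buffer active_render_buffer; /* models ActiveRenderBuffer */
};

/* Models eglQuerySurface(EGL_RENDER_BUFFER) after the simplification. */
enum sketch_render_buffer
sketch_query_render_buffer(const struct sketch_surface *surf)
{
    assert(surf != NULL);

    switch (surf->type) {
    case SKETCH_PIXMAP:
        return SKETCH_SINGLE_BUFFER; /* hard-coded literal */
    case SKETCH_PBUFFER:
        return SKETCH_BACK_BUFFER;   /* hard-coded literal */
    case SKETCH_WINDOW:
    default:
        return surf->active_render_buffer;
    }
}
```

The point of the sketch is that no branch consults context state: the answer
depends only on the surface type and on one field of the surface.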

These changes eliminate potentially very fragile code in the upcoming
EGL_KHR_mutable_render_buffer implementation.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: Ic5f2ab1952f26a87081bc4f78bc7fa96734c8f2a
2018-05-01 03:15:38 -07:00
Dave Airlie
b239996965 UPSTREAM: virgl: also remove dimension on indirect.
This fixes some dEQP tests that generated bad shaders.

Fixes: b6f6ead19 (virgl: drop const dimensions on first block.)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 49c61d8b84)

BUG=b:78132369
TEST=play war robots under arc++ in Chrome OS for 1 hour.
Change-Id: I7d8de9e2a8289e9119f839b1d1aa99012bdbd6e8
Reviewed-on: https://chromium-review.googlesource.com/1013329
Commit-Ready: Lepton Wu <lepton@chromium.org>
Tested-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-04-17 15:47:09 -07:00
Lepton Wu
b16fbdb135 CHROMIUM: platform_android: use general fallback.
Instead of handling software fallback inside platform_android, just
let the EGL framework handle it. With this, we can still fall back
to the software driver when the hardware driver doesn't work.

BUG=b:77302150
TEST=manual - make sure betty still boots without virgl driver.

Change-Id: I5d514f67c9dc6f68661e03fd9fc9546acd7277bd
Reviewed-on: https://chromium-review.googlesource.com/1004006
Commit-Ready: Lepton Wu <lepton@chromium.org>
Tested-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-04-11 18:09:07 -07:00
Harish Krupo
c9bda01108 UPSTREAM: egl/android: Provide an option for the backend to expose KHR_image
From android cts 8.0_r4, a new test case checks if all the required egl
extensions are exposed. In the current implementation we expose KHR_image
if KHR_image_base and KHR_image_pixmap are supported but KHR_image spec
does not mandate the existence of both the extensions.
This patch preserves the current check and also provides the backend
with an option to expose the KHR_image extension.
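
The resulting decision can be sketched as below. This is an assumption-laden
illustration, not the patch itself: the struct and the
backend_wants_khr_image flag are hypothetical names for "the option given to
the backend".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the extension flags involved. */
struct sketch_extensions {
    bool khr_image_base;
    bool khr_image_pixmap;
    bool backend_wants_khr_image; /* hypothetical opt-in set by the backend */
};

/* Returns true when KHR_image should be advertised. */
bool sketch_expose_khr_image(const struct sketch_extensions *ext)
{
    assert(ext != NULL);

    /* Existing check, preserved. */
    if (ext->khr_image_base && ext->khr_image_pixmap)
        return true;

    /* New: the backend may request KHR_image on its own. */
    return ext->backend_wants_khr_image;
}
```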

Test: run cts -m CtsOpenGLTestCases -t \
android.opengl.cts.OpenGlEsVersionTest#testRequiredEglExtensions

Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 96fc5fbf23)
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>

BUG=b:77786960
TEST=android.opengl.cts.OpenGlEsVersionTest#testRequiredEglExtensions
     passes on CTS 8.1

Change-Id: I7c057ea4aa00f99885259ca0a97cac4554551c80
Reviewed-on: https://chromium-review.googlesource.com/1002796
Commit-Ready: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-04-11 04:38:57 -07:00
Eric Engestrom
68ceff0712 UPSTREAM: egl: let each platform decide how to handle LIBGL_ALWAYS_SOFTWARE
My refactor in 47273d7312 missed this early return; because
of it, setting UseFallback one layer above actually prevented the
software path from being used.

Remove this early return and let each platform's dri2_initialize_*()
decide what it can do with the LIBGL_ALWAYS_SOFTWARE restriction.

platform_{surfaceless,x11,wayland} were already handling it themselves.

Fixes: 47273d7312 "egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Brendan King <Brendan.King@imgtec.com>
(cherry picked from commit 2f421651ac)

BUG=b:77302150
TEST=manual - make sure betty still boots without virgl driver.
Change-Id: I5e2ddfbd7a72bf04d83cac8f08fafbe81a77e66c
Reviewed-on: https://chromium-review.googlesource.com/1004005
Commit-Ready: Lepton Wu <lepton@chromium.org>
Tested-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-04-10 19:13:13 -07:00
Nicolas Boichat
d7681cc943 FROMLIST: configure.ac: Fix -latomic test
When compiling with LLVM 6.0, the test fails to detect that
-latomic is actually required, as the atomic call is inlined.

In the code itself (src/util/disk_cache.c), we see this pattern:
p_atomic_add(cache->size, - (uint64_t)size);
where cache->size is a uint64_t *, and results in the following
link time error without -latomic:
src/util/disk_cache.c:628: error: undefined reference to '__atomic_fetch_add_8'

Fix the configure test to replicate this pattern, which then
correctly realizes the need for -latomic.
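
For illustration, the pattern in question looks roughly like the sketch
below. It is not the actual configure.ac test program. The function name is
hypothetical, and __atomic_add_fetch is used here as an assumed equivalent of
what p_atomic_add expands to with GCC or Clang. On targets without native
64-bit atomics, an access like this through a pointer is what pulls in the
libatomic helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the disk_cache.c pattern: a 64-bit atomic add through a pointer. */
uint64_t sketch_shrink(uint64_t *total, uint64_t size)
{
    assert(total != NULL);

    /* Adding the unsigned negation of 'size' subtracts it (mod 2^64). */
    return __atomic_add_fetch(total, -size, __ATOMIC_SEQ_CST);
}
```

Because the operand is reached through a pointer passed in from elsewhere, a
probe written this way is harder for the compiler to fold away than one that
operates on a local variable.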

BUG=b:76397110
TEST=cros_workon_make --board=caroline-arcnext --reconf arc-mesa
(am from https://patchwork.freedesktop.org/patch/213657/, dropped
meson.build change)

Change-Id: I9cdad5fd32879a3577d6ef42e278960a934b23fb
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/985676
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-03-29 07:39:25 -07:00
Bas Nieuwenhuizen
a0f76c6a3b UPSTREAM: radv: Signal fence correctly after sparse binding.
It did not signal syncobjs in the fence, and also signalled too early
if there was work on the queue already, as we have to wait till that
work is done.

Fixes: d27aaae4d2 "radv: Add external fence support."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0347a83bbf)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I887768385833112effa1e8d414bf171640bb3564
Reviewed-on: https://chromium-review.googlesource.com/913504
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-14 09:42:21 -07:00
Bas Nieuwenhuizen
7c5b8d163f UPSTREAM: radv: Implement VK_ANDROID_native_buffer.
Passes
  dEQP-VK.api.smoke.*
  dEQP-VK.wsi.android.*

with android-cts-7.1_r12 .

Unlike the initial anv implementation this does
use syncobjs instead of waiting on the CPU.

This is missing meson build coverage for now.

One possible todo is that linux 4.15 now has a
syscall that allows us to export amdgpu fence to
a sync_file, which allows us not to force all
fences and semaphores to use syncobjs. However,
I had trouble with my kernel crashing regularly
with NULL pointers, and I'm not sure how beneficial
it is in the first place given that intel uses
syncobjs for all fences if available.

Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b1444c9ccb)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I002abe9e44ceba89c15f503d8c9fa3419aa2803e
Reviewed-on: https://chromium-review.googlesource.com/913503
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-14 07:03:57 -07:00
Lepton Wu
667a7dfd55 CHROMIUM: virgl: Fix crash on destruction.
We need to set surface/sampler_view destruction callbacks for
surface and sampler_view after https://chromium-review.googlesource.com/558120

BUG=b:70179880
TEST=Run android apps on bettyvirgl (rendering only works sometimes.)

Change-Id: I85cedb53a3dd86caba1d8cf890f63a0a5dfce4bd
Reviewed-on: https://chromium-review.googlesource.com/959231
Commit-Ready: Lepton Wu <lepton@chromium.org>
Tested-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Joe Kniss <djmk@google.com>
2018-03-13 19:00:16 -07:00
Bas Nieuwenhuizen
d9619b2fba UPSTREAM: radv: Add create image flag to not use DCC/CMASK.
If we import an image, we might not have space in the
buffer for CMASK, even though it is compatible.

Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a3e241ed07)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I4a5d6634e89ea8ec4ac200405e5c84d6f30dcb2f
Reviewed-on: https://chromium-review.googlesource.com/913502
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-13 07:37:06 -07:00
Bas Nieuwenhuizen
26796ca5ca UPSTREAM: radv: Generate VK_ANDROID_native_buffer.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e344cd8178)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: Ia0957ef39597416bf3750e8138047c0c748a1d1b
Reviewed-on: https://chromium-review.googlesource.com/913501
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-08 11:33:15 -08:00
Bas Nieuwenhuizen
f45c9bb5d6 UPSTREAM: radv: reset semaphores & fences on sync_file export.
Per spec:

"Additionally, exporting a fence payload to a handle with copy transference has the same side effects
on the source fence's payload as executing a fence reset operation. If the fence was using a
temporarily imported payload, the fence's prior permanent payload will be restored."

And similar for semaphores:

"Additionally, exporting a semaphore payload to a handle with copy transference has the same side
effects on the source semaphore's payload as executing a semaphore wait operation. If the
semaphore was using a temporarily imported payload, the semaphore's prior permanent payload
will be restored."

Fixes: 42bc25a79c "radv: Advertise sync fd import and export."
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b9f4c615f8)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I5c79d6b69bb76b8f97018a4726d10f6b0d740350
Reviewed-on: https://chromium-review.googlesource.com/913500
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-08 08:48:25 -08:00
Marek Olšák
9b34b2cee4 UPSTREAM: ac: rename has_syncobj_wait -> has_syncobj_wait_for_submit
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 4f19cc82f9)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I1b78fb53ac25131117c6ac2bf78f9f4f964eed3d
Reviewed-on: https://chromium-review.googlesource.com/913499
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-08 06:32:34 -08:00
Bas Nieuwenhuizen
96f37fa7e0 UPSTREAM: radv: Advertise sync fd import and export.
Passes dEQP-VK.*.sync_fd.*

Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 42bc25a79c)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: Icf0b6516fb015a2639bff9fd29050896f04f8980
Reviewed-on: https://chromium-review.googlesource.com/913498
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-07 10:50:31 -08:00
Bas Nieuwenhuizen
730d7edbd7 UPSTREAM: radv: Implement sync file import/export for fences & semaphores.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 52b3f50df8)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I6ed541af269ceaefad15565f52b95c2e33f73a0d
Reviewed-on: https://chromium-review.googlesource.com/913497
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-07 05:58:59 -08:00
Bas Nieuwenhuizen
64278fcd0c UPSTREAM: radv/amdgpu: wrap sync fd import/export.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b98bbdf490)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I0e84a9df34ab22e6dd54b894e245c8df3c254c3f
Reviewed-on: https://chromium-review.googlesource.com/913496
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-06 17:22:35 -08:00
Bas Nieuwenhuizen
c8541b9630 UPSTREAM: radv: Add external fence support.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d27aaae4d2)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I77a8313fc5728c0fec30f2e61695b0e3e935a3e9
Reviewed-on: https://chromium-review.googlesource.com/913495
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-06 09:59:23 -08:00
Bas Nieuwenhuizen
7f3baaf5c7 UPSTREAM: radv: Implement VK_KHR_external_fence_fd.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6abfa37879)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: Ie185825c75dc0b45dba4b17b7edbcce5309c3dfc
Reviewed-on: https://chromium-review.googlesource.com/913494
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-06 07:30:39 -08:00
Bas Nieuwenhuizen
5c32cb2c08 UPSTREAM: radv: Implement fences based on syncobjs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 969421b7da)

Added WSI code for this, as the upstream WSI got a rework.

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: I0f96630935b61f9d927ea7e1d1ea0e6c71d56796
Reviewed-on: https://chromium-review.googlesource.com/913493
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-03-05 18:34:37 -08:00
Bas Nieuwenhuizen
fd1aa710bc UPSTREAM: amd/common: Add detection of the syncobj wait/signal/reset ioctls.
First amdgpu bump after inclusion was 20 (which was done for local BOs).

Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b308bb8773)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: If2cede07cd2779d1d526aecee0b4c0f1d5bb5a4a
Reviewed-on: https://chromium-review.googlesource.com/913492
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-23 14:16:03 -08:00
Bas Nieuwenhuizen
7d019c72d8 UPSTREAM: radv: Add syncobj signal/reset/wait to winsys.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1c3cda7d27)

BUG=b:73102056
TEST=run nougat-mr1-cts-dev deqp vulkan tests.

Change-Id: Ieb018369d1bf588a27b4912475c05230c557b815
Reviewed-on: https://chromium-review.googlesource.com/913491
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-23 07:23:47 -08:00
Jason Ekstrand
6613048d9a UPSTREAM: i965: Call prepare_external after implicit window-system MSAA resolves
This fixes some rendering corruption in a couple of Android apps that
use window-system MSAA.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 2f7205be47)

Fix for Telegram and KineMaster graphic corruption

BUG=b:71872728
TEST=Telegram and KineMaster work without corruption

Change-Id: If8c489abe2d26a0c639dfe6d5f10f8fd4c3719c4
Signed-off-by: Dmytro Chystiakov <dmytro.chystiakov@intel.com>
Reviewed-on: https://chromium-review.googlesource.com/915190
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2018-02-16 21:41:37 -08:00
Lepton Wu
037026e90f FROMLIST: gallium/winsys/kms: Add support for multi-planes
Add a new struct kms_sw_plane, which represents a plane, and use it
in place of sw_displaytarget. Multiple planes share the same underlying
kms_sw_displaytarget. For map requests, we hold only 2 pointers, for the
ro map and the rw map, and return different pointers with an offset.
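
The mapping scheme can be sketched as follows. The struct layouts and the
function name are hypothetical and much simpler than the real winsys
structures; the sketch only shows planes sharing one pair of mappings and
differing by an offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one buffer whose mappings are shared by all planes. */
struct sketch_displaytarget {
    uint8_t *ro_mapped; /* read-only mapping, or NULL if not mapped */
    uint8_t *mapped;    /* read-write mapping, or NULL if not mapped */
};

/* Hypothetical: a plane is a view into the shared buffer. */
struct sketch_plane {
    struct sketch_displaytarget *dt;
    size_t offset; /* byte offset of this plane inside the buffer */
};

/* Returns the plane's address inside the shared mapping, or NULL. */
void *sketch_plane_map(const struct sketch_plane *plane, int read_only)
{
    assert(plane != NULL && plane->dt != NULL);

    uint8_t *base = read_only ? plane->dt->ro_mapped : plane->dt->mapped;
    return base ? base + plane->offset : NULL;
}
```

Keeping only two base pointers per buffer means that mapping a second plane
does not create a second mapping; it only adds a different offset.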

Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-December/180761.html
(am from https://patchwork.freedesktop.org/patch/195118/)

TEST=play video with youtube android app inside emulator.
BUG=b:62836711

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: I0863f522976cc8863d6e95492d9346df35c066ec
Reviewed-on: https://chromium-review.googlesource.com/843934
2018-02-02 23:53:32 -08:00
Bas Nieuwenhuizen
ec23d1a68a UPSTREAM: radeonsi: Export signalled sync file instead of -1.
-1 is considered an error for EGL_ANDROID_native_fence_sync, so
we need to actually create a sync file.

Fixes: f536f45250 "radeonsi: implement sync_file import/export"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5a3404d443)

BUG=b:72449616
TESTED=Try play store in ARC with 4.14 kernel on Kahlee.

Change-Id: Ib053b640e70a0fe529e5cea84fd4144f93c8c588
Reviewed-on: https://chromium-review.googlesource.com/886703
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Benjamin Gordon <bmgordon@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Benjamin Gordon <bmgordon@chromium.org>
2018-01-31 03:36:29 -08:00
Francisco Jerez
68a7077012 FROMLIST: intel/fs: Optimize and simplify the copy propagation dataflow logic.
Previously the dataflow propagation algorithm would calculate the ACP
live-in and -out sets in a two-pass fixed-point algorithm.  The first
pass would update the live-out sets of all basic blocks of the program
based on their live-in sets, while the second pass would update the
live-in sets based on the live-out sets.  This is incredibly
inefficient in the typical case where the CFG of the program is
approximately acyclic, because it can take up to 2*n passes for an ACP
entry introduced at the top of the program to reach the bottom (where
n is the number of basic blocks in the program), until which point the
algorithm won't be able to reach a fixed point.

The same effect can be achieved in a single pass by computing the
live-in and -out sets in lock-step, because that makes sure that
processing of any basic block will pick up the updated live-out sets
of the lexically preceding blocks.  This gives the dataflow
propagation algorithm effectively O(n) run-time instead of O(n^2) in
the acyclic case.
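
The difference between the two schemes can be seen in a toy model. The sketch
below is not the i965 code: it assumes a straight-line chain of n basic
blocks, tracks a single ACP entry created in block 0, and counts how many
iterations change any set. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_BLOCKS 64

/* Two-pass scheme: update every live-out from its live-in, then every
 * live-in from the predecessor's live-out. Returns the number of
 * iterations (two passes each) that changed any set. */
int sketch_two_pass_iterations(int n)
{
    assert(n >= 1 && n <= SKETCH_MAX_BLOCKS);
    bool in[SKETCH_MAX_BLOCKS] = { false };
    bool out[SKETCH_MAX_BLOCKS] = { false };
    int iterations = 0;
    bool progress = true;

    while (progress) {
        progress = false;
        for (int b = 0; b < n; b++) {          /* pass 1: live-out */
            bool new_out = in[b] || b == 0;    /* block 0 creates the entry */
            if (new_out != out[b]) { out[b] = new_out; progress = true; }
        }
        for (int b = 1; b < n; b++) {          /* pass 2: live-in */
            if (in[b] != out[b - 1]) { in[b] = out[b - 1]; progress = true; }
        }
        if (progress)
            iterations++;
    }
    return iterations;
}

/* Lock-step scheme: update live-in and live-out of each block together,
 * so later blocks see the fresh live-out of earlier ones. */
int sketch_lock_step_iterations(int n)
{
    assert(n >= 1 && n <= SKETCH_MAX_BLOCKS);
    bool in[SKETCH_MAX_BLOCKS] = { false };
    bool out[SKETCH_MAX_BLOCKS] = { false };
    int iterations = 0;
    bool progress = true;

    while (progress) {
        progress = false;
        for (int b = 0; b < n; b++) {
            if (b > 0 && in[b] != out[b - 1]) { in[b] = out[b - 1]; progress = true; }
            bool new_out = in[b] || b == 0;
            if (new_out != out[b]) { out[b] = new_out; progress = true; }
        }
        if (progress)
            iterations++;
    }
    return iterations;
}
```

In this model the two-pass scheme needs n changing iterations, that is 2*n
passes, for the entry to reach the last block, while the lock-step scheme
needs one. Each iteration visits all n blocks, which gives the O(n^2) versus
O(n) behavior described above for the acyclic case.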

The time spent in dataflow propagation is reduced by 30x in the
GLES31.functional.ssbo.layout.random.all_shared_buffer.5 dEQP
test-case on my CHV system (the improvement is likely to be of the
same order of magnitude on other platforms).  This more than reverses
an apparent run-time regression in this test-case from my previous
copy-propagation undefined-value handling patch, which was ultimately
caused by the additional work introduced in that commit to account for
undefined values being multiplied by a huge quadratic factor.

According to Chad this test was failing on CHV due to a 30s time-out
imposed by the Android CTS (this was the case regardless of my
undefined-value handling patch, even though my patch substantially
exacerbated the issue).  On my CHV system this patch reduces the
overall run-time of the test by approximately 12x, getting us to
around 13s, well below the time-out.

v2: Initialize live-out set to the universal set to avoid rather
    pessimistic dataflow estimation in shaders with cycles (Addresses
    performance regression reported by Eero in GpuTest Piano).
    Performance numbers given above still apply.  No shader-db changes
    with respect to master.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104271
Reported-by: Chad Versace <chadversary@chromium.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-December/180489.html
(am from https://patchwork.freedesktop.org/patch/194420/)

BUG=b:67394445
TEST=No regressions in Android CTS, GLES tests.
  Fixes timeouts in dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.5
  on Braswell boards.

Change-Id: I0d666c23693246b8d4fe8988f228f8c4ed7425f6
Reviewed-on: https://chromium-review.googlesource.com/862007
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
2018-01-18 00:00:07 +00:00
Francisco Jerez
3519cdfcfa UPSTREAM: intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.
This addresses a long-standing back-end compiler bug that could lead
to cross-channel data corruption in loops executed non-uniformly.  In
some cases live variables extending through a loop divergence point
(e.g. a non-uniform break) into a convergence point (e.g. the end of
the loop) wouldn't be considered live along all physical control flow
paths the SIMD thread could possibly have taken in between due to some
channels remaining in the loop for additional iterations.

This patch fixes the problem by extending the CFG with physical edges
that don't exist in the idealized non-vectorized program, but
represent valid control flow paths the SIMD EU may take due to the
divergence of logical threads.  This makes sense because the i965 IR
is explicitly SIMD, and it's not uncommon for instructions to have an
influence on neighboring channels (e.g. a force_writemask_all header
setup), so the behavior of the SIMD thread as a whole needs to be
considered.

No changes in shader-db.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 4d1959e693)

This patch is a prerequisite for 4cbe48f5 "intel/fs: Optimize and
simplify the copy propagation dataflow logic".

BUG=b:67394445
TEST=No regressions in Android CTS, GLES tests.

Change-Id: I949f6f4e0127fec93d890e7669f870872f097a58
Reviewed-on: https://chromium-review.googlesource.com/862006
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2018-01-17 23:59:39 +00:00
Francisco Jerez
a058539d21 UPSTREAM: intel/fs: Don't let undefined values prevent copy propagation.
This makes the dataflow propagation logic of the copy propagation pass
more intelligent in cases where the destination of a copy is known to
be undefined for some incoming CFG edges, building upon the
definedness information provided by the last patch.  Helps a few
programs, and avoids a handful of shader-db regressions from the next
patch.

shader-db results on ILK:

  total instructions in shared programs: 6541547 -> 6541523 (-0.00%)
  instructions in affected programs: 360 -> 336 (-6.67%)
  helped: 8
  HURT: 0

  LOST:   0
  GAINED: 10

shader-db results on BDW:

  total instructions in shared programs: 8174323 -> 8173882 (-0.01%)
  instructions in affected programs: 7730 -> 7289 (-5.71%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 4

shader-db results on SKL:

  total instructions in shared programs: 8185669 -> 8184598 (-0.01%)
  instructions in affected programs: 10364 -> 9293 (-10.33%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 2

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 9355116bda)

This patch is a prerequisite for 4cbe48f5 "intel/fs: Optimize and
simplify the copy propagation dataflow logic".

BUG=b:67394445
TEST=No regressions in Android CTS, GLES tests.

Change-Id: I8719e67ac14d3db8a7d6989d127ca4222cbdbfe4
Reviewed-on: https://chromium-review.googlesource.com/862005
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2018-01-17 23:59:36 +00:00
Francisco Jerez
6efb3d854f UPSTREAM: intel/fs: Restrict live intervals to the subset possibly reachable from any definition.
Currently the liveness analysis pass would extend a live interval up
to the top of the program when no unconditional and complete
definition of the variable is found that dominates all of its uses.

This can lead to a serious performance problem in shaders containing
many partial writes, like scalar arithmetic, FP64 and soon FP16
operations.  The number of oversize live intervals in such workloads
can cause the compilation time of the shader to explode because of the
worse than quadratic behavior of the register allocator and scheduler
when running out of registers, and it can also cause the running time
of the shader to explode due to the amount of spilling it leads to,
which is orders of magnitude slower than GRF memory.

This patch fixes it by computing the intersection of our current live
intervals with the subset of the program that can possibly be reached
from any definition of the variable.  Extending the storage allocation
of the variable beyond that is pretty useless because its value is
guaranteed to be undefined at a point that cannot be reached from any
definition.

According to Jason, this improves performance of the subgroup Vulkan
CTS tests significantly (e.g. the runtime of the dvec4 broadcast test
improves by nearly 50x).

No significant change in the running time of shader-db (with 5%
statistical significance).

shader-db results on IVB:

  total cycles in shared programs: 61108780 -> 60932856 (-0.29%)
  cycles in affected programs: 16335482 -> 16159558 (-1.08%)
  helped: 5121
  HURT: 4347

  total spills in shared programs: 1309 -> 1288 (-1.60%)
  spills in affected programs: 249 -> 228 (-8.43%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1652 -> 1597 (-3.33%)
  fills in affected programs: 262 -> 207 (-20.99%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 209

shader-db results on BDW:

  total cycles in shared programs: 67617262 -> 67361220 (-0.38%)
  cycles in affected programs: 23397142 -> 23141100 (-1.09%)
  helped: 8045
  HURT: 6488

  total spills in shared programs: 1456 -> 1252 (-14.01%)
  spills in affected programs: 465 -> 261 (-43.87%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1720 -> 1465 (-14.83%)
  fills in affected programs: 471 -> 216 (-54.14%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 162

shader-db results on SKL:

  total cycles in shared programs: 65436248 -> 65245186 (-0.29%)
  cycles in affected programs: 22560936 -> 22369874 (-0.85%)
  helped: 8457
  HURT: 6247

  total spills in shared programs: 437 -> 437 (0.00%)
  spills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total fills in shared programs: 870 -> 854 (-1.84%)
  fills in affected programs: 16 -> 0
  helped: 1
  HURT: 0

  LOST:   0
  GAINED: 107

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c3c1aa5aeb)

This patch is a prerequisite for 4cbe48f5 "intel/fs: Optimize and
simplify the copy propagation dataflow logic".

BUG=b:67394445
TEST=No regressions in Android CTS, GLES tests.

Change-Id: Icbe71f099618e45098a61502b79f3694bcc49877
Reviewed-on: https://chromium-review.googlesource.com/862004
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2018-01-17 23:59:32 +00:00
Francisco Jerez
d78b9b2232 UPSTREAM: intel/fs: Teach instruction scheduler about GRF bank conflict cycles.
This should allow the post-RA scheduler to do a slightly better job at
hiding latency in presence of instructions incurring bank conflicts.
The main purpose of this patch is not to improve performance though,
but to get conflict cycles to show up in shader-db statistics in order
to make sure that regressions in the bank conflict mitigation pass
don't go unnoticed.

Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit acf98ff933)

This patch is a prerequisite for 4cbe48f5 "intel/fs: Optimize and
simplify the copy propagation dataflow logic".

BUG=b:67394445
TEST=No regressions in Android CTS, GLES tests.

Change-Id: Ie10e8bf2116b28a637fd7a3829a44a00b2867f11
Reviewed-on: https://chromium-review.googlesource.com/862003
Commit-Ready: Chad Versace <chadversary@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-01-16 12:02:24 -08:00
Francisco Jerez
dafe2a86ab UPSTREAM: intel/fs: Implement GRF bank conflict mitigation pass.
Unnecessary GRF bank conflicts increase the issue time of ternary
instructions (the overwhelmingly most common of which is MAD) by
roughly 50%, leading to reduced ALU throughput.  This pass attempts to
minimize the number of bank conflicts by rearranging the layout of the
GRF space post-register allocation.  It's in general not possible to
eliminate all of them without introducing extra copies, which are
typically more expensive than the bank conflict itself.

In a shader-db run on SKL this helps roughly 46k shaders:

   total conflicts in shared programs: 1008981 -> 600461 (-40.49%)
   conflicts in affected programs: 816222 -> 407702 (-50.05%)
   helped: 46234
   HURT: 72
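
The percentages in these shader-db summaries are plain relative changes
between the "before" and "after" totals. The sketch below reproduces that
arithmetic for the SKL conflict counts quoted above. It is only an
illustration, not shader-db's own code; the names pct_change and print_stat
are made up for this example.

```c
#include <stdio.h>

/* Relative change from `before` to `after`, in percent.
 * A negative result means the metric went down. */
double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Print one line in a "before -> after (change%)" layout. */
void print_stat(const char *label, long before, long after)
{
    printf("%s: %ld -> %ld (%.2f%%)\n", label, before, after,
           pct_change((double)before, (double)after));
}
```

Calling print_stat("total conflicts", 1008981, 600461) computes
(600461 - 1008981) / 1008981, which rounds to the -40.49% reported above.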

The running time of shader-db itself on SKL seems to be increased by
roughly 2.52% ±1.13% with n=20 due to the additional work done by the
compiler back-end.

On earlier generations the pass is somewhat less effective in relative
terms because the hardware incurs a bank conflict anytime the last two
sources of the instruction are duplicate (e.g. while trying to square
a value using MAD), which is impossible to avoid without introducing
copies.  E.g. for a shader-db run on SNB:

   total conflicts in shared programs: 944636 -> 623185 (-34.03%)
   conflicts in affected programs: 853258 -> 531807 (-37.67%)
   helped: 31052
   HURT: 19

And on BDW:

   total conflicts in shared programs: 1418393 -> 987539 (-30.38%)
   conflicts in affected programs: 1179787 -> 748933 (-36.52%)
   helped: 47592
   HURT: 70

On SKL GT4e this improves performance of GpuTest Volplosion by 3.64%
±0.33% with n=16.

NOTE: This patch intentionally disregards some i965 coding conventions
      for the sake of reviewability.  This is addressed by the next
      squash patch which introduces an amount of (for the most part
      boring) boilerplate that might distract reviewers from the
      non-trivial algorithmic details of the pass.

The following patch is squashed in:

SQUASH: intel/fs/bank_conflicts: Roll back to the nineties.

Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit af2c320190)

This patch is a prerequisite for 4cbe48f5 "intel/fs: Optimize and
simplify the copy propagation dataflow logic".

BUG=b:67394445
TEST=No regressions in Android CTS, GLES tests.

Change-Id: I21b0563b3855434a702989fbc947b786c486f7e3
Reviewed-on: https://chromium-review.googlesource.com/862002
Commit-Ready: Chad Versace <chadversary@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-01-16 12:02:24 -08:00
Bas Nieuwenhuizen
7152fe4723 UPSTREAM: radv: Don't advertise VK_EXT_debug_report.
We never supported it. Missed during copy and pasting.

Fixes: 17201a2eb0 "radv: port to using updated anv entrypoint/extension generator."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry-picked from 4eb0dca46b)

BUG=b:67506532
TEST=run vkinfo on kahlee

Change-Id: I09f5053383cc9eded33510c24f953b496383f798
Reviewed-on: https://chromium-review.googlesource.com/827017
Commit-Ready: Bas Nieuwenhuizen <basni@chromium.org>
Tested-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-12-15 07:54:33 -08:00
Jason Ekstrand
838e746fc9 FROMLIST: anv: Add support for the variablePointers feature
Not to be confused with variablePointersStorageBuffer which is the
subset of VK_KHR_variable_pointers required to enable the extension.
This means we now have "full" support for variable pointers.

Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173537.html
(am from https://patchwork.freedesktop.org/patch/183830/)

Needed for SPIR-V VariablePointers capability.

Testing:
  Tests for this feature were released in Vulkan CTS 1.0.2.4 on
  2017-07-11, and were later merged into the oreo branches of deqp. But
  in that release some testcases were buggy, causing some drivers to
  crash. In particular, the bugs crashed Anvil. All but one of the
  required fixes have landed in vk-gl-cts master@e52de55c. The last
  fix is still under review in Khronos's internal Gerrit as
  https://gerrit.khronos.org/#/c/1864/. I've pushed a public vk-gl-cts
  branch[1] containing the remaining fix, and tagged[2] the vk-gl-cts
  commit I tested against. Likewise for Mesa, I tagged[3] the commit
  I tested, based on branch cros/arc-17.3.

  Android is hard. Running the Oreo CTS on Nougat is even harder.
  I confirmed that some testcases for this feature passed when running
  the Oreo CTS on ARC++ Nougat, though the CTS eventually crashed due to
  the reasons explained above.

  I verified everything on Fedora instead. All 949 of the following
  tests passed:

  dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.*
  dEQP-VK.spirv_assembly.instruction.graphics.variable_pointers.*

  [1]: http://git.kiwitree.net/cgit/~chadv/vk-gl-cts/log/?h=fixes/spirv-variable-pointers
  [2]: http://git.kiwitree.net/cgit/~chadv/vk-gl-cts/log/?h=chadv/test/spirv-variable-pointers-2017-11-29
  [3]: http://git.kiwitree.net/cgit/~chadv/mesa/log/?h=chadv/test/arc-17.3-anv-variable-pointers-2017-11-29

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I93f94a0d5f976575826397d60b42d3b11a919269
Reviewed-on: https://chromium-review.googlesource.com/799681
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:39:51 +00:00
Jason Ekstrand
d1d6bf7605 FROMLIST: spirv: Add support for lowering workgroup access to offsets
Before, we always left workgroup variables as shared nir_variables and
let the driver call nir_lower_io.  This adds an option to do the
lowering directly in spirv_to_nir.  To do this, we implicitly assign the
variables a std430 layout and then treat them like a UBO or SSBO and
immediately lower all the way to an offset.
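
As a rough illustration of "lowering to an offset", the sketch below packs
a list of workgroup variables into one block by aligning a running offset
for each variable. It assumes the caller already knows each variable's size
and alignment under std430 (for example 4/4 for a float, 8/8 for a vec2,
16/16 for a vec4). The struct and function names are hypothetical and are
not the ones used in spirv_to_nir.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one workgroup variable. */
struct shared_var {
    uint32_t size;   /* size in bytes */
    uint32_t align;  /* required alignment in bytes, a power of two */
    uint32_t offset; /* output: byte offset inside the shared block */
};

/* Round `value` up to the next multiple of `align` (a power of two). */
uint32_t align_up(uint32_t value, uint32_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Assign an offset to every variable in declaration order and return the
 * total number of bytes the block needs. */
uint32_t assign_shared_offsets(struct shared_var *vars, size_t count)
{
    uint32_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        offset = align_up(offset, vars[i].align);
        vars[i].offset = offset;
        offset += vars[i].size;
    }
    return offset;
}
```

With a float, a vec4 and a vec2 declared in that order, the offsets come out
as 0, 16 and 32, and the block needs 40 bytes. Once every variable has a
fixed offset, an access can be expressed as "base offset plus index
arithmetic", which is the same shape used for UBO and SSBO accesses.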

As a side-effect, the spirv_to_nir pass now handles variable pointers
for workgroup variables.

Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173534.html
(am from https://patchwork.freedesktop.org/patch/183827/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: Ibdfb71194fd43fe899a71e7da162ee0633d2d11a
Reviewed-on: https://chromium-review.googlesource.com/799680
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:39:08 +00:00
Jason Ekstrand
14c7f4783a FROMLIST: spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173536.html
(am from https://patchwork.freedesktop.org/patch/183829/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I096da96d15e7536b5536aa9f5d527ce8c47e9eaa
Reviewed-on: https://chromium-review.googlesource.com/799679
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:38:57 +00:00
Jason Ekstrand
69fae1186f FROMLIST: spirv: Add theoretical support for single component pointers
Up until now, all pointers have been ivec2s.  We're about to add support
for pointers to workgroup storage and those are going to be uints.

Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173532.html
(am from https://patchwork.freedesktop.org/patch/183825/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: Id8fc176fc1179d492dee77e5f018db8c67d884aa
Reviewed-on: https://chromium-review.googlesource.com/799678
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:38:43 +00:00
Jason Ekstrand
26ae9a5650 FROMLIST: spirv: Use offset_pointer_dereference instead of get_vulkan_resource_index
There is no good reason why we should have the same logic repeated in
get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference.  If
we're a bit more careful about how we do things, we can just use the one
function and get rid of the other entirely.  This also makes the push
constant special case a lot more clear.

Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173535.html
(am from https://patchwork.freedesktop.org/patch/183828/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I338bf2214e916c779b86628c684e434ece81b4a5
Reviewed-on: https://chromium-review.googlesource.com/799677
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:38:23 +00:00
Jason Ekstrand
00898cd71d FROMLIST: spirv: Refactor a couple of pointer query helpers
This commit moves them both into vtn_variables.c towards the top, makes
them take a vtn_builder, and replaces a hand-rolled instance of
is_external_block with a function call.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173531.html
(am from https://patchwork.freedesktop.org/patch/183826/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: If060e901394e8eb6a34a39a4f9b6b12aaf519c57
Reviewed-on: https://chromium-review.googlesource.com/799676
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:38:07 +00:00
Jason Ekstrand
4e763cb3c1 FROMLIST: spirv: Convert the supported_extensions struct to spirv_options
This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173529.html
(am from https://patchwork.freedesktop.org/patch/183823/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I47f5272b801c1d642025242993e97befd1d918ce
Reviewed-on: https://chromium-review.googlesource.com/799675
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:37:29 +00:00
Jason Ekstrand
ed61e74c4d FROMLIST: spirv: Refactor the base case of offset_pointer_dereference
This makes us key off of !offset instead of !block_index.  It also puts
the guts inside a switch statement so that we can handle more than just
UBOs and SSBOs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173533.html
(am from https://patchwork.freedesktop.org/patch/183824)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I5ec31e019875613ac192383b264bb08e0318ca60
Reviewed-on: https://chromium-review.googlesource.com/799674
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:37:08 +00:00
Jason Ekstrand
8142ac25cc FROMLIST: spirv: Add a switch statement for the block store opcode
This parallels what we do for vtn_block_load except that we don't yet
support anything except SSBO stores through this path.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173530.html
(am from https://patchwork.freedesktop.org/patch/183822/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I6269c0fc7802276d8e0b030b9a68802f09c7226c
Reviewed-on: https://chromium-review.googlesource.com/799673
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:35:49 +00:00
Jason Ekstrand
794d5bbee5 FROMLIST: spirv: Use a dereference instead of vtn_variable_resource_index
This is equivalent and means we don't have resource index code scattered
about.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173528.html
(am from https://patchwork.freedesktop.org/patch/183821/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I847aa91fe096c71dc88b229c72d60eb3c9a3fcc5
Reviewed-on: https://chromium-review.googlesource.com/799672
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:35:27 +00:00
Jason Ekstrand
6d25795e51 FROMLIST: spirv: Only emit functions which are actually used
Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint.  This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.
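
The idea can be sketched as a reachability walk over the module's call
graph: start at the entrypoint, mark every function it calls, directly or
indirectly, and emit only the marked ones. The data structure below (a small
adjacency matrix) and all names are assumptions made for this example; the
real pass works on SPIR-V function definitions instead.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8

/* Hypothetical module: calls[a][b] is true when function a calls b. */
struct module {
    size_t num_funcs;
    bool calls[MAX_FUNCS][MAX_FUNCS];
};

/* Depth-first walk that marks `func` and everything it can reach. */
void mark_reachable(const struct module *m, size_t func, bool *reachable)
{
    if (reachable[func])
        return; /* already visited; also stops on recursive call cycles */
    reachable[func] = true;
    for (size_t callee = 0; callee < m->num_funcs; callee++) {
        if (m->calls[func][callee])
            mark_reachable(m, callee, reachable);
    }
}

/* Number of functions that would be emitted for one entrypoint. */
size_t count_emitted(const struct module *m, size_t entrypoint)
{
    bool reachable[MAX_FUNCS] = { false };
    size_t count = 0;

    mark_reachable(m, entrypoint, reachable);
    for (size_t i = 0; i < m->num_funcs; i++) {
        if (reachable[i])
            count++;
    }
    return count;
}
```

In a module where function 0 calls 1, 1 calls 2, and an unrelated
entrypoint 3 also calls 2, compiling entrypoint 0 emits three functions and
skips function 3 entirely.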

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173527.html
(am from https://patchwork.freedesktop.org/patch/183820/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I259660c64c6bb9b88af26f90b2a8b2f43f5138db
Reviewed-on: https://chromium-review.googlesource.com/799671
Tested-by: Chad Versace <chadversary@chromium.org>
Trybot-Ready: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:34:42 +00:00
Jason Ekstrand
0647d5800f FROMLIST: spirv: Drop the impl field from vtn_builder
We have a nir_builder and it has an impl field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2017-October/173526.html
(am from https://patchwork.freedesktop.org/patch/183819/)

Needed for SPIR-V VariablePointers capability.

BUG=b:68708929
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I1b8c8678e510cffd1dd897bc2b642e9c0d54c1de
Reviewed-on: https://chromium-review.googlesource.com/799670
Tested-by: Chad Versace <chadversary@chromium.org>
Trybot-Ready: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Commit-Queue: Kristian H. Kristensen <hoegsberg@chromium.org>
2017-11-30 23:33:07 +00:00
Tomasz Figa
e07a838408 HACK: egl/android: Partially handle HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED
There is no API available to properly query the IMPLEMENTATION_DEFINED
format. As a workaround we rely here on gralloc allocating either
an arbitrary YCbCr 4:2:0 or RGBX_8888, with the latter being recognized
by lock_ycbcr failing.
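
The heuristic described above boils down to "try the YCbCr lock and see
whether it works". A minimal sketch follows, with a function pointer
standing in for gralloc's lock_ycbcr. The enum, the typedef and the two
fake lock functions are hypothetical and exist only to make the example
self-contained; the real gralloc call takes more arguments.

```c
#include <stddef.h>

enum guessed_format {
    GUESS_YCBCR_420, /* some YCbCr 4:2:0 layout */
    GUESS_RGBX_8888
};

/* Stand-in for gralloc's lock_ycbcr; returns 0 on success. */
typedef int (*lock_ycbcr_fn)(void *buffer);

/* Guess what an IMPLEMENTATION_DEFINED buffer really contains: if the
 * YCbCr lock succeeds, treat it as YCbCr 4:2:0, otherwise as RGBX_8888. */
enum guessed_format guess_implementation_defined(lock_ycbcr_fn lock_ycbcr,
                                                 void *buffer)
{
    return lock_ycbcr(buffer) == 0 ? GUESS_YCBCR_420 : GUESS_RGBX_8888;
}

/* Fake locks used only to exercise the sketch. */
int fake_lock_ok(void *buffer)   { (void)buffer; return 0; }
int fake_lock_fail(void *buffer) { (void)buffer; return -1; }
```

This only works as long as gralloc never picks a third kind of format for
IMPLEMENTATION_DEFINED buffers, which is why the patch is marked HACK.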

(replaces commit b0147e6603835a2cc64a99c5a6caa3316d6c2172 from
arc-12.1.0-pre2 branch / CL:367216)

BUG=b:28671744
BUG=b:33533853
BUG=b:37615277
TEST=android.view.cts.WindowTest#testSetLocalFocus
TEST=No CTS regressions on cyan and reef.
TEST=Camera preview on Poppy looks correct

Change-Id: Ifca4a7f82a6d04ccb50e0ee17f1998ffb243f85f
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/566793
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 8e0cdc96548416708890eee94b6cff6cd68e5ca5)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I226caf644e34312628a7606fbcf65b567cf338d9
Reviewed-on: https://chromium-review.googlesource.com/780840
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:37:39 +00:00
Benjamin Gordon
b17ff37e6a UPSTREAM: configure: Allow android as an EGL platform
I'm working on radeonsi support in the Chrome OS Android container
(ARC++).  Mesa in ARC++ uses autotools instead of Android.mk, but all
the necessary EGL bits are there, so the existing check is too strict.

Signed-off-by: Benjamin Gordon <bmgordon@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit de3555f834)

BUG=b:64515630
TEST=emerge-kahlee arc-mesa with crrev.com/c/698868

Change-Id: I9d7d1bed0bd166df174cfdc59c129cbfe4a81fd7
Reviewed-on: https://chromium-review.googlesource.com/780839
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-11-22 18:37:36 +00:00
Emil Velikov
42c96d393b FROMLIST: egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support
As stated in the EGL_KHR_platform_android extension:

    For each EGLConfig that belongs to the Android platform, the
    EGL_NATIVE_VISUAL_ID attribute is an Android window format, such as
    WINDOW_FORMAT_RGBA_8888.

Although it should be applicable overall.

Even though we use HAL_PIXEL_FORMAT here, those are numerically
identical to the  WINDOW_FORMAT_ and AHARDWAREBUFFER_FORMAT_ ones.

Barring the said format of course. That one is only listed in HAL.

Keep in mind that even if we try to use the said format, you'll get
caught by droid_create_surface(). The function compares the format of
the underlying window, against the NATIVE_VISUAL_ID of the config.

Unfortunately it only prints a warning rather than erroring out, likely
leading to visual corruption.

SDL will even call ANativeWindow_setBuffersGeometry() with the
wrong format, and conveniently ignore the [expected] failure.

Cc: mesa-stable@lists.freedesktop.org
Cc: Chad Versace <chadversary@google.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Tomasz Figa <tfiga@chromium.org>
(am from https://patchwork.freedesktop.org/patch/166176/)
(tfiga: Remove only respective EGL config, leave EGL image as is.)

BUG=b:33533853
TEST=dEQP-EGL.functional.*.rgba8888_window tests pass on eve

Change-Id: I8eacfe852ede88b24c1a45bff1445aacd86f6992
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/582263
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit cc1bc630a2b17693a6e8b93a6193b415b3859297)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I04bc06eeffcd4bd0cee64c55f10f0a039b6f2d73
Reviewed-on: https://chromium-review.googlesource.com/780798
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:28 +00:00
Tomasz Figa
1171bddb74 CHROMIUM: egl/android: Support opening render nodes from within EGL
This patch adds support for opening render nodes directly from within
display initialization, instead of relying on private interfaces
provided by gralloc.

In addition to having better separation from gralloc and being able to
use different render nodes for allocation and rendering, this also fixes
problems encountered when using the same DRI FD for gralloc and Mesa,
where each stepped on the other because they shared a GEM handle
namespace.

BUG=b:29036398
TEST=No significant regressions in dEQP inside the container

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/367215
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
(cherry picked from commit 4471713aa71d83943eb195868707ebe4e6515bb6)

BUG=b:32077712
BUG=b:33533853
TEST=No CTS regressions on cyan and reef.

Change-Id: I7f901eb9dadbfc2200484666fdc6a2bc0ca42a0c
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/558138
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit d4c3c3b5b0a9a834736323dbcd43a424e9033fa2)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I050b5ba258f175d3d2543582963e4296c56df5ee
Reviewed-on: https://chromium-review.googlesource.com/780797
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:26 +00:00
Stéphane Marchesin
a10faeae28 CHROMIUM: i965: disable hiz on braswell
Hiz causes GPU hangs on braswell, so let's disable it.

BUG=b/35570762, b/35574152
TEST=run graphics_GLBench on 3 * kefka for a total of 45 hours, no GPU hangs observed
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: I57402696fb0e970f0a38d87a33f2179b294a2cf1
Reviewed-on: https://chromium-review.googlesource.com/558133
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit ffdf27b84904d4c4e8294ce22e5fd9c423cf0d7c)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I3fd81db442bc8e97b5d44f3f03b35358dbf11318
Reviewed-on: https://chromium-review.googlesource.com/780796
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:23 +00:00
Tapani Pälli
5ec83db3fc FROMLIST: glcpp: Hack to handle expressions in #line directives.
GLSL ES 320 technically allows #line to have arbitrary expression trees
rather than integer literal constants, unlike the C and C++ preprocessor.
This is likely a completely unused feature that does not make sense.

However, Android irritatingly mandates this useless behavior, so this
patch implements a hack to try and support it.

We handle a single expression:

    #line <line number expression>

but we avoid handling the double expression:

    #line <line number expression> <source string expression>

because this is an ambiguous grammar.  Instead, we handle the case that
wraps both in parentheses, which is actually well defined:

    #line (<line number expression>) (<source string expression>)

With this change, the following tests pass:

   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_expression_vertex
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_expression_fragment
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_and_file_expression_vertex
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_and_file_expression_fragment

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>

BUG=b:33352633
BUG=b:33247335
TEST=affected tests passing on CTS 7.1_r1 sentry

Reviewed-on: https://chromium-review.googlesource.com/427305
Tested-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Commit-Queue: Haixia Shi <hshi@chromium.org>
Trybot-Ready: Haixia Shi <hshi@chromium.org>
[chadv: Cherry-picked from branch arc-12.1.0-pre2]
(cherry picked from commit 18675d69bcd2a66483fcfc15f4c5fa5db4c257af)
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: I7afbbb386bd4a582e3f241014a83eaccad1d50d9
Reviewed-on: https://chromium-review.googlesource.com/558132
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit f0e7e697a8403e1bdf56b6f555d9488fe4f620ad)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I010627302e20d7748fb2ef2b1bcdd1ef48811072
Reviewed-on: https://chromium-review.googlesource.com/780795
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:21 +00:00
Deepak Sharma
0a04b702a8 CHROMIUM: radeonsi: Fix crash on sampler_view_destroy
Set sampler_view_destroy method for radeonsi,
when upper layer tries to destroy an object.

BUG=chrome-os-partner:56075
TEST=compile

Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: Ia069a648617019f4df2eb3e9d8fa41b9d9b71ff7
Reviewed-on: https://chromium-review.googlesource.com/558131
Reviewed-by: Deepak Sharma <deepak.sharma@amd.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 6d7b21f317b9d017636db639b05c1dd49c78f8e0)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I1d68a7828f9632e36b57e3d10ba2dc73a748632a
Reviewed-on: https://chromium-review.googlesource.com/780794
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:18 +00:00
Corbin Simpson
dab369a824 CHROMIUM: i965: Clamp scissor state instead of truncating on gen6.
Replaces one undefined behavior with another, slightly more friendly,
undefined behavior.

This changes glScissor() behavior on i965 to clamp instead of truncate
out-of-range scissors. Technically either behavior is acceptable, but
clamping has more predictable results on out-of-range scissors.
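
The difference between the two behaviors is easiest to see on a value that
does not fit the hardware field. The sketch below assumes a 16-bit unsigned
field purely for illustration; the actual field widths and function names in
the i965 scissor code are not shown in this message and may differ.

```c
#include <stdint.h>

/* Assumed field limit, for illustration only. */
#define SCISSOR_FIELD_MAX 0xffff

/* Truncation keeps only the low bits, so a large coordinate wraps around
 * to a small, unrelated one. */
uint16_t scissor_truncate(int32_t value)
{
    return (uint16_t)((uint32_t)value & SCISSOR_FIELD_MAX);
}

/* Clamping pins out-of-range coordinates to the nearest representable
 * value. */
uint16_t scissor_clamp(int32_t value)
{
    if (value < 0)
        return 0;
    if (value > SCISSOR_FIELD_MAX)
        return SCISSOR_FIELD_MAX;
    return (uint16_t)value;
}
```

Under that assumption, a coordinate of 65546 truncates to 10 but clamps to
65535, so the clamped scissor still covers the far edge of the render target
instead of collapsing to a sliver near the origin.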

BUG=chromium:360217
TEST=Watched some Youtube on Link; can't reproduce original bug as reported.

Signed-off-by: Corbin Simpson <simpsoco@chromium.org>
Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: I475deb2c102dd1b563ae4cc05f9fae5906c5c094
Reviewed-on: https://chromium-review.googlesource.com/558128
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit f390b920db89fd332f6c1398c886de43cdc4e868)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I3869ed8bc8e7e4551ff75c8e1c3fb1ec7ccb784f
Reviewed-on: https://chromium-review.googlesource.com/780791
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:15 +00:00
Haixia Shi
e96314b6cf CHROMIUM: i965: Fix corner cases of brw depth stencil workaround
Since we can't repro this bug, it's hard to track it down, but it
looks like there are multiple issues with the workaround, which this
patch tries to fix.

This fixes two corner cases with the workaround:
- Fix the case where there is a depth but no stencil
- Fix the case where the depth mt hasn't been created

BUG=chromium:423546
TEST=builds and runs on link

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: Ib2813252dc825443470f67b6214c16d38981cda5
Reviewed-on: https://chromium-review.googlesource.com/558127
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 1d04841335da04fa7b97cf105ebf1514f081f7d9)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I2255687f69808e5ecb8eb9eb461921df4698108d
Reviewed-on: https://chromium-review.googlesource.com/780790
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:13 +00:00
Stéphane Marchesin
d226caef7a CHROMIUM: i965: Return NULL if we don't have a miptree
If we have no miptree (irb->mt == NULL) we still go ahead and look at
the stencil miptree, which causes crashes. Instead, let's return NULL if
we don't have a miptree, which will be correctly handled later.
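
In outline, the fix is an early return placed before the first dereference
of the miptree pointer. The structs and the function name below are
simplified stand-ins, not the real i965 types; only the irb->mt check
mirrors what the message describes.

```c
#include <stddef.h>

/* Simplified stand-ins for the driver's miptree and renderbuffer types. */
struct miptree {
    struct miptree *stencil_mt; /* separate stencil miptree, if any */
};

struct renderbuffer {
    struct miptree *mt; /* may be NULL if nothing was allocated yet */
};

/* Pick the miptree that holds stencil data, or NULL if there is none. */
struct miptree *pick_stencil_miptree(const struct renderbuffer *irb)
{
    if (irb == NULL || irb->mt == NULL)
        return NULL; /* no miptree: bail out before touching stencil_mt */
    if (irb->mt->stencil_mt != NULL)
        return irb->mt->stencil_mt;
    return irb->mt;
}
```

Callers then only need to handle a NULL result, which per the message they
already do.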

BUG=chromium:387897
TEST=can't reproduce the bug, but compiles and runs

Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: Ief9c1fabd393b19a47f212885aff9e333c577785
Reviewed-on: https://chromium-review.googlesource.com/558126
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit ecfcf4db4ff27e0ec91fb31da3c49601b968346a)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I0fe67275f1227eaa4434d82594491fc27a9bd731
Reviewed-on: https://chromium-review.googlesource.com/780789
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:10 +00:00
Stéphane Marchesin
4bcb10cacc CHROMIUM: i965: Disable hardware contexts for gen6
They don't seem to work, and cause regular GPU hangs, so let's disable
them.

BUG=chromium:288818
TEST=by hand: (along with the kernel patch) run multiple flash videos with hardware decode, no GPU hang happens

Signed-off-by: Dominik Behr <dbehr@chromium.org>
Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: I374fd27f113c3399362f2ca40bdbbaf7574f5fae
Reviewed-on: https://chromium-review.googlesource.com/558125
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit f12a531c67e865e427f1f21c960e6c4a30ff45c8)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I0277438a011a9161f49d6bcbb57747d04a8e832d
Reviewed-on: https://chromium-review.googlesource.com/780788
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 18:33:00 +00:00
Stéphane Marchesin
403ab71152 CHROMIUM: glsl: Avoid crash when overflowing the samplers array
Fixes a crash when we have too many samplers.

BUG=chromium:141901
TEST=by hand

Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: I5a997d65080fee8f4536cca86f06a38af3786682
Reviewed-on: https://chromium-review.googlesource.com/558122
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 4a87b3221cabe0ae76ac0ed017bbc7e86a88a90e)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I9eafec1dee5ee2e9b156cffa4731212d83585240
Reviewed-on: https://chromium-review.googlesource.com/780785
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 15:45:37 +00:00
James Ausmus
b178753c0a CHROMIUM: gallium: Fix renderbuffer destruction crash
Avoid crash on surface/sampler_view destruction when the context is gone

When we delete the context, sometimes there are pending surfaces and
sampler views left. Since mesa doesn't properly refcount them, the
context can go away before its resources. Until mesa is fixed to
properly refcount all these resources, let's just carry the destroy
function on the resource itself, which gives us a way to free it.
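
The workaround described above can be sketched in C as follows. This is only a minimal illustration under assumed names: `demo_context`, `demo_surface`, `demo_surface_create`, `demo_surface_release` and `demo_destroyed_count` are hypothetical and are not Mesa's real structures or functions. The idea shown is that the object stores its own destroy callback, so releasing it never has to go through a context that may already have been freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a rendering context. */
struct demo_context {
    int id;
};

/* Hypothetical stand-in for a surface or sampler view. */
struct demo_surface {
    struct demo_context *ctx;                    /* may dangle once the context is deleted */
    void (*destroy)(struct demo_surface *surf);  /* destroy hook carried on the object itself */
};

/* Counts destroyed surfaces so the behaviour can be observed. */
int demo_destroyed_count = 0;

/* Frees the surface without dereferencing surf->ctx. */
void demo_surface_destroy(struct demo_surface *surf)
{
    free(surf);
    demo_destroyed_count++;
}

/* Creates a surface and records how it must be destroyed later. */
struct demo_surface *demo_surface_create(struct demo_context *ctx)
{
    struct demo_surface *surf = calloc(1, sizeof(*surf));
    if (surf == NULL)
        return NULL;
    surf->ctx = ctx;
    surf->destroy = demo_surface_destroy;
    return surf;
}

/* Releases a surface through its own hook, not through the context. */
void demo_surface_release(struct demo_surface *surf)
{
    assert(surf != NULL && surf->destroy != NULL);
    surf->destroy(surf);
}
```

In this sketch a caller may free the context first and call `demo_surface_release()` afterwards, because the release path only touches the surface. As the commit message says, this is a stopgap until the resources are properly refcounted.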

BUG=none
TEST=compile

Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: Ibfd5d2de9606beedb8275979c0f155eee61f51fb
Reviewed-on: https://chromium-review.googlesource.com/558120
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 8fb20c7216b406b560565ed23209d71dc2826a97)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: I299169e3b5204b590ce6d3e4385a3dbb8bdda4fb
Reviewed-on: https://chromium-review.googlesource.com/780783
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2017-11-22 15:40:11 +00:00
Stéphane Marchesin
8940a624ae CHROMIUM: st/mesa: Do not flush front buffer on context flush
Make gallium work again with new chrome.

BUG=none
TEST=compile

Signed-off-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
(applied manually from src/third_party/media-libs/mesa/files)

BUG=b:33533853
TEST=No CTS regressions on Cyan and Reef.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Change-Id: I023df483ceecc01e42150c1abb6e6963577efc60
Reviewed-on: https://chromium-review.googlesource.com/558119
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 34218aac20aead9e6a159ecd3201655feb4d806d)

BUG=b:69553386
TEST=No regressions on Eve in `cts-tradefed run cts -m CtsDeqpTestCases`.

Change-Id: Id9cb60f87c42db613617bc1ad3ae8b2a62701746
Reviewed-on: https://chromium-review.googlesource.com/780782
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2017-11-22 15:38:08 +00:00
Tomasz Figa
7cc5c96a9a CHROMIUM: Add PRESUBMIT.cfg to disable various checks
This makes it so that we don't need to run 'repo upload --no-verify'.

BUG=b:26574868
TEST=ran upload on some CLs

(cherry picked from commit 7529842c0739e1f1e54ac49abae3695c63e483b8)
Change-Id: I0b22a5ee0321dc454affeefcfac0eee75490bd6e
Reviewed-on: https://chromium-review.googlesource.com/780587
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2017-11-21 01:49:41 +00:00
305 changed files with 5193 additions and 6238 deletions

@@ -364,6 +364,38 @@ matrix:
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="meson Vulkan"
- BUILD=meson
- MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- libexpat1-dev
- libelf-dev
- python3-pip
- env:
- LABEL="meson loaders/classic DRI"
- BUILD=meson
- MESON_OPTIONS="-Dvulkan-drivers= -Dgallium-drivers="
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- python3-pip
install:
- pip install --user mako

@@ -65,7 +65,6 @@ LOCAL_CFLAGS += \
-DHAVE_PTHREAD=1 \
-DHAVE_DLADDR \
-DHAVE_DL_ITERATE_PHDR \
-DHAVE_ENDIAN_H \
-DMAJOR_IN_SYSMACROS \
-fvisibility=hidden \
-Wno-sign-compare

10
PRESUBMIT.cfg Normal file

@@ -0,0 +1,10 @@
# This sample config file disables all of the ChromiumOS source style checks.
# Comment out the disable-flags for any checks you want to leave enabled.
[Hook Overrides]
stray_whitespace_check: false
long_line_check: false
cros_license_check: false
tab_check: false
bug_field_check: false
test_field_check: false

@@ -1 +1 @@
17.3.9
17.3.0-rc5

@@ -1,185 +0,0 @@
# fixes: The commit addresses Meson which is explicitly disabled for 17.3
ab0809e5529725bd0af6f7b6ce06415020b9d32e meson: fix strtof locale support check
# fixes: The commit addresses Meson which is explicitly disabled for 17.3
44fbbd6fd07e5784b05e08e762e54b6c71f95ab1 util: add mesa-sha1 test to meson
# stable: The commit addresses earlier commit 6132992cdb which did not land in
# branch
3d2b157e23c9d66df97d59be6efd1098878cc110 i965/fs: Use UW types when using V immediates
# extra: The commit just references a fix for an additional change in its v2.
c1ff99fd70cd2ceb2cac4723e4fd5efc93834746 main: Clear shader program data whenever ProgramBinary is called
# fixes: The commit addresses earlier commits 40a01c9a0ef and 8d745abc009 which
# did not land in branch
9b0223046668593deb9c0be0b557994bb5218788 egl: pass the dri2_dpy to the $plat_teardown functions
# fixes: The commit addresses earlier commit d50937f137 which did not land in
# branch
78a8b73e7d45f55ced98a148b26247d91f4e0171 vulkan/wsi: free cmd pools
# stable: The commit addresses earlier commit 6d87500fe12 which did not land in
# branch
525b4f7548462bfc2e82f2d1f04f61ce6854a3c5 i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext
# stable: The commit depends on earlier commit a4be2bcee2 which did not land in
# branch
a29d63ecf71546c4798c609e37810f0ec81793d8 swr: refactor swr_create_screen to allow for proper cleanup on error
# stable: Explicit 18.0 only nominations
4b69ba381766cd911eb1284f1b0332a139ec8a75 anv/pipeline: Don't assert on more than 32 samplers
bc0a21e34811e0e1542236dbaf5fb1fa56bbb98c anv/cmd_state: Drop the scratch_size field
d6c9a89d1324ed2c723cbd3c6d8390691c58dfd2 anv/cmd_buffer: Get rid of the meta query workaround
cd3feea74582cea2d18306d167609f4fbe681bb3 anv/cmd_buffer: Rework anv_cmd_state_reset
ddc2d285484a1607f79ffeb2fc6c09367c6aea1f anv/cmd_buffer: Use some pre-existing pipeline temporaries
9af5379228d7be9c7ea41e0912a8770d28ead92b anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute
d5592e2fdaa9ce8b98d38b2d29e2a7d2c4abda08 anv: Remove semicolons from vk_error[f] definitions
90cceaa9dd3b12e039a131a50c6866dce04e7fb2 anv/cmd_buffer: Refactor ensure_push_descriptor_set
b9e1ca16f84016f1d40efa9bfee89db48a7702b4 anv/cmd_buffer: Add a helper for binding descriptor sets
31b2144c836485ef6476bd455f1c02b96deafab7 anv/cmd_buffer: Use anv_descriptor_for_binding for samplers
97f96610c8b858267c121c0ad6ffc630e2aafc09 anv: Separate compute and graphics descriptor sets
e85aaec1489b00f24ebef4ae5b1da598091275e1 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
8bd5ec5b862333c936426ff18d093d07dd006182 anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
24caee8975355a2b54b41c484ff3c897e1911760 anv/cmd_buffer: Use a temporary variable for dynamic state
95ff2322948692f5f7b1d444aabe878fba53304c anv/cmd_buffer: Move dynamic state to graphics state
38ec78049f69821091a2d42b0f457a1b044d4273 anv/cmd_buffer: Move num_workgroups to compute state
4064fe59e7144fa822568543cfcc043387645d4e anv/cmd_buffer: Move gen7 index buffer state to graphics state
# fixes: The commit requires earlier commit 49d035122ee which did not land in
# branch
766589d89a211e67f313e8cb38f2d05b09975f96 radv: fix sample_mask_in loading. (v3.1)
# stable: The commits address the Meson build that is explicitly disabled in
# branch
c38c60a63c63b02d1030c6c349aa0a73105e10eb meson: fix BSD build
5781c3d1db4a01e77f416c1685025c4d830ae87d meson: correctly set SYSCONFDIR for loading dirrc
7c8cfe2d59bfc0dbf718a74b08b6dceaa84f7242 meson: fix missing dependencies
53f9131205a63fa8b282ab2a7e96c48209447da0 meson: fix getting cflags from pkg-config
8fae5eddd9982f4586d76471d0196befeb46de24 meson: handle LLVM 'x.x.xgit-revision' versionsi
# stable: The commit requires earlier commit 01ab218bbc which did not land in
# branch
0e879aad2fd1dac102c13d680edf455aa068d5df swr/rast: support llvm 3.9 type declarations
# stable: The commit requires earlier commit w41c36c45 which did not land in
# branch
49b0a140a731069e0e4959c65bfd1b597a4fb141 ac/nir: set amdgpu.uniform and invariant.load for UBOs
# stable: The commits address gen10 support which is missing in branch
ca19ee33d7d39cb89d948b1c983763065975ce5b i965/gen10: Ignore push constant packets during context restore.
78c125af3904c539ea69bec2dd9fdf7a5162854f anv/gen10: Ignore push constant packets during context restore.
bcfd78e4489f538e34138269650fc6cbe8c9d75f i965/gen10: Re-enable push constants.
# stable: The commits are explicit 18.0 nominations
17423c993d0b083c7a77a404b85788687f5efe36 winsys/amdgpu: fix assertion failure with UVD and VCE rings
e0e23ea69cab23b9193b1e7c568fd23fc7073071 r600/eg: construct proper rat mask for image/buffers.
# stable: The commits address the initial shader cache support which did not land in branch
28db950b51274ce296cd625db62abe935d1e4ed9 i965: fix prog_data leak in brw_disk_cache
b99c88037bf64b033579f237ec287857c53b0ad6 i965: fix disk_cache leak when destroying context
# stable: The commit covers nir serialise, which did not land in branch
d0343bef6680cc660ba691bbed31a2a1b7449f79 nir: mark unused space in packed_tex_data
# stable: The KHX extension is disabled all together in the stable branches.
bee9270853c34aa8e4b3d19a125608ee67c87b86 radv: Don't expose VK_KHX_multiview on android.
# fixes: The commit addresses the meson build, which is disabled in branch
4a0bab1d7f942ad0ac9b98ab34e6a9e4694f3c04 meson: libdrm shouldn't appear in Requires.private: if it wasn't found
16bf8138308008f4b889caa827a8291ff72745b8 meson/swr: re-shuffle generated files
bbef9474fa52d9aba06eeede52558fc5ccb762dd meson/swr: Updated copyright dates
d7235ef83b92175537e3b538634ffcff29bf0dce meson: Don't confuse the install and search paths for dri drivers
c75a4e5b465261e982ea31ef875325a3cc30e79d meson: Check for actual LLVM required versions
105178db8f5d7d45b268c7664388d7db90350704 meson: fix test source name for static glapi
c74719cf4adae2fa142e154bff56716427d3d992 glapi: fix check_table test for non-shared glapi with meson
# stable: Explicit 18.0 only nominations
2ffe395cba0f7b3c1f1c41062f4376eae3a188b5 radv: Don't expose VK_KHX_multiview on android.
4195eed961ccfe404ae81b9112189fc93a254ded glsl/linker: check same name is not used in block and outside
a5053ba27ed76f666e315de7150433c5aaaaf2c3 anv/device: initialize the list of enabled extensions properly
bd6c0cab606fa0a3b821e50542ba06ff714292bf i965: perf: use drmIoctl() instead of ioctl()
bf1577fe0972ae910c071743dc89d261a46c2926 i965/gen10: Remove warning message.
fcae3d1a9acc080bf31cf7b5c4d0b18e67319b09 anv/gen10: Remove warning message.
eb2e17e2d15bf58b60460437330d719131fb859e docs: Add Cannonlake support to 18.0 release notes.
9a508b719be32ef10ca929250b7aafba313104c6 android: anv/extensions: fix generated sources build
d448954228e69fd1b4000ea13e28c2ba2832db13 android: anv: add dependency on libnativewindow for O and later
6451b0703ff3027b746d6268b98dd2b3e6698be5 android: vulkan/util: add dependency on libnativewindow for O and later
c956d0f4069cf39d8d3c57ebed8d905575e9ea34 radv: make sure to emit cache flushes before starting a query
c133a3411bbf47c2ba7d9cdae7e35a64fe276068 radv: do not set pending_reset_query in BeginCommandBuffer()
55376cb31e2f495a4d872b4ffce2135c3365b873 st/mesa: expose 0 shader binary formats for compat profiles for Qt
# stable: The commits address gen10 support which is missing in branch
56dc9f9f49638e0769d6bc696ff7f5dafccec9fc intel/compiler: Memory fence commit must always be enabled for gen10+
# stable: The commit requires earlier commits 4e7f6437b535 and a6b379284365
# which did not land in branch
ab5cee4c241cb360cf67101dd751e0f38637b526 r600/compute: only mark buffer/image state dirty for fragment shaders
# stable: The commits have a specific version for the 17.3 branch
4796025ba518baa0e8893337591a3f452a375d94 intel/isl: Add an isl_color_value_is_zero helper
85d0bec9616bc1ffa8e4ab5e7c5d12ff4e414872 anv: Be more careful about fast-clear colors
# stable: The commit fixes earlier commit cd3feea74582 which did not land in
# branch
4c77e21c814145e845bac64cce40eadfd7ac0bd9 anv: Move setting current_pipeline to cmd_state_init
# stable: The commit is causing several regressions in Vulkan CTS tests in
# different platforms (hsw, bdw, bsw, ...)
85d0bec9616bc1ffa8e4ab5e7c5d12ff4e414872 anv: Be more careful about fast-clear colors
# stable: The commit requires earlier commit a03d456f5a41 which did not land in
# branch
c7cadcbda47537d474eea52b9e77e57ef9287f9b r600: Take ALU_EXTENDED into account when evaluating jump offsets
# fixes: The commit requires earlier commits 77097d96a0 and a5a654b19a which
# did not land in branch
c7dcee58b5fe183e1653c13bff6a212f0d157b29 i965: Avoid problems from referencing orphaned BOs after growing.
# fixes: The commit addresses the meson build, which is disabled in branch
5317211fa029ee8d0e1c802ef8c01f64c470e3d5 meson: use a custom target instead of a generator for i965 oa
d672084ba29a64f5ec8c9cd23d4b77c0efa05693 meson: define empty variables for libswdri and libswkmsdri
8eb608df61912cfd0633fe982b140e22e7563770 meson: add libswdri and libswkmsdri to dri link_with
7023b373ec76a2ea25b1bd0a7501276de9007047 meson: link dri3 xcb libs into vlwinsys instead of into each target
5c460337fd9c1096dea4bc569bd876a112ed6f16 meson: Fix GL and EGL pkg-config files with glvnd
e23192022a2cde122a6ccc70e5495fda009bee12 meson: install vulkan_intel.h header
# fixes: The commit fixes earlier commit 1c57a6da5e which did not land in
# branch
3401b028df1074a06a7fbc3fb1cda949646ef75d ac/shader: fix vertex input with components.
# extra: The commit requires earlier commit a63c74be851 which did not land in
# branch
fa8a764b62588420ac789df79ec0ab858b38639f i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.
# extra: The commit requires earlier commit a44744e01d which did not land in
# branch
adca1e4a92a53a403b7620c3356dcf038f0bcecc anv/image: Separate modifiers from legacy scanout
# stable: The commit requires earlier commits fe81e1f9751 and 92c1290dc57 which
# did not land in branch
fb5825e7ceeb16ac05f870ffe1e5a5daa09e68dd glsl: Fix memory leak with known glsl_type instances
# fixes: The commits require earlier commits 2deb82207572 and b2653007b980
# which did not land in branch
4f0c89d66c570e82d832e2e49227517302e271a2 ac/nir: pass the nir variable through tcs loading.
27a5e5366e89498d98d786cc84fafbdb220c4d94 radv: mark all tess output for an indirect access.
# fixes: The commit requires earlier commits b358e0e67fac and b2653007b980
# which did not land in branch
8f052a3e257a61240cb311032497d016278117a8 radv: handle exporting view index to fragment shader. (v1.1)
# fixes: The commit fixes earier commits 83d4a5d5aea5a8a05be2,
# b2f2236dc565dd1460f0 and c62cf1f165919bc74296 which did not land in
# branch
880c1718b6d14b33fe5ba918af70fea5be890c6b omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}
# stable: Explicit 18.0 only nominations
d77844a5290948a490ce6921c1623d1dd7af6c31 docs: fix 18.0 release note version
# stable: Explicit 18.0 only nominations
1866f76f7bc3ec54b4e91eb7d329b2e6f7b6277c freedreno/a5xx: fix page faults on last level
2f175bfe5d8ca59a8a68b6d6d072cd7bf2f8baa9 freedreno/a5xx: don't align height for PIPE_BUFFER
# fixes: A specific backport of this commit was applied for this branch.
4503ff760c794c3bb15b978a47c530037d56498e ac/nir: Add workaround for GFX9 buffer views.

@@ -74,7 +74,7 @@ AC_SUBST([OPENCL_VERSION])
# in the first entry.
LIBDRM_REQUIRED=2.4.75
LIBDRM_RADEON_REQUIRED=2.4.71
LIBDRM_AMDGPU_REQUIRED=2.4.89
LIBDRM_AMDGPU_REQUIRED=2.4.85
LIBDRM_INTEL_REQUIRED=2.4.75
LIBDRM_NVVIEUX_REQUIRED=2.4.66
LIBDRM_NOUVEAU_REQUIRED=2.4.66
@@ -383,9 +383,11 @@ if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
AC_MSG_CHECKING(whether -latomic is needed)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
uint64_t v;
struct {
uint64_t* v;
} x;
int main() {
return (int)__atomic_load_n(&v, __ATOMIC_ACQUIRE);
return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE);
}]])], GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=no, GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=yes)
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC)
if test "x$GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC" = xyes; then
@@ -791,10 +793,8 @@ fi
AC_HEADER_MAJOR
AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_HEADERS([endian.h])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
AC_CHECK_FUNC([memfd_create], [DEFINES="$DEFINES -DHAVE_MEMFD_CREATE"])
AC_MSG_CHECKING([whether strtod has locale support])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
@@ -867,10 +867,10 @@ dnl In practise that should be sufficient for all platforms, since any
dnl platforms build with GCC and Clang support the flag.
PTHREAD_LIBS="$PTHREAD_LIBS -pthread"
dnl pthread-stubs is mandatory on some BSD platforms, due to the nature of the
dnl pthread-stubs is mandatory on BSD platforms, due to the nature of the
dnl project. Even then there's a notable issue as described in the project README
case "$host_os" in
linux* | cygwin* | darwin* | solaris* | *-gnu* | gnu* | openbsd*)
linux* | cygwin* | darwin* | solaris* | *-gnu* | gnu*)
pthread_stubs_possible="no"
;;
* )
@@ -1208,10 +1208,10 @@ AC_ARG_ENABLE([xa],
[enable_xa=no])
AC_ARG_ENABLE([gbm],
[AS_HELP_STRING([--enable-gbm],
[enable gbm library @<:@default=yes except cygwin and macOS@:>@])],
[enable gbm library @<:@default=yes except cygwin@:>@])],
[enable_gbm="$enableval"],
[case "$host_os" in
cygwin* | darwin*)
cygwin*)
enable_gbm=no
;;
*)
@@ -1536,7 +1536,7 @@ fi
AC_ARG_ENABLE([driglx-direct],
[AS_HELP_STRING([--disable-driglx-direct],
[disable direct rendering in GLX and EGL for DRI \
@<:@default=enabled@:>@])],
@<:@default=auto@:>@])],
[driglx_direct="$enableval"],
[driglx_direct="yes"])
@@ -2410,12 +2410,13 @@ dnl Surfaceless is an alternative for the last one.
dnl
require_basic_egl() {
case "$with_platforms" in
*drm*|*surfaceless*)
*drm*|*surfaceless*|*android*)
;;
*)
AC_MSG_ERROR([$1 requires one of these:
1) --with-platforms=drm (X, Wayland, offscreen rendering based on DRM)
2) --with-platforms=surfaceless (offscreen only)
3) --with-platforms=android (Android only)
Recommended options: drm,x11])
;;
esac
@@ -2495,14 +2496,6 @@ if test -n "$with_gallium_drivers"; then
HAVE_GALLIUM_RADEONSI=yes
PKG_CHECK_MODULES([RADEON], [libdrm >= $LIBDRM_RADEON_REQUIRED libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
# Blacklist libdrm_amdgpu 2.4.90 because it causes a crash in older
# radeonsi with pretty much any app.
libdrm_version=`pkg-config libdrm_amdgpu --modversion`
if test "x$libdrm_version" = x2.4.90; then
AC_MSG_ERROR([radeonsi can't use libdrm 2.4.90 due to a compatibility issue. Use a newer or older version.])
fi
require_libdrm "radeonsi"
radeon_llvm_check $LLVM_REQUIRED_RADEONSI "radeonsi"
if test "x$enable_egl" = xyes; then
@@ -2720,18 +2713,6 @@ if test "x$enable_llvm" = xyes; then
fi
fi
fi
dnl The gallium-xlib GLX and gallium OSMesa targets directly embed the
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl the LLVM library propagated in the Libs.private of the respective .pc
dnl file which ensures complete dependency information when statically
dnl linking.
if test "x$enable_glx" == xgallium-xlib; then
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
fi
if test "x$enable_gallium_osmesa" = xyes; then
OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
fi
fi
AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)

@@ -88,40 +88,22 @@ This is a work-around for that.
<li>MESA_GL_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) and possibly the GL API type.
<ul>
<li>The format should be MAJOR.MINOR[FC|COMPAT]
<li>FC is an optional suffix that indicates a forward compatible
context. This is only valid for versions &gt;= 3.0.
<li>COMPAT is an optional suffix that indicates a compatibility
context or GL_ARB_compatibility support. This is only valid for
versions &gt;= 3.1.
<li>GL versions &lt;= 3.0 are set to a compatibility (non-Core)
profile
<li>GL versions = 3.1, depending on the driver, it may or may not
have the ARB_compatibility extension enabled.
<li>GL versions &gt;= 3.2 are set to a Core profile
<li>Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC, 3.1COMPAT, X.Y, X.YFC,
X.YCOMPAT.
<ul>
<li>2.1 - select a compatibility (non-Core) profile with GL
version 2.1.
<li>3.0 - select a compatibility (non-Core) profile with GL
version 3.0.
<li>3.0FC - select a Core+Forward Compatible profile with GL
version 3.0.
<li>3.1 - select GL version 3.1 with GL_ARB_compatibility enabled
per the driver default.
<li>3.1FC - select GL version 3.1 with forward compatibility and
GL_ARB_compatibility disabled.
<li>3.1COMPAT - select GL version 3.1 with GL_ARB_compatibility
enabled.
<li>X.Y - override GL version to X.Y without changing the profile.
<li>X.YFC - select a Core+Forward Compatible profile with GL
version X.Y.
<li>X.YCOMPAT - select a Compatibility profile with GL version
X.Y.
</ul>
<li>Mesa may not really implement all the features of the given
version. (for developers only)
<li> The format should be MAJOR.MINOR[FC]
<li> FC is an optional suffix that indicates a forward compatible context.
This is only valid for versions &gt;= 3.0.
<li> GL versions &lt; 3.0 are set to a compatibility (non-Core) profile
<li> GL versions = 3.0, see below
<li> GL versions &gt; 3.0 are set to a Core profile
<li> Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC
<ul>
<li> 2.1 - select a compatibility (non-Core) profile with GL version 2.1
<li> 3.0 - select a compatibility (non-Core) profile with GL version 3.0
<li> 3.0FC - select a Core+Forward Compatible profile with GL version 3.0
<li> 3.1 - select a Core profile with GL version 3.1
<li> 3.1FC - select a Core+Forward Compatible profile with GL version 3.1
</ul>
<li> Mesa may not really implement all the features of the given version.
(for developers only)
</ul>
<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) for OpenGL ES.

@@ -20,7 +20,7 @@
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR which is translated to x86, x86-64, or ppc64le machine
implemented with LLVM IR which is translated to x86 or x86-64 machine
code.
Also, the driver is multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
@@ -32,36 +32,24 @@ It's the fastest software rasterizer for Mesa.
<ul>
<li>
<p>An x86 or amd64 processor; 64-bit mode recommended.</p>
<p>
For x86 or amd64 processors, 64-bit mode is recommended.
Support for SSE2 is strongly encouraged. Support for SSE3 and SSE4.1 will
yield the most efficient code. The fewer features the CPU has the more
likely it is that you will run into underperforming, buggy, or incomplete code.
</p>
<p>
For ppc64le processors, use of the Altivec feature (the Vector
Facility) is recommended if supported; use of the VSX feature (the
Vector-Scalar Facility) is recommended if supported AND Mesa is
built with LLVM version 4.0 or later.
likely is that you run into underperforming, buggy, or incomplete code.
</p>
<p>
See /proc/cpuinfo to know what your CPU supports.
</p>
</li>
<li>
<p>Unless otherwise stated, LLVM version 3.4 is recommended; 3.3 or later is required.</p>
<p>LLVM: version 3.4 recommended; 3.3 or later required.</p>
<p>
For Linux, on a recent Debian based distribution do:
</p>
<pre>
aptitude install llvm-dev
</pre>
<p>
If you want development snapshot builds of LLVM for Debian and derived
distributions like Ubuntu, you can use the APT repository at <a
href="https://apt.llvm.org/" title="Debian Development packages for LLVM"
>apt.llvm.org</a>, which are maintained by Debian's LLVM maintainer.
</p>
<p>
For a RPM-based distribution do:
</p>
@@ -240,8 +228,8 @@ build/linux-???-debug/gallium/drivers/llvmpipe:
</ul>
<p>
Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
Some of this tests can output results and benchmarks to a tab-separated-file
for posterior analysis, e.g.:
</p>
<pre>
build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
@@ -252,8 +240,8 @@ for later analysis, e.g.:
<ul>
<li>
When looking at this code for the first time, start in lp_state_fs.c, and
then skim through the lp_bld_* functions called there, and the comments
When looking to this code by the first time start in lp_state_fs.c, and
then skim through the lp_bld_* functions called in there, and the comments
at the top of the lp_bld_*.c functions.
</li>
<li>

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.0 Release Notes / December 8. 2017</h1>
<h1>Mesa 17.3.0 Release Notes / TBD</h1>
<p>
Mesa 17.3.0 is a new development release.
@@ -33,8 +33,7 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
0cb1ffe2b4637d80f08df3bdfeb300352dcffd8ff4f6711278639b084e3f07f9 mesa-17.3.0.tar.gz
29a0a3a6c39990d491a1a58ed5c692e596b3bfc6c01d0b45e0b787116c50c6d9 mesa-17.3.0.tar.xz
TBD.
</pre>
@@ -59,187 +58,14 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100613">Bug 100613</a> - Regression in Mesa 17 on s390x (zSystems)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101334">Bug 101334</a> - AMD SI cards: Some vulkan apps freeze the system</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101655">Bug 101655</a> - Explicit sync support for android</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101709">Bug 101709</a> - [llvmpipe] piglit gl-1.0-scissor-offscreen regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101766">Bug 101766</a> - Assertion `!&quot;invalid type&quot;' failed when constant expression involves literal of different type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101832">Bug 101832</a> - [PATCH][regression][bisect] Xorg fails to start after f50aa21456d82c8cb6fbaa565835f1acc1720a5d</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101851">Bug 101851</a> - [regression] libEGL_common.a undefined reference to '__gxx_personality_v0'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101867">Bug 101867</a> - Launch options window renders black in Feral Games in current Mesa trunk</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101876">Bug 101876</a> - SIGSEGV when launching Steam</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101910">Bug 101910</a> - [BYT] ES31-CTS.functional.copy_image.non_compressed.viewclass_96_bits.rgb32f_rgb32f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101925">Bug 101925</a> - playstore/webview crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101941">Bug 101941</a> - Getting different output depending on attribute declaration order</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101961">Bug 101961</a> - Serious Sam Fusion hangs system completely</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101981">Bug 101981</a> - Commit ddc32537d6db69198e88ef0dfe19770bf9daa536 breaks rendering in multiple applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101982">Bug 101982</a> - Weston crashes when running an OpenGL program on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101983">Bug 101983</a> - [G33] ES2-CTS.functional.shaders.struct.uniform.sampler_nested* regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101989">Bug 101989</a> - ES3-CTS.functional.state_query.integers.viewport_getinteger regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102006">Bug 102006</a> - gstreamer vaapih264enc segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102014">Bug 102014</a> - Mesa git build broken by commit bc7f41e11d325280db12e7b9444501357bc13922</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102015">Bug 102015</a> - [Regression,bisected]: Segfaults with various programs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102024">Bug 102024</a> - FORMAT_FEATURE_SAMPLED_IMAGE_BIT not supported for D16_UNORM and D32_SFLOAT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102038">Bug 102038</a> - assertion failure in update_framebuffer_size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102050">Bug 102050</a> - commit b4f639d02a causes build breakage on Android 32bit builds</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102052">Bug 102052</a> - No package 'expat' found</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102062">Bug 102062</a> - Segfault at eglCreateContext in android-x86</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102125">Bug 102125</a> - [softpipe] piglit arb_texture_view-targets regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102148">Bug 102148</a> - Crash when running qopenglwidget example on mesa llvmpipe win32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102201">Bug 102201</a> - [regression, SI] GPU crash in Unigine Valley</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102241">Bug 102241</a> - gallium/wgl: SwapBuffers freezing regularly with swap interval enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102274">Bug 102274</a> - assertion failure in ir_validate.cpp:240</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102308">Bug 102308</a> - segfault in glCompressedTextureSubImage3D</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102377">Bug 102377</a> - PIPE_*_4BYTE_ALIGNED_ONLY caps crashing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102429">Bug 102429</a> - [regression, SI] Performance decrease in Unigine Valley &amp; Heaven</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102454">Bug 102454</a> - glibc 2.26 doesn't provide anymore xlocale.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102461">Bug 102461</a> - [llvmpipe] piglit glean fragprog1 XPD test 1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102467">Bug 102467</a> - src/mesa/state_tracker/st_cb_readpixels.c:178]: (warning) Redundant assignment</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102496">Bug 102496</a> - Frontbuffer rendering corruption on mesa master</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102502">Bug 102502</a> - [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102530">Bug 102530</a> - [bisected] Kodi crashes when launching a stream - commit bd2662bf</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102552">Bug 102552</a> - Null dereference due to not checking return value of util_format_description</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102565">Bug 102565</a> - u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102573">Bug 102573</a> - fails to build on armel</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: &gt;&gt; should be &gt; &gt; within a nested template argument list</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102685">Bug 102685</a> - piglit.spec.glsl-1_50.compiler.vs-redeclares-pervertex-out-before-global-redeclaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102844">Bug 102844</a> - memory leak with glDeleteProgram for shader program type GL_COMPUTE_SHADER</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102847">Bug 102847</a> - swr fail to build with llvm-5.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102852">Bug 102852</a> - Scons: Support the new Scons 3.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102904">Bug 102904</a> - piglit and gl45 cts linker tests regressed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102924">Bug 102924</a> - mesa (git version) images too dark</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102940">Bug 102940</a> - Regression: Vulkan KMS rendering crashes since 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102999">Bug 102999</a> - [BISECTED,REGRESSION] Failing Android EGL dEQP with RGBA configs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103002">Bug 103002</a> - string_buffer_test.cpp:43: error: ISO C++ forbids initialization of member str1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103214">Bug 103214</a> - GLES CTS functional.state_query.indexed.atomic_counter regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103247">Bug 103247</a> - Performance regression: car chase, manhattan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103253">Bug 103253</a> - blob.h:138:1: error: unknown type name 'ssize_t'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103265">Bug 103265</a> - [llvmpipe] piglit depth-tex-compare regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103323">Bug 103323</a> - Possible unintended error message in file pixel.c line 286</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of memfd_create follows non-static declaration</li>
TBD
</ul>
<h2>Changes</h2>
<ul>
TBD
</ul>
</div>
</body>

@@ -1,191 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.1 Release Notes / December 21, 2017</h1>
<p>
Mesa 17.3.1 is a bug fix release which fixes bugs found since the 17.3.0 release.
</p>
<p>
Mesa 17.3.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
b0bb0419dbe3043ed4682a28eaf95721f427ca3f23a3c2a7dc77dbe8a3b6384d mesa-17.3.1.tar.gz
9ae607e0998a586fb2c866cfc8e45e6f52d1c56cb1b41288253ea83eada824c1 mesa-17.3.1.tar.xz
</pre>
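<p>
A minimal sketch of checking a downloaded tarball against one of the checksums
listed above, assuming the file is already on disk. The helper names
<code>sha256_of</code> and <code>matches_published</code> are hypothetical and
are not part of Mesa.
</p>

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's digest with a published checksum, ignoring case and padding."""
    return sha256_of(path) == published_hex.strip().lower()
```

<p>
For example, <code>matches_published("mesa-17.3.1.tar.xz", checksum)</code>,
where <code>checksum</code> is the hex string copied from the list above,
should return <code>True</code> for an intact download. Reading in chunks keeps
memory use flat for large archives.
</p>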
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Add LLVM version to the device name string</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>spirv: Fix loading an entire block at once.</li>
<li>radv: Don't advertise VK_EXT_debug_report.</li>
<li>radv: Fix multi-layer blits.</li>
</ul>
<p>Ben Crocker (1):</p>
<ul>
<li>docs/llvmpipe: document ppc64le as alternative architecture to x86.</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>xlib: call _mesa_warning() instead of fprintf()</li>
<li>gallium/aux: include nr_samples in util_resource_size() computation</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: port merge tess info from anv</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.0</li>
<li>util: scons: wire up the sha1 test</li>
<li>cherry-ignore: meson: fix strtof locale support check</li>
<li>cherry-ignore: util: add mesa-sha1 test to meson</li>
<li>Update version to 17.3.1</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>compiler: use NDEBUG to guard asserts</li>
</ul>
<p>Fabian Bieler (2):</p>
<ul>
<li>glsl: Match order of gl_LightSourceParameters elements.</li>
<li>glsl: Fix gl_NormalScale.</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/sb: do not convert if-blocks that contain indirect array access</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>nir/opcodes: Fix constant-folding of bitfield_insert</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Switch over to fully external-or-not MOCS scheme</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>travis: disable Meson build</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>meta: Initialize depth/clear values on declaration.</li>
<li>meta: Fix ClearTexture with GL_DEPTH_COMPONENT.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move destroy command before feedback command</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>radeonsi: flush the context after resource_copy_region for buffer exports</li>
<li>radeonsi: allow DMABUF exports for local buffers</li>
<li>winsys/amdgpu: disable local BOs again due to worse performance</li>
<li>radeonsi: don't call force_dcc_off for buffers</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util: Assume little endian in the absence of platform-specific handling</li>
<li>util: Add a SHA1 unit test program</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>nvc0/ir: Properly lower 64-bit shifts when the shift value is &gt;32</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>glsl: get correct member type when processing xfb ifc arrays</li>
</ul>
<p>Vadym Shovkoplias (2):</p>
<ul>
<li>glx/dri3: Remove unused deviceName variable</li>
<li>util/disk_cache: Remove unneeded free() on always null string</li>
</ul>
</div>
</body>
</html>

@@ -1,109 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.2 Release Notes / January 8, 2018</h1>
<p>
Mesa 17.3.2 is a bug fix release which fixes bugs found since the 17.3.1 release.
</p>
<p>
Mesa 17.3.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f997e80f14c385f9a2ba827c2b74aebf1b7426712ca4a81c631ef9f78e437bf4 mesa-17.3.2.tar.gz
e2844a13f2d6f8f24bee65804a51c42d8dc6ae9c36cff7ee61d0940e796d64c6 mesa-17.3.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Fix DCC compatible formats.</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>egl: link libEGL against the dynamic version of libglapi</li>
</ul>
<p>Dave Airlie (6):</p>
<ul>
<li>radv/gfx9: add support for 3d images to blit 2d paths</li>
<li>radv: handle depth/stencil image copy with layouts better. (v3.1)</li>
<li>radv/meta: fix blit paths for depth/stencil (v2.1)</li>
<li>radv: fix issue with multisample positions and interp_var_at_sample.</li>
<li>radv/gfx9: add 3d sampler image-&gt;buffer copy shader. (v3)</li>
<li>radv: don't do format replacement on tc compat htile surfaces.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.1</li>
<li>Update version to 17.3.2</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE</li>
</ul>
<p>Rob Herring (1):</p>
<ul>
<li>egl/android: Fix build break with dri2_initialize_android _EGLDisplay parameter</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv/gfx9: fix primitive topology when adjacency is used</li>
<li>radv: use a faster version for nir_op_pack_half_2x16</li>
</ul>
<p>Tapani Pälli (2):</p>
<ul>
<li>mesa: add AllowGLSLCrossStageInterpolationMismatch workaround</li>
<li>drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games</li>
</ul>
</div>
</body>
</html>

@@ -1,151 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.3 Release Notes / January 18, 2018</h1>
<p>
Mesa 17.3.3 is a bug fix release which fixes bugs found since the 17.3.2 release.
</p>
<p>
Mesa 17.3.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c733d37a161501cd81dc9b309ccb613753b98eafc6d35e0847548a6642749772 mesa-17.3.3.tar.gz
41bac5de0ef6adc1f41a1ec0f80c19e361298ce02fa81b5f9ba4fdca33a9379b mesa-17.3.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (3):</p>
<ul>
<li>anv: Add missing unlock in anv_scratch_pool_alloc</li>
<li>anv: Take write mask into account in has_color_buffer_write_enabled</li>
<li>anv: Make sure state on primary is correct after CmdExecuteCommands</li>
</ul>
<p>Andres Gomez (1):</p>
<ul>
<li>anv: Import mako templates only during execution of anv_extensions</li>
</ul>
<p>Bas Nieuwenhuizen (11):</p>
<ul>
<li>radv: Invert condition for all samples identical during resolve.</li>
<li>radv: Flush caches before subpass resolve.</li>
<li>radv: Fix fragment resolve destination offset.</li>
<li>radv: Use correct framebuffer size for partial FS resolves.</li>
<li>radv: Always use fragment resolve if dest uses DCC.</li>
<li>Revert "radv/gfx9: fix block compression texture views."</li>
<li>radv: Use correct HTILE expanded words.</li>
<li>radv: Allow writing 0 scissors.</li>
<li>ac/nir: Handle loading data from compact arrays.</li>
<li>radv: Invalidate L1 for VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT.</li>
<li>ac/nir: Sanitize location_frac for local variables.</li>
</ul>
<p>Dave Airlie (8):</p>
<ul>
<li>radv: fix events on compute queues.</li>
<li>radv: fix pipeline statistics end query on compute queue</li>
<li>radv/gfx9: fix 3d image to image transfers on compute queues.</li>
<li>radv/gfx9: fix 3d image clears on compute queues</li>
<li>radv/gfx9: fix buffer to image for 3d images on compute queues</li>
<li>radv/gfx9: fix block compression texture views.</li>
<li>radv/gfx9: use a bigger hammer to flush cb/db caches.</li>
<li>radv/gfx9: use correct swizzle parameter to work out border swizzle.</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.2</li>
</ul>
<p>Florian Will (1):</p>
<ul>
<li>glsl: Respect std430 layout in lower_buffer_access</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>cherry-ignore: intel/fs: Use the original destination region for int MUL lowering</li>
<li>cherry-ignore: i965/fs: Use UW types when using V immediates</li>
<li>cherry-ignore: main: Clear shader program data whenever ProgramBinary is called</li>
<li>cherry-ignore: egl: pass the dri2_dpy to the $plat_teardown functions</li>
<li>cherry-ignore: vulkan/wsi: free cmd pools</li>
<li>Update version to 17.3.3</li>
</ul>
<p>Józef Kucia (1):</p>
<ul>
<li>radeonsi: fix alpha-to-coverage if color writes are disabled</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Require space for MI_BATCHBUFFER_END.</li>
<li>i965: Torch public intel_batchbuffer_emit_dword/float helpers.</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: disable in-place resolve for non-supertiled surfaces</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: VkDescriptorSetLayoutBinding can have descriptorCount == 0</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>loader/dri3: Avoid freeing renderbuffers in use</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: fix invalid sign masks in avx512 simdlib code</li>
</ul>
</div>
</body>
</html>

@@ -1,275 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.4 Release Notes / February 15, 2018</h1>
<p>
Mesa 17.3.4 is a bug fix release which fixes bugs found since the 17.3.3 release.
</p>
<p>
Mesa 17.3.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
2d3a4c3cbc995b3e192361dce710d8c749e046e7575aa1b7d8fc9e6b4df28f84 mesa-17.3.4.tar.gz
71f995e233bc5df1a0dd46c980d1720106e7f82f02d61c1ca50854b5e02590d0 mesa-17.3.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104745">Bug 104745</a> - HEVC VDPAU decoding broken on RX 460 with UVD Firmware v1.130</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104818">Bug 104818</a> - mesa fails to build on ia64</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen&lt;8</li>
</ul>
<p>Bas Nieuwenhuizen (10):</p>
<ul>
<li>radv: Fix ordering issue in meta memory allocation failure path.</li>
<li>radv: Fix memory allocation failure path in compute resolve init.</li>
<li>radv: Fix freeing meta state if the device pipeline cache fails to allocate.</li>
<li>radv: Fix fragment resolve init memory allocation failure paths.</li>
<li>radv: Fix bufimage failure deallocation.</li>
<li>radv: Init variant entry with memset.</li>
<li>radv: Don't allow 3d or 1d depth/stencil textures.</li>
<li>ac/nir: Use instance_rate_inputs per attribute, not per variable.</li>
<li>ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.</li>
<li>ac/nir: Fix vector extraction if source vector has &gt;4 elements.</li>
</ul>
<p>Boyuan Zhang (2):</p>
<ul>
<li>radeon/vcn: add and manage render picture list</li>
<li>radeon/uvd: add and manage render picture list</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>configure.ac: add missing llvm dependencies to .pc files</li>
</ul>
<p>Dave Airlie (10):</p>
<ul>
<li>r600/sb: fix a bug emitting ar load from a constant.</li>
<li>ac/nir: account for view index in the user sgpr allocation.</li>
<li>radv: add fs_key meta format support to resolve passes.</li>
<li>radv: don't use hw resolve for integer image formats</li>
<li>radv: don't use hw resolves for r16g16 norm formats.</li>
<li>radv: move spi_baryc_cntl to pipeline</li>
<li>r600/sb: insert the else clause when we might depart from a loop</li>
<li>radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)</li>
<li>radv/gfx9: fix block compression texture views. (v2)</li>
<li>virgl: also remove dimension on indirect.</li>
</ul>
<p>Eleni Maria Stea (1):</p>
<ul>
<li>mesa: Fix function pointers initialization in status tracker</li>
</ul>
<p>Emil Velikov (18):</p>
<ul>
<li>cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext</li>
<li>cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error</li>
<li>cherry-ignore: anv: add explicit 18.0 only nominations</li>
<li>cherry-ignore: radv: fix sample_mask_in loading. (v3.1)</li>
<li>cherry-ignore: meson: multiple fixes</li>
<li>cherry-ignore: swr/rast: support llvm 3.9 type declarations</li>
<li>Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"</li>
<li>cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs</li>
<li>cherry-ignore: add gen10 fixes</li>
<li>cherry-ignore: add r600/amdgpu 18.0 nominations</li>
<li>cherry-ignore: add i965 shader cache fixes</li>
<li>cherry-ignore: nir: mark unused space in packed_tex_data</li>
<li>radv: Stop advertising VK_KHX_multiview</li>
<li>cherry-ignore: radv: Don't expose VK_KHX_multiview on android.</li>
<li>configure.ac: correct driglx-direct help text</li>
<li>cherry-ignore: add meson fix</li>
<li>cherry-ignore: add a few more meson fixes</li>
<li>Update version to 17.3.4</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>radeon: remove left over dead code</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/shader: Initialize max_driver_temp_used correctly for the first time</li>
</ul>
<p>Grazvydas Ignotas (2):</p>
<ul>
<li>st/va: release held locks in error paths</li>
<li>st/vdpau: release held lock in error path</li>
</ul>
<p>Igor Gnatenko (1):</p>
<ul>
<li>link mesautil with pthreads</li>
</ul>
<p>Indrajit Das (4):</p>
<ul>
<li>st/omx_bellagio: Update default intra matrix per MPEG2 spec</li>
<li>radeon/uvd: update quantiser matrices only when requested</li>
<li>radeon/vcn: update quantiser matrices only when requested</li>
<li>st/va: clear pointers for mpeg2 quantiser matrices</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer</li>
<li>i965: Add more precise cache tracking helpers</li>
<li>i965/blorp: Add more destination flushing</li>
<li>i965: Track the depth and render caches separately</li>
<li>i965: Track format and aux usage in the render cache</li>
<li>Re-enable regular fast-clears (CCS_D) on gen9+</li>
<li>i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage</li>
<li>i965/miptree: Add an explicit tiling parameter to create_for_bo</li>
<li>i965/miptree: Use the tiling from the modifier instead of the BO</li>
<li>i965/bufmgr: Add a create_from_prime_tiled function</li>
<li>i965: Set tiling on BOs imported with modifiers</li>
<li>i965/miptree: Take an aux_usage in prepare/finish_render</li>
<li>i965/miptree: Add an aux_disabled parameter to render_aux_usage</li>
<li>i965/surface_state: Drop brw_aux_surface_disabled</li>
<li>intel/fs: Use the original destination region for int MUL lowering</li>
<li>anv/pipeline: Don't look at blend state unless we have an attachment</li>
<li>anv/cmd_buffer: Re-emit the pipeline at every subpass</li>
<li>anv: Stop advertising VK_KHX_multiview</li>
<li>i965: Call prepare_external after implicit window-system MSAA resolves</li>
</ul>
<p>Jon Turney (3):</p>
<ul>
<li>configure: Default to gbm=no on osx</li>
<li>glx/apple: include util/debug.h for env_var_as_boolean prototype</li>
<li>glx/apple: locate dispatch table functions to wrap by name</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>svga: Prevent use after free.</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.3</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Bind null render targets for shadow sampling + color.</li>
<li>i965: Bump official kernel requirement to Linux v3.9.</li>
</ul>
<p>Lucas Stach (2):</p>
<ul>
<li>etnaviv: dirty TS state when framebuffer has changed</li>
<li>renderonly: fix dumb BO allocation for non 32bpp formats</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: don't ignore pitch for imported textures</li>
</ul>
<p>Matthew Nicholls (2):</p>
<ul>
<li>radv: restore previous stencil reference after depth-stencil clear</li>
<li>radv: remove predication on cache flushes</li>
</ul>
<p>Maxin B. John (1):</p>
<ul>
<li>anv_icd.py: improve reproducible builds</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>winsys/radeon: Compute is_displayable in surf_drm_to_winsys</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>r600: don't do stack workarounds for hemlock</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: create pipeline layout objects for all meta operations</li>
</ul>
<p>Samuel Thibault (1):</p>
<ul>
<li>glx: fix non-dri build</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix buffer overflow bug in 64bit SSBO loads</li>
<li>ac: fix visit_ssa_undef() for doubles</li>
</ul>
</div>
</body>
</html>

@@ -1,66 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.5 Release Notes / February 19, 2018</h1>
<p>
Mesa 17.3.5 is a bug fix release which fixes bugs found since the 17.3.4 release.
</p>
<p>
Mesa 17.3.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
bc1ee20366aae2affc37c89228f871f438136f70252005e9f842169bde976788 mesa-17.3.5.tar.gz
eb9228fc8aaa71e0205c1481c5b157752ebaec9b646b030d27478e25a6d7936a mesa-17.3.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.4</li>
<li>Update version to 17.3.5</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>ac/nir: Fix conflict resolution typo in handle_vs_input_decl</li>
</ul>
</div>
</body>
</html>


@@ -1,85 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.6 Release Notes / February 19, 2018</h1>
<p>
Mesa 17.3.6 is a bug fix release which fixes bugs found since the 17.3.5 release.
</p>
<p>
Mesa 17.3.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
d5e10ea3f0d11b06d2b0b235bba372a04278c39bc0e712090bda1f61842db188 mesa-17.3.6.tar.gz
e5915680d44ac9d05defdec529db7459ac9edd441c9845266eff2e2d3e57fbf8 mesa-17.3.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.5</li>
<li>Update version to 17.3.6</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>i965/draw: Do resolves properly for textures used by TXF</li>
<li>i965: Replace draw_aux_buffer_disabled with draw_aux_usage</li>
<li>i965/draw: Set NEW_AUX_STATE when draw aux changes</li>
<li>i965: Stop disabling aux during texture preparation</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Don't disable CCS for RT dependencies when dispatching compute.</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>i965: Don't try to disable render aux buffers for compute</li>
</ul>
</div>
</body>
</html>


@@ -1,312 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.7 Release Notes / March 21, 2018</h1>
<p>
Mesa 17.3.7 is a bug fix release which fixes bugs found since the 17.3.6 release.
</p>
<p>
Mesa 17.3.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f08de6d0ccb3dbca04b44790d85c3ff9e7b1cc4189d1b7c7167e5ba7d98736c0 mesa-17.3.7.tar.gz
0595904a8fba65a8fe853a84ad3c940205503b94af41e8ceed245fada777ac1e mesa-17.3.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Fix CmdCopyImage between uncompressed and compressed images</li>
</ul>
<p>Andriy Khulap (1):</p>
<ul>
<li>i965: Fix RELOC_WRITE typo in brw_store_data_imm64()</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>isl: Don't use surface format R32_FLOAT for typed atomic integer operations</li>
</ul>
<p>Bas Nieuwenhuizen (6):</p>
<ul>
<li>radv: Always lower indirect derefs after nir_lower_global_vars_to_local.</li>
<li>radeonsi: Export signalled sync file instead of -1.</li>
<li>radv: Implement WaitForFences with !waitAll.</li>
<li>radv: Implement waiting on non-submitted fences.</li>
<li>radv: Fix copying from 3D images starting at non-zero depth.</li>
<li>radv: Increase the number of dynamic uniform buffers.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>glx: Properly handle cases where screen creation fails</li>
</ul>
<p>Daniel Stone (3):</p>
<ul>
<li>i965: Fix bugs in intel_from_planar</li>
<li>egl/wayland: Fix ARGB/XRGB transposition in config map</li>
<li>egl/wayland: Always use in-tree wayland-egl-backend.h</li>
</ul>
<p>Dave Airlie (9):</p>
<ul>
<li>r600: fix cubemap arrays</li>
<li>r600/sb/cayman: fix indirect ubo access on cayman</li>
<li>r600: fix xfb stream check.</li>
<li>ac/nir: to integer the args to bcsel.</li>
<li>r600/cayman: fix fragcood loading recip generation.</li>
<li>radv: don't support tc-compat on multisample d32s8 at all.</li>
<li>virgl: remap query types to hw support.</li>
<li>ac/nir: don't apply slice rounding on txf_ms</li>
<li>r600: implement callstack workaround for evergreen.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>glapi/check_table: Remove 'extern "C"' block</li>
<li>glapi: remove APPLE extensions from test</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.6</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.</li>
<li>ac/nir: Fix compiler warning about uninitialized dw_addr.</li>
<li>glsl/tests: Fix strict aliasing warning about int64/double.</li>
<li>glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965: Fix KHR_blend_equation_advanced with some render targets.</li>
</ul>
<p>Frank Binns (1):</p>
<ul>
<li>egl/dri2: fix segfault when display initialisation fails</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr/rast: blend_epi32() should return Integer, not Float</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
</ul>
<p>Gurchetan Singh (1):</p>
<ul>
<li>mesa: don't clamp just based on ARB_viewport_array extension</li>
</ul>
<p>Iago Toral Quiroga (2):</p>
<ul>
<li>i965/sbe: fix number of inputs for active components</li>
<li>i965/vec4: use a temp register to compute offsets for pull loads</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radv: Really use correct HTILE expanded words.</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/isl: Add an isl_color_value_is_zero helper</li>
<li>vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails</li>
<li>intel/fs: Set up sampler message headers in the visitor on gen7+</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>configure.ac: pthread-stubs not present on OpenBSD</li>
</ul>
<p>Jordan Justen (3):</p>
<ul>
<li>i965: Create new program cache bo when clearing the program cache</li>
<li>program: Don't reset SamplersValidated when restoring from shader cache</li>
<li>intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (14):</p>
<ul>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>cherry-ignore: r600/compute: only mark buffer/image state dirty for fragment shaders</li>
<li>cherry-ignore: anv: Move setting current_pipeline to cmd_state_init</li>
<li>cherry-ignore: anv: Be more careful about fast-clear colors</li>
<li>cherry-ignore: Add patches that has a specific version for 17.3</li>
<li>cherry-ignore: r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
<li>cherry-ignore: intel/compiler: Memory fence commit must always be enabled for gen10+</li>
<li>cherry-ignore: i965: Avoid problems from referencing orphaned BOs after growing.</li>
<li>cherry-ignore: include all Meson related fixes</li>
<li>cherry-ignore: ac/shader: fix vertex input with components.</li>
<li>cherry-ignore: i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.</li>
<li>cherry-ignore: anv/image: Separate modifiers from legacy scanout</li>
<li>cherry-ignore: glsl: Fix memory leak with known glsl_type instances</li>
<li>Update version to 17.3.7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvir/nvc0: fix legalizing of ld unlock c0[0x10000]</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Emit CS stall before MEDIA_VFE_STATE.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: perf: ensure reading config IDs from sysfs isn't interrupted</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: align command buffer starting address to fix some Raven hangs</li>
<li>configure.ac: blacklist libdrm 2.4.90</li>
</ul>
<p>Michal Navratil (1):</p>
<ul>
<li>winsys/amdgpu: allow non page-aligned size bo creation from pointer</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>glsl/linker: fix bug when checking precision qualifier</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>ac/nir: use ordered float comparisons except for not equal</li>
<li>Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"</li>
</ul>
<p>Stephan Gerhold (1):</p>
<ul>
<li>util/build-id: Fix address comparison for binaries with LOAD vaddr &gt; 0</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>svga: Fix a leftover debug hack</li>
<li>loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: fix MemoryBuffer build break for llvm-6</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>nir: fix interger divide by zero crash during constant folding</li>
</ul>
<p>Tobias Droste (1):</p>
<ul>
<li>gallivm: Use new LLVM fast-math-flags API</li>
</ul>
<p>Vadym Shovkoplias (1):</p>
<ul>
<li>mesa: add glsl version query (v4)</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>swr/rast: Fix macOS macro.</li>
</ul>
</div>
</body>
</html>


@@ -1,147 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.8 Release Notes / April 03, 2018</h1>
<p>
Mesa 17.3.8 is a bug fix release which fixes bugs found since the 17.3.7 release.
</p>
<p>
Mesa 17.3.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
175d2ca9be2af3a8db6cd603986096d75da70f59699528d7b6675d542a305e23 mesa-17.3.8.tar.gz
8f9d9bf281c48e4a8f5228816577263b4c655248dc7666e75034ab422951a6b1 mesa-17.3.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>
<li>st/nine: Fixes warning about implicit conversion</li>
<li>st/nine: Fix non inversible matrix check</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv/pipeline: fail if TCS/TES compile fail</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: get correct offset into LDS for indexed vars.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Make swrast display_sync the correct queue</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>meson/configure: detect endian.h instead of trying to guess when it's available</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>mesa: Don't write to user buffer in glGetTexParameterIuiv on error</li>
<li>i965/vec4: Fix null destination register in 3-source instructions</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Emit texture cache invalidates around blorp_copy</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>i965: Calculate thread_count in brw_alloc_stage_scratch</li>
<li>i965: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.7</li>
<li>cherry-ignore: ac/nir: pass the nir variable through tcs loading.</li>
<li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>
<li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>
<li>cherry-ignore: docs: fix 18.0 release note version</li>
<li>Update version to 17.3.8</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move feedback command inside of destroy function</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>nir: fix per_vertex_output intrinsic</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>
<li>nir: fix crash in loop unroll corner case</li>
</ul>
</div>
</body>
</html>


@@ -1,162 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.9 Release Notes / April 18, 2018</h1>
<p>
Mesa 17.3.9 is a bug fix release which fixes bugs found since the 17.3.8 release.
</p>
<p>
Mesa 17.3.9 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
4d625f65a1ff4cd8cfeb39e38f047507c6dea047502a0d53113c96f54588f340 mesa-17.3.9.tar.gz
c5beb5fc05f0e0c294fefe1a393ee118cb67e27a4dca417d77c297f7d4b6e479 mesa-17.3.9.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>dri_util: when overriding, always reset the core version</li>
<li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>
</ul>
<p>Axel Davy (2):</p>
<ul>
<li>st/nine: Declare lighting consts for ff shaders</li>
<li>st/nine: Do not use scratch for face register</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>ac/nir: Add workaround for GFX9 buffer views.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Initialise modifier to INVALID for DRI2</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>glsl: remove unreachable assert()</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>gbm: remove never-implemented function</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>mesa: Inherit texture view multi-sample information from the original texture images.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/vars_to_ssa: Remove copies from the correct set</li>
<li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>
<li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>
<li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.8</li>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>Update version to 17.3.9</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix number of planes for depth &amp; stencil</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix picking the method for resolve subpass</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: Extend the negative 32-bit deltas to 64-bits</li>
</ul>
<p>Timothy Arceri (6):</p>
<ul>
<li>gallium/pipebuffer: fix parenthesis location</li>
<li>glsl: always call do_lower_jumps() after loop unrolling</li>
<li>ac: add if/loop build helpers</li>
<li>radeonsi: make use of if/loop build helpers in ac</li>
<li>ac: make use of if/loop build helpers</li>
<li>mesa: free debug messages when destroying the debug state</li>
</ul>
<p>Xiong, James (1):</p>
<ul>
<li>i965: return the fourcc saved in __DRIimage when possible</li>
</ul>
</div>
</body>
</html>


@@ -48,6 +48,7 @@ typedef unsigned int drm_drawable_t;
typedef struct drm_clip_rect drm_clip_rect_t;
#endif
#include <stdbool.h>
#include <stdint.h>
/**
@@ -704,7 +705,8 @@ struct __DRIuseInvalidateExtensionRec {
#define __DRI_ATTRIB_BIND_TO_TEXTURE_TARGETS 46
#define __DRI_ATTRIB_YINVERTED 47
#define __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE 48
#define __DRI_ATTRIB_MAX (__DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE + 1)
#define __DRI_ATTRIB_MUTABLE_RENDER_BUFFER 49 /* EGL_MUTABLE_RENDER_BUFFER_BIT_KHR */
#define __DRI_ATTRIB_MAX 50
/* __DRI_ATTRIB_RENDER_TYPE */
#define __DRI_ATTRIB_RGBA_BIT 0x01
@@ -1810,7 +1812,48 @@ struct __DRI2rendererQueryExtensionRec {
enum __DRIimageBufferMask {
__DRI_IMAGE_BUFFER_BACK = (1 << 0),
__DRI_IMAGE_BUFFER_FRONT = (1 << 1)
__DRI_IMAGE_BUFFER_FRONT = (1 << 1),
/**
* A buffer shared between application and compositor. The buffer may be
* simultaneously accessed by each.
*
* A shared buffer is equivalent to an EGLSurface whose EGLConfig contains
* EGL_MUTABLE_RENDER_BUFFER_BIT_KHR and whose active EGL_RENDER_BUFFER (as
* opposed to any pending, requested change to EGL_RENDER_BUFFER) is
* EGL_SINGLE_BUFFER.
*
* If the loader returns __DRI_IMAGE_BUFFER_SHARED, then it is returned
* alone, without an accompanying back or front buffer.
*
* The loader returns __DRI_IMAGE_BUFFER_SHARED if and only if:
* - The loader supports __DRI_MUTABLE_RENDER_BUFFER_LOADER.
* - The driver supports __DRI_MUTABLE_RENDER_BUFFER_DRIVER.
* - The EGLConfig of the drawable EGLSurface contains
* EGL_MUTABLE_RENDER_BUFFER_BIT_KHR.
* - The EGLContext's EGL_RENDER_BUFFER is EGL_SINGLE_BUFFER.
* Equivalently, the EGLSurface's active EGL_RENDER_BUFFER (as
* opposed to any pending, requested change to EGL_RENDER_BUFFER) is
* EGL_SINGLE_BUFFER.
*
* A shared buffer is similar to a front buffer in that all rendering to the
* buffer should appear promptly on the screen. It is different from
* a front buffer in that its behavior is independent from the
* GL_DRAW_BUFFER state. Specifically, if GL_DRAW_FRAMEBUFFER is 0 and the
* __DRIdrawable's current buffer mask is __DRI_IMAGE_BUFFER_SHARED, then
* all rendering should appear promptly on the screen if GL_DRAW_BUFFER is
* not GL_NONE.
*
* The difference between a shared buffer and a front buffer is motivated
* by the constraints of Android and OpenGL ES. OpenGL ES does not support
* front-buffer rendering. Android's SurfaceFlinger protocol provides the
* EGL driver only a back buffer and no front buffer. The shared buffer
* mode introduced by EGL_KHR_mutable_render_buffer is a backdoor through
* EGL that allows Android OpenGL ES applications to render to what is
* effectively the front buffer, a backdoor that required no change to the
* OpenGL ES API and little change to the SurfaceFlinger API.
*/
__DRI_IMAGE_BUFFER_SHARED = (1 << 2),
};
struct __DRIimageList {
@@ -1949,4 +1992,83 @@ struct __DRIbackgroundCallableExtensionRec {
GLboolean (*isThreadSafe)(void *loaderPrivate);
};
/**
* The driver portion of EGL_KHR_mutable_render_buffer.
*
* If the driver creates a __DRIconfig with
* __DRI_ATTRIB_MUTABLE_RENDER_BUFFER, then it must support this extension.
*
* To support this extension:
*
* - The driver should create at least one __DRIconfig with
* __DRI_ATTRIB_MUTABLE_RENDER_BUFFER. This is strongly recommended but
* not required.
*
* - The driver must be able to handle __DRI_IMAGE_BUFFER_SHARED if
* returned by __DRIimageLoaderExtension::getBuffers().
*
* - When rendering to __DRI_IMAGE_BUFFER_SHARED, it must call
* __DRImutableRenderBufferLoaderExtension::displaySharedBuffer() on each
* application-initiated flush. This includes glFlush, glFinish,
* GL_SYNC_FLUSH_COMMANDS_BIT, EGL_SYNC_FLUSH_COMMANDS_BIT, and possibly
* more. (Android applications expect that glFlush will immediately
* display the buffer when in shared buffer mode because that is common
* behavior among Android drivers). It may call displaySharedBuffer()
* more often than required.
*
* - When rendering to __DRI_IMAGE_BUFFER_SHARED, it must ensure that the
* buffer is always in a format compatible for display because the
* display engine (usually SurfaceFlinger or hwcomposer) may display the
* image at any time, even concurrently with 3D rendering. For example,
* display hardware and the GL hardware may be able to access the buffer
* simultaneously. In particular, if the buffer is compressed, then take
* care that SurfaceFlinger and hwcomposer can consume the compression
* format.
*
* \see __DRI_IMAGE_BUFFER_SHARED
* \see __DRI_ATTRIB_MUTABLE_RENDER_BUFFER
* \see __DRI_MUTABLE_RENDER_BUFFER_LOADER
*/
#define __DRI_MUTABLE_RENDER_BUFFER_DRIVER "DRI_MutableRenderBufferDriver"
#define __DRI_MUTABLE_RENDER_BUFFER_DRIVER_VERSION 1
typedef struct __DRImutableRenderBufferDriverExtensionRec __DRImutableRenderBufferDriverExtension;
struct __DRImutableRenderBufferDriverExtensionRec {
__DRIextension base;
};
/**
* The loader portion of EGL_KHR_mutable_render_buffer.
*
* Requires loader extension DRI_IMAGE_LOADER, through which the loader sends
* __DRI_IMAGE_BUFFER_SHARED to the driver.
*
* \see __DRI_MUTABLE_RENDER_BUFFER_DRIVER
*/
#define __DRI_MUTABLE_RENDER_BUFFER_LOADER "DRI_MutableRenderBufferLoader"
#define __DRI_MUTABLE_RENDER_BUFFER_LOADER_VERSION 1
typedef struct __DRImutableRenderBufferLoaderExtensionRec __DRImutableRenderBufferLoaderExtension;
struct __DRImutableRenderBufferLoaderExtensionRec {
__DRIextension base;
/**
* Inform the display engine (usually SurfaceFlinger or hwcomposer)
* that the __DRIdrawable has new content. The display engine may ignore
* this, for example, if it continually refreshes and displays the buffer
* on every frame, as in EGL_ANDROID_front_buffer_auto_refresh. On the
* other extreme, the display engine may refresh and display the buffer
* only in frames in which the driver calls this.
*
* If the fence_fd is not -1, then the display engine will display the
* buffer only after the fence signals.
*
* The drawable's current __DRIimageBufferMask, as returned by
* __DRIimageLoaderExtension::getBuffers(), must contain
* __DRI_IMAGE_BUFFER_SHARED.
*/
void (*displaySharedBuffer)(__DRIdrawable *drawable, int fence_fd,
void *loaderPrivate);
};
#endif
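
The header comments above describe two rules: when the loader may hand out `__DRI_IMAGE_BUFFER_SHARED`, and what the driver must do on an application-initiated flush. The following is a minimal, self-contained sketch of those two rules, not the Mesa implementation. Only the three mask values are taken from the diff; `struct sketch_surface` and every `sketch_*` function are hypothetical stand-ins for the real loader and driver state, and falling back to a back buffer alone is an assumption based on the comment that SurfaceFlinger provides only a back buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Mask values copied from enum __DRIimageBufferMask in this diff. */
enum sketch_buffer_mask {
   SKETCH_BUFFER_BACK   = (1 << 0),
   SKETCH_BUFFER_FRONT  = (1 << 1),
   SKETCH_BUFFER_SHARED = (1 << 2),
};

/* Hypothetical per-surface state; none of these fields exist in Mesa. */
struct sketch_surface {
   bool loader_has_mrb;         /* loader exposes DRI_MutableRenderBufferLoader */
   bool driver_has_mrb;         /* driver exposes DRI_MutableRenderBufferDriver */
   bool config_has_mutable_bit; /* EGLConfig has EGL_MUTABLE_RENDER_BUFFER_BIT_KHR */
   bool active_single_buffer;   /* active EGL_RENDER_BUFFER is EGL_SINGLE_BUFFER */
   int display_calls;           /* how often the display engine was notified */
   int last_fence_fd;           /* fence passed with the last notification */
};

/* Loader side: return SHARED alone if and only if all four conditions from
 * the header comment hold. The back-buffer fallback is an assumption. */
unsigned sketch_get_buffer_mask(const struct sketch_surface *s)
{
   if (s->loader_has_mrb && s->driver_has_mrb &&
       s->config_has_mutable_bit && s->active_single_buffer)
      return SKETCH_BUFFER_SHARED;
   return SKETCH_BUFFER_BACK;
}

/* Loader side: stand-in for displaySharedBuffer(). The header requires that
 * the drawable's current mask contains the SHARED bit. */
void sketch_display_shared_buffer(struct sketch_surface *s, int fence_fd)
{
   assert(sketch_get_buffer_mask(s) & SKETCH_BUFFER_SHARED);
   s->display_calls++;
   s->last_fence_fd = fence_fd; /* -1 means there is no fence to wait on */
}

/* Driver side: on an application-initiated flush (glFlush, glFinish, ...),
 * notify the display engine only when rendering to a shared buffer. */
void sketch_flush(struct sketch_surface *s, int fence_fd)
{
   /* ... submitting the pending rendering would happen here ... */
   if (sketch_get_buffer_mask(s) & SKETCH_BUFFER_SHARED)
      sketch_display_shared_buffer(s, fence_fd);
}
```

With all four flags set, each `sketch_flush()` call increments `display_calls`; clearing `active_single_buffer` makes the sketch fall back to the back buffer and stop notifying the display engine.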


@@ -497,7 +497,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
pre_args += '-DMAJOR_IN_MKDEV'
endif
foreach h : ['xlocale.h', 'sys/sysctl.h', 'endian.h']
foreach h : ['xlocale.h', 'sys/sysctl.h']
if cc.has_header(h)
pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
endif
@@ -609,7 +609,7 @@ dep_libdrm_amdgpu = []
dep_libdrm_radeon = []
dep_libdrm_nouveau = []
if with_amd_vk or with_gallium_radeonsi
dep_libdrm_amdgpu = dependency('libdrm_amdgpu', version : '>= 2.4.89')
dep_libdrm_amdgpu = dependency('libdrm_amdgpu', version : '>= 2.4.85')
endif
if with_gallium_radeonsi # older radeon too
dep_libdrm_radeon = dependency('libdrm_radeon', version : '>= 2.4.71')

View File

@@ -352,9 +352,6 @@ def generate(env):
if check_header(env, 'xlocale.h'):
cppdefines += ['HAVE_XLOCALE_H']
if check_header(env, 'endian.h'):
cppdefines += ['HAVE_ENDIAN_H']
if check_functions(env, ['strtod_l', 'strtof_l']):
cppdefines += ['HAVE_STRTOD_L']

View File

@@ -98,9 +98,7 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
{
struct amdgpu_buffer_size_alignments alignment_info = {};
struct amdgpu_heap_info vram, vram_vis, gtt;
struct drm_amdgpu_info_hw_ip dma = {}, compute = {}, uvd = {};
struct drm_amdgpu_info_hw_ip uvd_enc = {}, vce = {}, vcn_dec = {};
struct drm_amdgpu_info_hw_ip vcn_enc = {}, gfx = {};
struct drm_amdgpu_info_hw_ip dma = {}, compute = {}, uvd = {}, vce = {}, vcn_dec = {};
uint32_t vce_version = 0, vce_feature = 0, uvd_version = 0, uvd_feature = 0;
int r, i, j;
drmDevicePtr devinfo;
@@ -156,12 +154,6 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
return false;
}
r = amdgpu_query_hw_ip_info(dev, AMDGPU_HW_IP_GFX, 0, &gfx);
if (r) {
fprintf(stderr, "amdgpu: amdgpu_query_hw_ip_info(gfx) failed.\n");
return false;
}
r = amdgpu_query_hw_ip_info(dev, AMDGPU_HW_IP_COMPUTE, 0, &compute);
if (r) {
fprintf(stderr, "amdgpu: amdgpu_query_hw_ip_info(compute) failed.\n");
@@ -277,6 +269,7 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
vce.available_rings ? vce_version : 0;
info->has_userptr = true;
info->has_syncobj = has_syncobj(fd);
info->has_syncobj_wait_for_submit = info->has_syncobj && info->drm_minor >= 20;
info->has_sync_file = info->has_syncobj && info->drm_minor >= 21;
info->has_ctx_priority = info->drm_minor >= 22;
info->num_render_backends = amdinfo->rb_pipes;
@@ -323,17 +316,6 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
if (info->chip_class == SI)
info->gfx_ib_pad_with_type2 = TRUE;
unsigned ib_align = 0;
ib_align = MAX2(ib_align, gfx.ib_start_alignment);
ib_align = MAX2(ib_align, compute.ib_start_alignment);
ib_align = MAX2(ib_align, dma.ib_start_alignment);
ib_align = MAX2(ib_align, uvd.ib_start_alignment);
ib_align = MAX2(ib_align, uvd_enc.ib_start_alignment);
ib_align = MAX2(ib_align, vce.ib_start_alignment);
ib_align = MAX2(ib_align, vcn_dec.ib_start_alignment);
ib_align = MAX2(ib_align, vcn_enc.ib_start_alignment);
info->ib_start_alignment = ib_align;
return true;
}

@@ -61,7 +61,6 @@ struct radeon_info {
bool has_virtual_memory;
bool gfx_ib_pad_with_type2;
bool has_hw_decode;
unsigned ib_start_alignment;
uint32_t num_sdma_rings;
uint32_t num_compute_rings;
uint32_t uvd_fw_version;
@@ -82,6 +81,7 @@ struct radeon_info {
uint32_t drm_patchlevel;
bool has_userptr;
bool has_syncobj;
bool has_syncobj_wait_for_submit;
bool has_sync_file;
bool has_ctx_priority;

@@ -41,16 +41,6 @@
#include "shader_enums.h"
#define AC_LLVM_INITIAL_CF_DEPTH 4
/* Data for if/else/endif and bgnloop/endloop control flow structures.
*/
struct ac_llvm_flow {
/* Loop exit or next part of if/else/endif. */
LLVMBasicBlockRef next_block;
LLVMBasicBlockRef loop_entry_block;
};
/* Initialize module-independent parts of the context.
*
* The caller is responsible for initializing ctx::module and ctx::builder.
@@ -102,14 +92,6 @@ ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context,
ctx->empty_md = LLVMMDNodeInContext(ctx->context, NULL, 0);
}
void
ac_llvm_context_dispose(struct ac_llvm_context *ctx)
{
free(ctx->flow);
ctx->flow = NULL;
ctx->flow_depth_max = 0;
}
unsigned
ac_get_type_size(LLVMTypeRef type)
{
@@ -978,26 +960,6 @@ LLVMValueRef ac_build_buffer_load_format(struct ac_llvm_context *ctx,
AC_FUNC_ATTR_READONLY);
}
LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context *ctx,
LLVMValueRef rsrc,
LLVMValueRef vindex,
LLVMValueRef voffset,
bool can_speculate)
{
LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->builder, rsrc, LLVMConstInt(ctx->i32, 2, 0), "");
LLVMValueRef stride = LLVMBuildExtractElement(ctx->builder, rsrc, LLVMConstInt(ctx->i32, 1, 0), "");
stride = LLVMBuildLShr(ctx->builder, stride, LLVMConstInt(ctx->i32, 16, 0), "");
LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->builder,
LLVMBuildICmp(ctx->builder, LLVMIntUGT, elem_count, stride, ""),
elem_count, stride, "");
LLVMValueRef new_rsrc = LLVMBuildInsertElement(ctx->builder, rsrc, new_elem_count,
LLVMConstInt(ctx->i32, 2, 0), "");
return ac_build_buffer_load_format(ctx, new_rsrc, vindex, voffset, can_speculate);
}
/**
* Set range metadata on an instruction. This can only be used on load and
* call instructions. If you know an instruction can only produce the values
@@ -1780,174 +1742,3 @@ void ac_init_exec_full_mask(struct ac_llvm_context *ctx)
"llvm.amdgcn.init.exec", ctx->voidt,
&full_mask, 1, AC_FUNC_ATTR_CONVERGENT);
}
static struct ac_llvm_flow *
get_current_flow(struct ac_llvm_context *ctx)
{
if (ctx->flow_depth > 0)
return &ctx->flow[ctx->flow_depth - 1];
return NULL;
}
static struct ac_llvm_flow *
get_innermost_loop(struct ac_llvm_context *ctx)
{
for (unsigned i = ctx->flow_depth; i > 0; --i) {
if (ctx->flow[i - 1].loop_entry_block)
return &ctx->flow[i - 1];
}
return NULL;
}
static struct ac_llvm_flow *
push_flow(struct ac_llvm_context *ctx)
{
struct ac_llvm_flow *flow;
if (ctx->flow_depth >= ctx->flow_depth_max) {
unsigned new_max = MAX2(ctx->flow_depth << 1,
AC_LLVM_INITIAL_CF_DEPTH);
ctx->flow = realloc(ctx->flow, new_max * sizeof(*ctx->flow));
ctx->flow_depth_max = new_max;
}
flow = &ctx->flow[ctx->flow_depth];
ctx->flow_depth++;
flow->next_block = NULL;
flow->loop_entry_block = NULL;
return flow;
}
static void set_basicblock_name(LLVMBasicBlockRef bb, const char *base,
int label_id)
{
char buf[32];
snprintf(buf, sizeof(buf), "%s%d", base, label_id);
LLVMSetValueName(LLVMBasicBlockAsValue(bb), buf);
}
/* Append a basic block at the level of the parent flow.
*/
static LLVMBasicBlockRef append_basic_block(struct ac_llvm_context *ctx,
const char *name)
{
assert(ctx->flow_depth >= 1);
if (ctx->flow_depth >= 2) {
struct ac_llvm_flow *flow = &ctx->flow[ctx->flow_depth - 2];
return LLVMInsertBasicBlockInContext(ctx->context,
flow->next_block, name);
}
LLVMValueRef main_fn =
LLVMGetBasicBlockParent(LLVMGetInsertBlock(ctx->builder));
return LLVMAppendBasicBlockInContext(ctx->context, main_fn, name);
}
/* Emit a branch to the given default target for the current block if
* applicable -- that is, if the current block does not already contain a
* branch from a break or continue.
*/
static void emit_default_branch(LLVMBuilderRef builder,
LLVMBasicBlockRef target)
{
if (!LLVMGetBasicBlockTerminator(LLVMGetInsertBlock(builder)))
LLVMBuildBr(builder, target);
}
void ac_build_bgnloop(struct ac_llvm_context *ctx, int label_id)
{
struct ac_llvm_flow *flow = push_flow(ctx);
flow->loop_entry_block = append_basic_block(ctx, "LOOP");
flow->next_block = append_basic_block(ctx, "ENDLOOP");
set_basicblock_name(flow->loop_entry_block, "loop", label_id);
LLVMBuildBr(ctx->builder, flow->loop_entry_block);
LLVMPositionBuilderAtEnd(ctx->builder, flow->loop_entry_block);
}
void ac_build_break(struct ac_llvm_context *ctx)
{
struct ac_llvm_flow *flow = get_innermost_loop(ctx);
LLVMBuildBr(ctx->builder, flow->next_block);
}
void ac_build_continue(struct ac_llvm_context *ctx)
{
struct ac_llvm_flow *flow = get_innermost_loop(ctx);
LLVMBuildBr(ctx->builder, flow->loop_entry_block);
}
void ac_build_else(struct ac_llvm_context *ctx, int label_id)
{
struct ac_llvm_flow *current_branch = get_current_flow(ctx);
LLVMBasicBlockRef endif_block;
assert(!current_branch->loop_entry_block);
endif_block = append_basic_block(ctx, "ENDIF");
emit_default_branch(ctx->builder, endif_block);
LLVMPositionBuilderAtEnd(ctx->builder, current_branch->next_block);
set_basicblock_name(current_branch->next_block, "else", label_id);
current_branch->next_block = endif_block;
}
void ac_build_endif(struct ac_llvm_context *ctx, int label_id)
{
struct ac_llvm_flow *current_branch = get_current_flow(ctx);
assert(!current_branch->loop_entry_block);
emit_default_branch(ctx->builder, current_branch->next_block);
LLVMPositionBuilderAtEnd(ctx->builder, current_branch->next_block);
set_basicblock_name(current_branch->next_block, "endif", label_id);
ctx->flow_depth--;
}
void ac_build_endloop(struct ac_llvm_context *ctx, int label_id)
{
struct ac_llvm_flow *current_loop = get_current_flow(ctx);
assert(current_loop->loop_entry_block);
emit_default_branch(ctx->builder, current_loop->loop_entry_block);
LLVMPositionBuilderAtEnd(ctx->builder, current_loop->next_block);
set_basicblock_name(current_loop->next_block, "endloop", label_id);
ctx->flow_depth--;
}
static void if_cond_emit(struct ac_llvm_context *ctx, LLVMValueRef cond,
int label_id)
{
struct ac_llvm_flow *flow = push_flow(ctx);
LLVMBasicBlockRef if_block;
if_block = append_basic_block(ctx, "IF");
flow->next_block = append_basic_block(ctx, "ELSE");
set_basicblock_name(if_block, "if", label_id);
LLVMBuildCondBr(ctx->builder, cond, if_block, flow->next_block);
LLVMPositionBuilderAtEnd(ctx->builder, if_block);
}
void ac_build_if(struct ac_llvm_context *ctx, LLVMValueRef value,
int label_id)
{
LLVMValueRef cond = LLVMBuildFCmp(ctx->builder, LLVMRealUNE,
value, ctx->f32_0, "");
if_cond_emit(ctx, cond, label_id);
}
void ac_build_uif(struct ac_llvm_context *ctx, LLVMValueRef value,
int label_id)
{
LLVMValueRef cond = LLVMBuildICmp(ctx->builder, LLVMIntNE,
ac_to_integer(ctx, value),
ctx->i32_0, "");
if_cond_emit(ctx, cond, label_id);
}
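The hunk above touches the `ac_llvm_flow` stack helpers (`push_flow`, `get_current_flow`, `get_innermost_loop`). As a rough sketch of that bookkeeping, with LLVM basic blocks replaced by plain integer ids so it builds without LLVM, the grow-by-doubling stack and the innermost-loop search could look like this. The names `flow_stack`, `flow_entry`, `flow_push`, `flow_pop` and `flow_innermost_loop` are hypothetical and are not part of Mesa.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define INITIAL_CF_DEPTH 4

/* Stand-in for struct ac_llvm_flow: block ids instead of LLVMBasicBlockRef.
 * loop_entry == 0 means "this entry is an if/else, not a loop". */
struct flow_entry {
    int next_block;
    int loop_entry;
};

struct flow_stack {
    struct flow_entry *flow;
    unsigned depth;
    unsigned depth_max;
};

/* Push a zeroed entry, doubling the capacity when the stack is full. */
static struct flow_entry *flow_push(struct flow_stack *s)
{
    if (s->depth >= s->depth_max) {
        unsigned new_max = s->depth * 2 > INITIAL_CF_DEPTH
                         ? s->depth * 2 : INITIAL_CF_DEPTH;
        struct flow_entry *grown = realloc(s->flow, new_max * sizeof(*grown));
        if (!grown)
            return NULL; /* the code in the diff does not check for failure */
        s->flow = grown;
        s->depth_max = new_max;
    }
    struct flow_entry *e = &s->flow[s->depth++];
    e->next_block = 0;
    e->loop_entry = 0;
    return e;
}

/* Drop the top entry, as the endif/endloop helpers do with flow_depth--. */
static void flow_pop(struct flow_stack *s)
{
    assert(s->depth > 0);
    s->depth--;
}

/* Walk from the top of the stack down to the nearest enclosing loop. */
static struct flow_entry *flow_innermost_loop(struct flow_stack *s)
{
    for (unsigned i = s->depth; i > 0; --i) {
        if (s->flow[i - 1].loop_entry)
            return &s->flow[i - 1];
    }
    return NULL;
}
```

A break or continue inside an `if` nested in a loop resolves through `flow_innermost_loop`, which skips the `if` entry on top of the stack and returns the loop entry beneath it.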

@@ -34,8 +34,6 @@
extern "C" {
#endif
struct ac_llvm_flow;
struct ac_llvm_context {
LLVMContextRef context;
LLVMModuleRef module;
@@ -59,10 +57,6 @@ struct ac_llvm_context {
LLVMValueRef f32_0;
LLVMValueRef f32_1;
struct ac_llvm_flow *flow;
unsigned flow_depth;
unsigned flow_depth_max;
unsigned range_md_kind;
unsigned invariant_load_md_kind;
unsigned uniform_md_kind;
@@ -77,9 +71,6 @@ void
ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context,
enum chip_class chip_class);
void
ac_llvm_context_dispose(struct ac_llvm_context *ctx);
unsigned ac_get_type_size(LLVMTypeRef type);
LLVMTypeRef ac_to_integer_type(struct ac_llvm_context *ctx, LLVMTypeRef t);
@@ -197,14 +188,6 @@ LLVMValueRef ac_build_buffer_load_format(struct ac_llvm_context *ctx,
LLVMValueRef voffset,
bool can_speculate);
/* load_format that handles the stride & element count better if idxen is
* disabled by LLVM. */
LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context *ctx,
LLVMValueRef rsrc,
LLVMValueRef vindex,
LLVMValueRef voffset,
bool can_speculate);
LLVMValueRef
ac_get_thread_id(struct ac_llvm_context *ctx);
@@ -299,18 +282,6 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ac,
uint32_t num_outputs,
uint8_t *num_param_exports);
void ac_init_exec_full_mask(struct ac_llvm_context *ctx);
void ac_build_bgnloop(struct ac_llvm_context *ctx, int lable_id);
void ac_build_break(struct ac_llvm_context *ctx);
void ac_build_continue(struct ac_llvm_context *ctx);
void ac_build_else(struct ac_llvm_context *ctx, int lable_id);
void ac_build_endif(struct ac_llvm_context *ctx, int lable_id);
void ac_build_endloop(struct ac_llvm_context *ctx, int lable_id);
void ac_build_if(struct ac_llvm_context *ctx, LLVMValueRef value,
int lable_id);
void ac_build_uif(struct ac_llvm_context *ctx, LLVMValueRef value,
int lable_id);
#ifdef __cplusplus
}
#endif

@@ -562,30 +562,7 @@ struct user_sgpr_info {
bool indirect_all_descriptor_sets;
};
static bool needs_view_index_sgpr(struct nir_to_llvm_context *ctx,
gl_shader_stage stage)
{
switch (stage) {
case MESA_SHADER_VERTEX:
if (ctx->shader_info->info.needs_multiview_view_index ||
(!ctx->options->key.vs.as_es && !ctx->options->key.vs.as_ls && ctx->options->key.has_multiview_view_index))
return true;
break;
case MESA_SHADER_TESS_EVAL:
if (ctx->shader_info->info.needs_multiview_view_index || (!ctx->options->key.tes.as_es && ctx->options->key.has_multiview_view_index))
return true;
case MESA_SHADER_GEOMETRY:
case MESA_SHADER_TESS_CTRL:
if (ctx->shader_info->info.needs_multiview_view_index)
return true;
default:
break;
}
return false;
}
static void allocate_user_sgprs(struct nir_to_llvm_context *ctx,
bool needs_view_index,
struct user_sgpr_info *user_sgpr_info)
{
memset(user_sgpr_info, 0, sizeof(struct user_sgpr_info));
@@ -639,9 +616,6 @@ static void allocate_user_sgprs(struct nir_to_llvm_context *ctx,
break;
}
if (needs_view_index)
user_sgpr_info->sgpr_count++;
if (ctx->shader_info->info.needs_push_constants)
user_sgpr_info->sgpr_count += 2;
@@ -771,8 +745,8 @@ static void create_function(struct nir_to_llvm_context *ctx,
struct user_sgpr_info user_sgpr_info;
struct arg_info args = {};
LLVMValueRef desc_sets;
bool needs_view_index = needs_view_index_sgpr(ctx, stage);
allocate_user_sgprs(ctx, needs_view_index, &user_sgpr_info);
allocate_user_sgprs(ctx, &user_sgpr_info);
if (user_sgpr_info.need_ring_offsets && !ctx->options->supports_spill) {
add_user_sgpr_argument(&args, const_array(ctx->v4i32, 16), &ctx->ring_offsets); /* address of rings */
@@ -790,7 +764,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
case MESA_SHADER_VERTEX:
radv_define_common_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &user_sgpr_info, &args, &desc_sets);
radv_define_vs_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &args);
if (needs_view_index)
if (ctx->shader_info->info.needs_multiview_view_index || (!ctx->options->key.vs.as_es && !ctx->options->key.vs.as_ls && ctx->options->key.has_multiview_view_index))
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
if (ctx->options->key.vs.as_es)
add_sgpr_argument(&args, ctx->i32, &ctx->es2gs_offset); // es2gs offset
@@ -822,7 +796,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_offsets); // tcs out offsets
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_layout); // tcs out layout
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_in_layout); // tcs in layout
if (needs_view_index)
if (ctx->shader_info->info.needs_multiview_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_vgpr_argument(&args, ctx->i32, &ctx->tcs_patch_id); // patch id
@@ -837,7 +811,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_offsets); // tcs out offsets
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_layout); // tcs out layout
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_in_layout); // tcs in layout
if (needs_view_index)
if (ctx->shader_info->info.needs_multiview_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_sgpr_argument(&args, ctx->i32, &ctx->oc_lds); // param oc lds
add_sgpr_argument(&args, ctx->i32, &ctx->tess_factor_offset); // tess factor offset
@@ -848,9 +822,8 @@ static void create_function(struct nir_to_llvm_context *ctx,
case MESA_SHADER_TESS_EVAL:
radv_define_common_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &user_sgpr_info, &args, &desc_sets);
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_offchip_layout); // tcs offchip layout
if (needs_view_index)
if (ctx->shader_info->info.needs_multiview_view_index || (!ctx->options->key.tes.as_es && ctx->options->key.has_multiview_view_index))
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
if (ctx->options->key.tes.as_es) {
add_sgpr_argument(&args, ctx->i32, &ctx->oc_lds); // OC LDS
add_sgpr_argument(&args, ctx->i32, NULL); //
@@ -882,7 +855,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
radv_define_vs_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &args);
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_ring_stride); // gsvs stride
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_num_entries); // gsvs num entires
if (needs_view_index)
if (ctx->shader_info->info.needs_multiview_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_vgpr_argument(&args, ctx->i32, &ctx->gs_vtx_offset[0]); // vtx01
@@ -907,7 +880,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
radv_define_vs_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &args);
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_ring_stride); // gsvs stride
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_num_entries); // gsvs num entires
if (needs_view_index)
if (ctx->shader_info->info.needs_multiview_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_sgpr_argument(&args, ctx->i32, &ctx->gs2vs_offset); // gs2vs offset
add_sgpr_argument(&args, ctx->i32, &ctx->gs_wave_id); // wave id
@@ -1286,8 +1259,7 @@ static LLVMValueRef emit_bcsel(struct ac_llvm_context *ctx,
{
LLVMValueRef v = LLVMBuildICmp(ctx->builder, LLVMIntNE, src0,
ctx->i32_0, "");
return LLVMBuildSelect(ctx->builder, v, ac_to_integer(ctx, src1),
ac_to_integer(ctx, src2), "");
return LLVMBuildSelect(ctx->builder, v, src1, src2, "");
}
static LLVMValueRef emit_find_lsb(struct ac_llvm_context *ctx,
@@ -1545,13 +1517,23 @@ static LLVMValueRef emit_bitfield_insert(struct ac_llvm_context *ctx,
static LLVMValueRef emit_pack_half_2x16(struct ac_llvm_context *ctx,
LLVMValueRef src0)
{
LLVMValueRef const16 = LLVMConstInt(ctx->i32, 16, false);
int i;
LLVMValueRef comp[2];
src0 = ac_to_float(ctx, src0);
comp[0] = LLVMBuildExtractElement(ctx->builder, src0, ctx->i32_0, "");
comp[1] = LLVMBuildExtractElement(ctx->builder, src0, ctx->i32_1, "");
for (i = 0; i < 2; i++) {
comp[i] = LLVMBuildFPTrunc(ctx->builder, comp[i], ctx->f16, "");
comp[i] = LLVMBuildBitCast(ctx->builder, comp[i], ctx->i16, "");
comp[i] = LLVMBuildZExt(ctx->builder, comp[i], ctx->i32, "");
}
return ac_build_cvt_pkrtz_f16(ctx, comp);
comp[1] = LLVMBuildShl(ctx->builder, comp[1], const16, "");
comp[0] = LLVMBuildOr(ctx->builder, comp[0], comp[1], "");
return comp[0];
}
static LLVMValueRef emit_unpack_half_2x16(struct ac_llvm_context *ctx,
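One of the two `emit_pack_half_2x16` variants in the hunk above builds the result by hand: each component is truncated to half precision, zero-extended to 32 bits, and the second component is shifted left by 16 before the two are OR-ed together. The final packing step can be sketched in plain C as below. The helper name `pack_2x16` is hypothetical, and the float-to-half conversion is assumed to have happened already, so the inputs are raw 16-bit patterns.

```c
#include <stdint.h>

/* Combine two 16-bit patterns into one 32-bit word:
 * component 0 lands in bits 0..15, component 1 in bits 16..31. */
static uint32_t pack_2x16(uint16_t comp0, uint16_t comp1)
{
    uint32_t lo = (uint32_t)comp0;        /* zero-extend, like LLVMBuildZExt */
    uint32_t hi = (uint32_t)comp1 << 16;  /* like LLVMBuildShl by 16 */
    return lo | hi;                       /* like LLVMBuildOr */
}
```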
@@ -1774,16 +1756,16 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
result = emit_int_cmp(&ctx->ac, LLVMIntUGE, src[0], src[1]);
break;
case nir_op_feq:
result = emit_float_cmp(&ctx->ac, LLVMRealOEQ, src[0], src[1]);
result = emit_float_cmp(&ctx->ac, LLVMRealUEQ, src[0], src[1]);
break;
case nir_op_fne:
result = emit_float_cmp(&ctx->ac, LLVMRealUNE, src[0], src[1]);
break;
case nir_op_flt:
result = emit_float_cmp(&ctx->ac, LLVMRealOLT, src[0], src[1]);
result = emit_float_cmp(&ctx->ac, LLVMRealULT, src[0], src[1]);
break;
case nir_op_fge:
result = emit_float_cmp(&ctx->ac, LLVMRealOGE, src[0], src[1]);
result = emit_float_cmp(&ctx->ac, LLVMRealUGE, src[0], src[1]);
break;
case nir_op_fabs:
result = emit_intrin_1f_param(&ctx->ac, "llvm.fabs",
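The `nir_op_feq`, `nir_op_flt` and `nir_op_fge` cases above differ only in whether an ordered (`LLVMRealO*`) or unordered (`LLVMRealU*`) predicate is used. The two families disagree only when an operand is NaN: ordered predicates yield false, unordered ones yield true. A small C sketch of those semantics follows; the `cmp_*` helper names are hypothetical and only model the predicates, they are not Mesa or LLVM functions.

```c
#include <math.h>
#include <stdbool.h>

/* Ordered predicates: false if either operand is NaN.
 * C's ==, < and >= operators already behave this way. */
static bool cmp_oeq(float a, float b) { return a == b; }
static bool cmp_olt(float a, float b) { return a < b; }
static bool cmp_oge(float a, float b) { return a >= b; }

/* Unordered predicates: true if either operand is NaN,
 * otherwise the same result as the ordered form. */
static bool cmp_ueq(float a, float b) { return isunordered(a, b) || a == b; }
static bool cmp_ult(float a, float b) { return isunordered(a, b) || a < b; }
static bool cmp_uge(float a, float b) { return isunordered(a, b) || a >= b; }

/* "Unordered or not equal" matches C's != operator. */
static bool cmp_une(float a, float b) { return a != b; }
```

For non-NaN inputs each ordered/unordered pair returns the same value, so the choice only changes shader behavior when a NaN reaches the comparison.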
@@ -2257,19 +2239,11 @@ static LLVMValueRef build_tex_intrinsic(struct ac_nir_context *ctx,
struct ac_image_args *args)
{
if (instr->sampler_dim == GLSL_SAMPLER_DIM_BUF) {
if (ctx->abi->gfx9_stride_size_workaround) {
return ac_build_buffer_load_format_gfx9_safe(&ctx->ac,
args->resource,
args->addr,
ctx->ac.i32_0,
true);
} else {
return ac_build_buffer_load_format(&ctx->ac,
args->resource,
args->addr,
ctx->ac.i32_0,
true);
}
return ac_build_buffer_load_format(&ctx->ac,
args->resource,
args->addr,
LLVMConstInt(ctx->ac.i32, 0, false),
true);
}
args->opcode = ac_image_sample;
@@ -2379,46 +2353,6 @@ static LLVMValueRef visit_get_buffer_size(struct ac_nir_context *ctx,
return get_buffer_size(ctx, desc, false);
}
static uint32_t widen_mask(uint32_t mask, unsigned multiplier)
{
uint32_t new_mask = 0;
for(unsigned i = 0; i < 32 && (1u << i) <= mask; ++i)
if (mask & (1u << i))
new_mask |= ((1u << multiplier) - 1u) << (i * multiplier);
return new_mask;
}
static LLVMValueRef extract_vector_range(struct ac_llvm_context *ctx, LLVMValueRef src,
unsigned start, unsigned count)
{
LLVMTypeRef type = LLVMTypeOf(src);
if (LLVMGetTypeKind(type) != LLVMVectorTypeKind) {
assert(start == 0);
assert(count == 1);
return src;
}
unsigned src_elements = LLVMGetVectorSize(type);
assert(start < src_elements);
assert(start + count <= src_elements);
if (start == 0 && count == src_elements)
return src;
if (count == 1)
return LLVMBuildExtractElement(ctx->builder, src, LLVMConstInt(ctx->i32, start, false), "");
assert(count <= 8);
LLVMValueRef indices[8];
for (unsigned i = 0; i < count; ++i)
indices[i] = LLVMConstInt(ctx->i32, start + i, false);
LLVMValueRef swizzle = LLVMConstVector(indices, count);
return LLVMBuildShuffleVector(ctx->builder, src, src, swizzle, "");
}
static void visit_store_ssbo(struct ac_nir_context *ctx,
nir_intrinsic_instr *instr)
{
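The `widen_mask` helper in the hunk above expands a per-component write mask so that each set bit covers `multiplier` consecutive 32-bit slots, which is what a 64-bit store needs with a multiplier of two. The sketch below copies that loop and places it next to the open-coded, factor-of-two form that appears in `visit_store_var` further down in this diff. The second function name, `widen_mask_x2`, is hypothetical, and the `assert` on `multiplier` is an added assumption of this sketch, not something the diff checks.

```c
#include <assert.h>
#include <stdint.h>

/* Every set bit i of `mask` becomes `multiplier` consecutive set bits
 * starting at bit i * multiplier. */
static uint32_t widen_mask(uint32_t mask, unsigned multiplier)
{
    /* Assumption for this sketch: keep the shift amount well defined. */
    assert(multiplier >= 1 && multiplier < 32);

    uint32_t new_mask = 0;
    for (unsigned i = 0; i < 32 && (1u << i) <= mask; ++i)
        if (mask & (1u << i))
            new_mask |= ((1u << multiplier) - 1u) << (i * multiplier);
    return new_mask;
}

/* Open-coded equivalent for a factor of two and at most four channels. */
static uint32_t widen_mask_x2(uint32_t old_writemask)
{
    uint32_t writemask = 0;
    for (unsigned chan = 0; chan < 4; chan++) {
        if (old_writemask & (1u << chan))
            writemask |= 3u << (2 * chan);
    }
    return writemask;
}
```

For example, a mask selecting components x and z (`0x5`) widens to `0x33` with a multiplier of two: bits 0..1 for x and bits 4..5 for z.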
@@ -2441,8 +2375,6 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
if (components_32bit > 1)
data_type = LLVMVectorType(ctx->ac.f32, components_32bit);
writemask = widen_mask(writemask, elem_size_mult);
base_data = ac_to_float(&ctx->ac, src_data);
base_data = trim_vector(&ctx->ac, base_data, instr->num_components);
base_data = LLVMBuildBitCast(ctx->ac.builder, base_data,
@@ -2452,7 +2384,7 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
int start, count;
LLVMValueRef data;
LLVMValueRef offset;
LLVMValueRef tmp;
u_bit_scan_consecutive_range(&writemask, &start, &count);
/* Due to an LLVM limitation, split 3-element writes
@@ -2462,6 +2394,9 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
count = 2;
}
start *= elem_size_mult;
count *= elem_size_mult;
if (count > 4) {
writemask |= ((1u << (count - 4)) - 1u) << (start + 4);
count = 4;
@@ -2469,14 +2404,30 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
if (count == 4) {
store_name = "llvm.amdgcn.buffer.store.v4f32";
data = base_data;
} else if (count == 2) {
LLVMTypeRef v2f32 = LLVMVectorType(ctx->ac.f32, 2);
tmp = LLVMBuildExtractElement(ctx->ac.builder,
base_data, LLVMConstInt(ctx->ac.i32, start, false), "");
data = LLVMBuildInsertElement(ctx->ac.builder, LLVMGetUndef(v2f32), tmp,
ctx->ac.i32_0, "");
tmp = LLVMBuildExtractElement(ctx->ac.builder,
base_data, LLVMConstInt(ctx->ac.i32, start + 1, false), "");
data = LLVMBuildInsertElement(ctx->ac.builder, data, tmp,
ctx->ac.i32_1, "");
store_name = "llvm.amdgcn.buffer.store.v2f32";
} else {
assert(count == 1);
if (get_llvm_num_components(base_data) > 1)
data = LLVMBuildExtractElement(ctx->ac.builder, base_data,
LLVMConstInt(ctx->ac.i32, start, false), "");
else
data = base_data;
store_name = "llvm.amdgcn.buffer.store.f32";
}
data = extract_vector_range(&ctx->ac, base_data, start, count);
offset = base_offset;
if (start != 0) {
@@ -2586,11 +2537,8 @@ static LLVMValueRef visit_load_buffer(struct ac_nir_context *ctx,
i1false,
};
int idx = i;
if (instr->dest.ssa.bit_size == 64)
idx = i > 1 ? 1 : 0;
results[i] = ac_build_intrinsic(&ctx->ac, load_name, data_type, params, 5, 0);
results[idx] = ac_build_intrinsic(&ctx->ac, load_name, data_type, params, 5, 0);
}
LLVMValueRef ret = results[0];
@@ -2857,7 +2805,7 @@ get_dw_address(struct nir_to_llvm_context *ctx,
LLVMConstInt(ctx->i32, 4, false), ""), "");
else if (const_index && !compact_const_index)
dw_addr = LLVMBuildAdd(ctx->builder, dw_addr,
LLVMConstInt(ctx->i32, const_index * 4, false), "");
LLVMConstInt(ctx->i32, const_index, false), "");
dw_addr = LLVMBuildAdd(ctx->builder, dw_addr,
LLVMConstInt(ctx->i32, param * 4, false), "");
@@ -3135,7 +3083,6 @@ static LLVMValueRef visit_load_var(struct ac_nir_context *ctx,
LLVMValueRef indir_index;
LLVMValueRef ret;
unsigned const_index;
unsigned stride = instr->variables[0]->var->data.compact ? 1 : 4;
bool vs_in = ctx->stage == MESA_SHADER_VERTEX &&
instr->variables[0]->var->data.mode == nir_var_shader_in;
get_deref_offset(ctx, instr->variables[0], vs_in, NULL, NULL,
@@ -3161,13 +3108,13 @@ static LLVMValueRef visit_load_var(struct ac_nir_context *ctx,
count -= chan / 4;
LLVMValueRef tmp_vec = ac_build_gather_values_extended(
&ctx->ac, ctx->abi->inputs + idx + chan, count,
stride, false, true);
4, false, true);
values[chan] = LLVMBuildExtractElement(ctx->ac.builder,
tmp_vec,
indir_index, "");
} else
values[chan] = ctx->abi->inputs[idx + chan + const_index * stride];
values[chan] = ctx->abi->inputs[idx + chan + const_index * 4];
}
break;
case nir_var_local:
@@ -3178,13 +3125,13 @@ static LLVMValueRef visit_load_var(struct ac_nir_context *ctx,
count -= chan / 4;
LLVMValueRef tmp_vec = ac_build_gather_values_extended(
&ctx->ac, ctx->locals + idx + chan, count,
stride, true, true);
4, true, true);
values[chan] = LLVMBuildExtractElement(ctx->ac.builder,
tmp_vec,
indir_index, "");
} else {
values[chan] = LLVMBuildLoad(ctx->ac.builder, ctx->locals[idx + chan + const_index * stride], "");
values[chan] = LLVMBuildLoad(ctx->ac.builder, ctx->locals[idx + chan + const_index * 4], "");
}
}
break;
@@ -3206,14 +3153,14 @@ static LLVMValueRef visit_load_var(struct ac_nir_context *ctx,
count -= chan / 4;
LLVMValueRef tmp_vec = ac_build_gather_values_extended(
&ctx->ac, ctx->outputs + idx + chan, count,
stride, true, true);
4, true, true);
values[chan] = LLVMBuildExtractElement(ctx->ac.builder,
tmp_vec,
indir_index, "");
} else {
values[chan] = LLVMBuildLoad(ctx->ac.builder,
ctx->outputs[idx + chan + const_index * stride],
ctx->outputs[idx + chan + const_index * 4],
"");
}
}
@@ -3239,12 +3186,17 @@ visit_store_var(struct ac_nir_context *ctx,
NULL, NULL, &const_index, &indir_index);
if (get_elem_bits(&ctx->ac, LLVMTypeOf(src)) == 64) {
int old_writemask = writemask;
src = LLVMBuildBitCast(ctx->ac.builder, src,
LLVMVectorType(ctx->ac.f32, get_llvm_num_components(src) * 2),
"");
writemask = widen_mask(writemask, 2);
writemask = 0;
for (unsigned chan = 0; chan < 4; chan++) {
if (old_writemask & (1 << chan))
writemask |= 3u << (2 * chan);
}
}
switch (instr->variables[0]->var->data.mode) {
@@ -3621,23 +3573,8 @@ static void visit_image_store(struct ac_nir_context *ctx,
glc = i1true;
if (glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_BUF) {
LLVMValueRef rsrc = get_sampler_desc(ctx, instr->variables[0], AC_DESC_BUFFER, true, true);
if (ctx->abi->gfx9_stride_size_workaround) {
LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 2, 0), "");
LLVMValueRef stride = LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 1, 0), "");
stride = LLVMBuildLShr(ctx->ac.builder, stride, LLVMConstInt(ctx->ac.i32, 16, 0), "");
LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->ac.builder,
LLVMBuildICmp(ctx->ac.builder, LLVMIntUGT, elem_count, stride, ""),
elem_count, stride, "");
rsrc = LLVMBuildInsertElement(ctx->ac.builder, rsrc, new_elem_count,
LLVMConstInt(ctx->ac.i32, 2, 0), "");
}
params[0] = ac_to_float(&ctx->ac, get_src(ctx, instr->src[2])); /* data */
params[1] = rsrc;
params[1] = get_sampler_desc(ctx, instr->variables[0], AC_DESC_BUFFER, true, true);
params[2] = LLVMBuildExtractElement(ctx->ac.builder, get_src(ctx, instr->src[0]),
ctx->ac.i32_0, ""); /* vindex */
params[3] = ctx->ac.i32_0; /* voffset */
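The `gfx9_stride_size_workaround` branch in the hunk above rewrites element 2 of the buffer descriptor to the larger of the original element count and the value taken from the upper 16 bits of element 1. Expressed on a plain array of four dwords, the same arithmetic might read as follows. The helper name `gfx9_widen_elem_count` is hypothetical, and the interpretation of the two dwords follows only the variable names (`stride`, `elem_count`) used in the diff.

```c
#include <stdint.h>

/* Mirror of the IR sequence in the diff:
 *   elem_count = rsrc[2]
 *   stride     = rsrc[1] >> 16   (no further masking, as in the diff)
 *   rsrc[2]    = elem_count > stride ? elem_count : stride
 */
static void gfx9_widen_elem_count(uint32_t rsrc[4])
{
    uint32_t elem_count = rsrc[2];
    uint32_t stride = rsrc[1] >> 16;

    rsrc[2] = elem_count > stride ? elem_count : stride;
}
```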
@@ -4798,7 +4735,7 @@ static void visit_tex(struct ac_nir_context *ctx, nir_tex_instr *instr)
/* This seems like a bit of a hack - but it passes Vulkan CTS with it */
if (instr->sampler_dim != GLSL_SAMPLER_DIM_3D &&
instr->sampler_dim != GLSL_SAMPLER_DIM_CUBE &&
instr->op != nir_texop_txf && instr->op != nir_texop_txf_ms) {
instr->op != nir_texop_txf) {
coords[2] = apply_round_slice(&ctx->ac, coords[2]);
}
address[count++] = coords[2];
@@ -4973,26 +4910,27 @@ static void visit_ssa_undef(struct ac_nir_context *ctx,
const nir_ssa_undef_instr *instr)
{
unsigned num_components = instr->def.num_components;
LLVMTypeRef type = LLVMIntTypeInContext(ctx->ac.context, instr->def.bit_size);
LLVMValueRef undef;
if (num_components == 1)
undef = LLVMGetUndef(type);
undef = LLVMGetUndef(ctx->ac.i32);
else {
undef = LLVMGetUndef(LLVMVectorType(type, num_components));
undef = LLVMGetUndef(LLVMVectorType(ctx->ac.i32, num_components));
}
_mesa_hash_table_insert(ctx->defs, &instr->def, undef);
}
static void visit_jump(struct ac_llvm_context *ctx,
static void visit_jump(struct ac_nir_context *ctx,
const nir_jump_instr *instr)
{
switch (instr->type) {
case nir_jump_break:
ac_build_break(ctx);
LLVMBuildBr(ctx->ac.builder, ctx->break_block);
LLVMClearInsertionPosition(ctx->ac.builder);
break;
case nir_jump_continue:
ac_build_continue(ctx);
LLVMBuildBr(ctx->ac.builder, ctx->continue_block);
LLVMClearInsertionPosition(ctx->ac.builder);
break;
default:
fprintf(stderr, "Unknown NIR jump instr: ");
@@ -5030,7 +4968,7 @@ static void visit_block(struct ac_nir_context *ctx, nir_block *block)
visit_ssa_undef(ctx, nir_instr_as_ssa_undef(instr));
break;
case nir_instr_type_jump:
visit_jump(&ctx->ac, nir_instr_as_jump(instr));
visit_jump(ctx, nir_instr_as_jump(instr));
break;
default:
fprintf(stderr, "Unknown NIR instr type: ");
@@ -5047,34 +4985,56 @@ static void visit_if(struct ac_nir_context *ctx, nir_if *if_stmt)
{
LLVMValueRef value = get_src(ctx, if_stmt->condition);
nir_block *then_block =
(nir_block *) exec_list_get_head(&if_stmt->then_list);
LLVMValueRef fn = LLVMGetBasicBlockParent(LLVMGetInsertBlock(ctx->ac.builder));
LLVMBasicBlockRef merge_block =
LLVMAppendBasicBlockInContext(ctx->ac.context, fn, "");
LLVMBasicBlockRef if_block =
LLVMAppendBasicBlockInContext(ctx->ac.context, fn, "");
LLVMBasicBlockRef else_block = merge_block;
if (!exec_list_is_empty(&if_stmt->else_list))
else_block = LLVMAppendBasicBlockInContext(
ctx->ac.context, fn, "");
ac_build_uif(&ctx->ac, value, then_block->index);
LLVMValueRef cond = LLVMBuildICmp(ctx->ac.builder, LLVMIntNE, value,
LLVMConstInt(ctx->ac.i32, 0, false), "");
LLVMBuildCondBr(ctx->ac.builder, cond, if_block, else_block);
LLVMPositionBuilderAtEnd(ctx->ac.builder, if_block);
visit_cf_list(ctx, &if_stmt->then_list);
if (LLVMGetInsertBlock(ctx->ac.builder))
LLVMBuildBr(ctx->ac.builder, merge_block);
if (!exec_list_is_empty(&if_stmt->else_list)) {
nir_block *else_block =
(nir_block *) exec_list_get_head(&if_stmt->else_list);
ac_build_else(&ctx->ac, else_block->index);
LLVMPositionBuilderAtEnd(ctx->ac.builder, else_block);
visit_cf_list(ctx, &if_stmt->else_list);
if (LLVMGetInsertBlock(ctx->ac.builder))
LLVMBuildBr(ctx->ac.builder, merge_block);
}
ac_build_endif(&ctx->ac, then_block->index);
LLVMPositionBuilderAtEnd(ctx->ac.builder, merge_block);
}
static void visit_loop(struct ac_nir_context *ctx, nir_loop *loop)
{
nir_block *first_loop_block =
(nir_block *) exec_list_get_head(&loop->body);
LLVMValueRef fn = LLVMGetBasicBlockParent(LLVMGetInsertBlock(ctx->ac.builder));
LLVMBasicBlockRef continue_parent = ctx->continue_block;
LLVMBasicBlockRef break_parent = ctx->break_block;
ac_build_bgnloop(&ctx->ac, first_loop_block->index);
ctx->continue_block =
LLVMAppendBasicBlockInContext(ctx->ac.context, fn, "");
ctx->break_block =
LLVMAppendBasicBlockInContext(ctx->ac.context, fn, "");
LLVMBuildBr(ctx->ac.builder, ctx->continue_block);
LLVMPositionBuilderAtEnd(ctx->ac.builder, ctx->continue_block);
visit_cf_list(ctx, &loop->body);
ac_build_endloop(&ctx->ac, first_loop_block->index);
if (LLVMGetInsertBlock(ctx->ac.builder))
LLVMBuildBr(ctx->ac.builder, ctx->continue_block);
LLVMPositionBuilderAtEnd(ctx->ac.builder, ctx->break_block);
ctx->continue_block = continue_parent;
ctx->break_block = break_parent;
}
static void visit_cf_list(struct ac_nir_context *ctx,
@@ -5116,16 +5076,16 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
variable->data.driver_location = idx * 4;
for (unsigned i = 0; i < attrib_count; ++i, ++idx) {
if (ctx->options->key.vs.instance_rate_inputs & (1u << (index + i))) {
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.instance_id,
ctx->abi.start_instance, "");
ctx->shader_info->vs.vgpr_comp_cnt =
MAX2(3, ctx->shader_info->vs.vgpr_comp_cnt);
} else
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.vertex_id,
ctx->abi.base_vertex, "");
if (ctx->options->key.vs.instance_rate_inputs & (1u << index)) {
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.instance_id,
ctx->abi.start_instance, "");
ctx->shader_info->vs.vgpr_comp_cnt = MAX2(3,
ctx->shader_info->vs.vgpr_comp_cnt);
} else
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.vertex_id,
ctx->abi.base_vertex, "");
for (unsigned i = 0; i < attrib_count; ++i, ++idx) {
t_offset = LLVMConstInt(ctx->i32, index + i, false);
t_list = ac_build_load_to_sgpr(&ctx->ac, t_list_ptr, t_offset);
@@ -5496,7 +5456,6 @@ setup_locals(struct ac_nir_context *ctx,
nir_foreach_variable(variable, &func->impl->locals) {
unsigned attrib_count = glsl_count_attribute_slots(variable->type, false);
variable->data.driver_location = ctx->num_locals * 4;
variable->data.location_frac = 0;
ctx->num_locals += attrib_count;
}
ctx->locals = malloc(4 * ctx->num_locals * sizeof(LLVMValueRef));
@@ -5938,7 +5897,7 @@ handle_es_outputs_post(struct nir_to_llvm_context *ctx,
}
for (unsigned i = 0; i < RADEON_LLVM_MAX_OUTPUTS; ++i) {
LLVMValueRef dw_addr = NULL;
LLVMValueRef dw_addr;
LLVMValueRef *out_ptr = &ctx->nir->outputs[i * 4];
int param_index;
int length = 4;
@@ -6428,8 +6387,6 @@ static void ac_llvm_finalize_module(struct nir_to_llvm_context * ctx)
LLVMDisposeBuilder(ctx->builder);
LLVMDisposePassManager(passmgr);
ac_llvm_context_dispose(&ctx->ac);
}
static void
@@ -6646,7 +6603,6 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
ctx.abi.load_ssbo = radv_load_ssbo;
ctx.abi.load_sampler_desc = radv_get_sampler_desc;
ctx.abi.clamp_shadow_reference = false;
ctx.abi.gfx9_stride_size_workaround = ctx.ac.chip_class == GFX9;
if (shader_count >= 2)
ac_init_exec_full_mask(&ctx.ac);

@@ -92,10 +92,6 @@ struct ac_shader_abi {
/* Whether to clamp the shadow reference value to [0,1]on VI. Radeonsi currently
* uses it due to promoting D16 to D32, but radv needs it off. */
bool clamp_shadow_reference;
/* Whether to workaround GFX9 ignoring the stride for the buffer size if IDXEN=0
* and LLVM optimizes an indexed load with constant index to IDXEN=0. */
bool gfx9_stride_size_workaround;
};
#endif /* AC_SHADER_ABI_H */

@@ -99,6 +99,13 @@ VULKAN_LIB_DEPS += \
$(WAYLAND_CLIENT_LIBS)
endif
if HAVE_PLATFORM_ANDROID
AM_CPPFLAGS += $(ANDROID_CPPFLAGS)
AM_CFLAGS += $(ANDROID_CFLAGS)
VULKAN_LIB_DEPS += $(ANDROID_LIBS)
VULKAN_SOURCES += $(VULKAN_ANDROID_FILES)
endif
noinst_LTLIBRARIES = libvulkan_common.la
libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
@@ -106,11 +113,14 @@ nodist_EXTRA_libvulkan_radeon_la_SOURCES = dummy.cpp
libvulkan_radeon_la_SOURCES = $(VULKAN_GEM_FILES)
vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
vk_android_native_buffer_xml = $(top_srcdir)/src/vulkan/registry/vk_android_native_buffer.xml
radv_entrypoints.c: radv_entrypoints_gen.py radv_extensions.py $(vulkan_api_xml)
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/radv_entrypoints_gen.py \
--xml $(vulkan_api_xml) --outdir $(builddir)
--xml $(vulkan_api_xml) \
--xml $(vk_android_native_buffer_xml) \
--outdir $(builddir)
radv_entrypoints.h: radv_entrypoints.c
radv_extensions.c: radv_extensions.py \
@@ -118,6 +128,7 @@ radv_extensions.c: radv_extensions.py \
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/radv_extensions.py \
--xml $(vulkan_api_xml) \
--xml $(vk_android_native_buffer_xml) \
--out $@
vk_format_table.c: vk_format_table.py \

@@ -69,6 +69,9 @@ VULKAN_FILES := \
vk_format.h \
$(RADV_WS_AMDGPU_FILES)
VULKAN_ANDROID_FILES := \
radv_android.c
VULKAN_WSI_WAYLAND_FILES := \
radv_wsi_wayland.c

@@ -29,10 +29,11 @@ radv_entrypoints = custom_target(
radv_extensions_c = custom_target(
'radv_extensions.c',
input : ['radv_extensions.py', vk_api_xml],
input : ['radv_extensions.py', vk_api_xml, vk_android_native_buffer_xml],
output : ['radv_extensions.c'],
command : [prog_python2, '@INPUT0@', '--xml', '@INPUT1@',
'--out', '@OUTPUT@'],
command : [
prog_python2, '@INPUT0@', '--xml', '@INPUT1@', '--xml', '@INPUT2@', '--out', '@OUTPUT@',
],
)
vk_format_table_c = custom_target(

@@ -0,0 +1,366 @@
/*
* Copyright © 2017, Google Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include <hardware/gralloc.h>
#include <hardware/hardware.h>
#include <hardware/hwvulkan.h>
#include <vulkan/vk_android_native_buffer.h>
#include <vulkan/vk_icd.h>
#include <libsync.h>
#include "radv_private.h"
static int radv_hal_open(const struct hw_module_t* mod, const char* id, struct hw_device_t** dev);
static int radv_hal_close(struct hw_device_t *dev);
static void UNUSED
static_asserts(void)
{
STATIC_ASSERT(HWVULKAN_DISPATCH_MAGIC == ICD_LOADER_MAGIC);
}
PUBLIC struct hwvulkan_module_t HAL_MODULE_INFO_SYM = {
.common = {
.tag = HARDWARE_MODULE_TAG,
.module_api_version = HWVULKAN_MODULE_API_VERSION_0_1,
.hal_api_version = HARDWARE_MAKE_API_VERSION(1, 0),
.id = HWVULKAN_HARDWARE_MODULE_ID,
.name = "AMD Vulkan HAL",
.author = "Google",
.methods = &(hw_module_methods_t) {
.open = radv_hal_open,
},
},
};
/* If any bits in test_mask are set, then unset them and return true. */
static inline bool
unmask32(uint32_t *inout_mask, uint32_t test_mask)
{
uint32_t orig_mask = *inout_mask;
*inout_mask &= ~test_mask;
return *inout_mask != orig_mask;
}
static int
radv_hal_open(const struct hw_module_t* mod, const char* id,
struct hw_device_t** dev)
{
assert(mod == &HAL_MODULE_INFO_SYM.common);
assert(strcmp(id, HWVULKAN_DEVICE_0) == 0);
hwvulkan_device_t *hal_dev = malloc(sizeof(*hal_dev));
if (!hal_dev)
return -1;
*hal_dev = (hwvulkan_device_t) {
.common = {
.tag = HARDWARE_DEVICE_TAG,
.version = HWVULKAN_DEVICE_API_VERSION_0_1,
.module = &HAL_MODULE_INFO_SYM.common,
.close = radv_hal_close,
},
.EnumerateInstanceExtensionProperties = radv_EnumerateInstanceExtensionProperties,
.CreateInstance = radv_CreateInstance,
.GetInstanceProcAddr = radv_GetInstanceProcAddr,
};
*dev = &hal_dev->common;
return 0;
}
static int
radv_hal_close(struct hw_device_t *dev)
{
/* hwvulkan.h claims that hw_device_t::close() is never called. */
return -1;
}
VkResult
radv_image_from_gralloc(VkDevice device_h,
const VkImageCreateInfo *base_info,
const VkNativeBufferANDROID *gralloc_info,
const VkAllocationCallbacks *alloc,
VkImage *out_image_h)
{
RADV_FROM_HANDLE(radv_device, device, device_h);
VkImage image_h = VK_NULL_HANDLE;
struct radv_image *image = NULL;
struct radv_bo *bo = NULL;
VkResult result;
result = radv_image_create(device_h,
&(struct radv_image_create_info) {
.vk_info = base_info,
.scanout = true,
.no_metadata_planes = true},
alloc,
&image_h);
if (result != VK_SUCCESS)
return result;
if (gralloc_info->handle->numFds != 1) {
/* Destroy the image created above instead of leaking it. */
result = vk_errorf(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR,
"VkNativeBufferANDROID::handle::numFds is %d, "
"expected 1", gralloc_info->handle->numFds);
goto fail_create_image;
}
/* Do not close the gralloc handle's dma_buf. The lifetime of the dma_buf
* must exceed that of the gralloc handle, and we do not own the gralloc
* handle.
*/
int dma_buf = gralloc_info->handle->data[0];
image = radv_image_from_handle(image_h);
VkDeviceMemory memory_h;
const VkMemoryDedicatedAllocateInfoKHR ded_alloc = {
.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO_KHR,
.pNext = NULL,
.buffer = VK_NULL_HANDLE,
.image = image_h
};
const VkImportMemoryFdInfoKHR import_info = {
.sType = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
.pNext = &ded_alloc,
.handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
.fd = dup(dma_buf),
};
/* Find the first VRAM memory type, or GART for PRIME images. */
int memory_type_index = -1;
for (int i = 0; i < device->physical_device->memory_properties.memoryTypeCount; ++i) {
bool is_local = !!(device->physical_device->memory_properties.memoryTypes[i].propertyFlags & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
if (is_local) {
memory_type_index = i;
break;
}
}
/* fallback */
if (memory_type_index == -1)
memory_type_index = 0;
result = radv_AllocateMemory(device_h,
&(VkMemoryAllocateInfo) {
.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
.pNext = &import_info,
.allocationSize = image->size,
.memoryTypeIndex = memory_type_index,
},
alloc,
&memory_h);
if (result != VK_SUCCESS)
goto fail_create_image;
radv_BindImageMemory(device_h, image_h, memory_h, 0);
image->owned_memory = memory_h;
/* Don't clobber the out-parameter until success is certain. */
*out_image_h = image_h;
return VK_SUCCESS;
fail_create_image:
fail_size:
radv_DestroyImage(device_h, image_h, alloc);
return result;
}
VkResult radv_GetSwapchainGrallocUsageANDROID(
VkDevice device_h,
VkFormat format,
VkImageUsageFlags imageUsage,
int* grallocUsage)
{
RADV_FROM_HANDLE(radv_device, device, device_h);
struct radv_physical_device *phys_dev = device->physical_device;
VkPhysicalDevice phys_dev_h = radv_physical_device_to_handle(phys_dev);
VkResult result;
*grallocUsage = 0;
/* WARNING: Android Nougat's libvulkan.so hardcodes the VkImageUsageFlags
* returned to applications via VkSurfaceCapabilitiesKHR::supportedUsageFlags.
* The relevant code in libvulkan/swapchain.cpp contains this fun comment:
*
* TODO(jessehall): I think these are right, but haven't thought hard
* about it. Do we need to query the driver for support of any of
* these?
*
* Any disagreement between this function and the hardcoded
* VkSurfaceCapabilitiesKHR::supportedUsageFlags causes tests
* dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
*/
const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
.format = format,
.type = VK_IMAGE_TYPE_2D,
.tiling = VK_IMAGE_TILING_OPTIMAL,
.usage = imageUsage,
};
VkImageFormatProperties2KHR image_format_props = {
.sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
};
/* Check that requested format and usage are supported. */
result = radv_GetPhysicalDeviceImageFormatProperties2KHR(phys_dev_h,
&image_format_info, &image_format_props);
if (result != VK_SUCCESS) {
return vk_errorf(result,
"radv_GetPhysicalDeviceImageFormatProperties2KHR failed "
"inside %s", __func__);
}
if (unmask32(&imageUsage, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
*grallocUsage |= GRALLOC_USAGE_HW_RENDER;
if (unmask32(&imageUsage, VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
VK_IMAGE_USAGE_SAMPLED_BIT |
VK_IMAGE_USAGE_STORAGE_BIT |
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT))
*grallocUsage |= GRALLOC_USAGE_HW_TEXTURE;
/* All VkImageUsageFlags not explicitly checked here are unsupported for
* gralloc swapchains.
*/
if (imageUsage != 0) {
return vk_errorf(VK_ERROR_FORMAT_NOT_SUPPORTED,
"unsupported VkImageUsageFlags(0x%x) for gralloc "
"swapchain", imageUsage);
}
/*
* FINISHME: Advertise all display-supported formats. Mostly
* DRM_FORMAT_ARGB2101010 and DRM_FORMAT_ABGR2101010, but need to check
* what we need for 30-bit colors.
*/
if (format == VK_FORMAT_B8G8R8A8_UNORM ||
format == VK_FORMAT_B5G6R5_UNORM_PACK16) {
*grallocUsage |= GRALLOC_USAGE_HW_FB |
GRALLOC_USAGE_HW_COMPOSER |
GRALLOC_USAGE_EXTERNAL_DISP;
}
if (*grallocUsage == 0)
return VK_ERROR_FORMAT_NOT_SUPPORTED;
return VK_SUCCESS;
}
VkResult
radv_AcquireImageANDROID(
VkDevice device,
VkImage image_h,
int nativeFenceFd,
VkSemaphore semaphore,
VkFence fence)
{
VkResult semaphore_result = VK_SUCCESS, fence_result = VK_SUCCESS;
if (semaphore != VK_NULL_HANDLE) {
int semaphore_fd = nativeFenceFd >= 0 ? dup(nativeFenceFd) : nativeFenceFd;
semaphore_result = radv_ImportSemaphoreFdKHR(device,
&(VkImportSemaphoreFdInfoKHR) {
.sType = VK_STRUCTURE_TYPE_IMPORT_SEMAPHORE_FD_INFO_KHR,
.flags = VK_SEMAPHORE_IMPORT_TEMPORARY_BIT_KHR,
.fd = semaphore_fd,
.semaphore = semaphore,
});
}
if (fence != VK_NULL_HANDLE) {
int fence_fd = nativeFenceFd >= 0 ? dup(nativeFenceFd) : nativeFenceFd;
fence_result = radv_ImportFenceFdKHR(device,
&(VkImportFenceFdInfoKHR) {
.sType = VK_STRUCTURE_TYPE_IMPORT_FENCE_FD_INFO_KHR,
.flags = VK_FENCE_IMPORT_TEMPORARY_BIT_KHR,
.fd = fence_fd,
.fence = fence,
});
}
close(nativeFenceFd);
if (semaphore_result != VK_SUCCESS)
return semaphore_result;
return fence_result;
}
VkResult
radv_QueueSignalReleaseImageANDROID(
VkQueue _queue,
uint32_t waitSemaphoreCount,
const VkSemaphore* pWaitSemaphores,
VkImage image,
int* pNativeFenceFd)
{
RADV_FROM_HANDLE(radv_queue, queue, _queue);
VkResult result = VK_SUCCESS;
if (waitSemaphoreCount == 0) {
if (pNativeFenceFd)
*pNativeFenceFd = -1;
return VK_SUCCESS;
}
int fd = -1;
for (uint32_t i = 0; i < waitSemaphoreCount; ++i) {
int tmp_fd;
result = radv_GetSemaphoreFdKHR(radv_device_to_handle(queue->device),
&(VkSemaphoreGetFdInfoKHR) {
.sType = VK_STRUCTURE_TYPE_SEMAPHORE_GET_FD_INFO_KHR,
.handleType = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR,
.semaphore = pWaitSemaphores[i],
}, &tmp_fd);
if (result != VK_SUCCESS) {
if (fd >= 0)
close (fd);
return result;
}
if (fd < 0)
fd = tmp_fd;
else if (tmp_fd >= 0) {
sync_accumulate("radv", &fd, tmp_fd);
close(tmp_fd);
}
}
if (pNativeFenceFd) {
*pNativeFenceFd = fd;
} else if (fd >= 0) {
close(fd);
/* We still need to do the exports, to reset the semaphores, but
* otherwise we don't wait on them. */
}
return VK_SUCCESS;
}

@@ -380,7 +380,7 @@ radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer)
flags = RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
si_cs_emit_cache_flush(cmd_buffer->cs,
si_cs_emit_cache_flush(cmd_buffer->cs, false,
cmd_buffer->device->physical_device->rad_info.chip_class,
NULL, 0,
radv_cmd_buffer_uses_mec(cmd_buffer),
@@ -541,8 +541,7 @@ radv_update_multisample_state(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_028804_DB_EQAA, ms->db_eqaa);
radeon_set_context_reg(cmd_buffer->cs, R_028A4C_PA_SC_MODE_CNTL_1, ms->pa_sc_mode_cntl_1);
if (old_pipeline && num_samples == old_pipeline->graphics.ms.num_samples &&
old_pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.needs_sample_positions == pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.needs_sample_positions)
if (old_pipeline && num_samples == old_pipeline->graphics.ms.num_samples)
return;
radeon_set_context_reg_seq(cmd_buffer->cs, R_028BDC_PA_SC_LINE_CNTL, 2);
@@ -919,6 +918,7 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
{
struct radv_shader_variant *ps;
uint64_t va;
unsigned spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
struct radv_blend_state *blend = &pipeline->graphics.blend;
assert (pipeline->shaders[MESA_SHADER_FRAGMENT]);
@@ -940,10 +940,13 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR,
ps->config.spi_ps_input_addr);
if (ps->info.info.ps.force_persample)
spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL,
S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, pipeline->graphics.spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_028710_SPI_SHADER_Z_FORMAT,
pipeline->graphics.shader_z_format);
@@ -1915,11 +1918,11 @@ radv_dst_access_flush(struct radv_cmd_buffer *cmd_buffer,
switch ((VkAccessFlagBits)(1 << b)) {
case VK_ACCESS_INDIRECT_COMMAND_READ_BIT:
case VK_ACCESS_INDEX_READ_BIT:
case VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT:
break;
case VK_ACCESS_UNIFORM_READ_BIT:
flush_bits |= RADV_CMD_FLAG_INV_VMEM_L1 | RADV_CMD_FLAG_INV_SMEM_L1;
break;
case VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT:
case VK_ACCESS_SHADER_READ_BIT:
case VK_ACCESS_TRANSFER_READ_BIT:
case VK_ACCESS_INPUT_ATTACHMENT_READ_BIT:
@@ -3579,8 +3582,7 @@ void radv_CmdEndRenderPass(
/*
* For HTILE we have the following interesting clear words:
* 0xfffff30f: Uncompressed, full depth range, for depth+stencil HTILE
* 0xfffc000f: Uncompressed, full depth range, for depth only HTILE.
* 0x0000030f: Uncompressed.
* 0xfffffff0: Clear depth to 1.0
* 0x00000000: Clear depth to 0.0
*/
@@ -3629,8 +3631,7 @@ static void radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffe
radv_initialize_htile(cmd_buffer, image, range, 0);
} else if (!radv_layout_is_htile_compressed(image, src_layout, src_queue_mask) &&
radv_layout_is_htile_compressed(image, dst_layout, dst_queue_mask)) {
uint32_t clear_value = vk_format_is_stencil(image->vk_format) ? 0xfffff30f : 0xfffc000f;
radv_initialize_htile(cmd_buffer, image, range, clear_value);
radv_initialize_htile(cmd_buffer, image, range, 0xffffffff);
} else if (radv_layout_is_htile_compressed(image, src_layout, src_queue_mask) &&
!radv_layout_is_htile_compressed(image, dst_layout, dst_queue_mask)) {
VkImageSubresourceRange local_range = *range;
@@ -3832,7 +3833,7 @@ static void write_event(struct radv_cmd_buffer *cmd_buffer,
si_cs_emit_write_event_eop(cs,
cmd_buffer->state.predicating,
cmd_buffer->device->physical_device->rad_info.chip_class,
radv_cmd_buffer_uses_mec(cmd_buffer),
false,
V_028A90_BOTTOM_OF_PIPE_TS, 0,
1, va, 2, value);

@@ -76,43 +76,32 @@ radv_get_device_uuid(struct radeon_info *info, void *uuid)
ac_compute_device_uuid(info, uuid, VK_UUID_SIZE);
}
static void
radv_get_device_name(enum radeon_family family, char *name, size_t name_len)
static const char *
get_chip_name(enum radeon_family family)
{
const char *chip_string;
char llvm_string[32] = {};
switch (family) {
case CHIP_TAHITI: chip_string = "AMD RADV TAHITI"; break;
case CHIP_PITCAIRN: chip_string = "AMD RADV PITCAIRN"; break;
case CHIP_VERDE: chip_string = "AMD RADV CAPE VERDE"; break;
case CHIP_OLAND: chip_string = "AMD RADV OLAND"; break;
case CHIP_HAINAN: chip_string = "AMD RADV HAINAN"; break;
case CHIP_BONAIRE: chip_string = "AMD RADV BONAIRE"; break;
case CHIP_KAVERI: chip_string = "AMD RADV KAVERI"; break;
case CHIP_KABINI: chip_string = "AMD RADV KABINI"; break;
case CHIP_HAWAII: chip_string = "AMD RADV HAWAII"; break;
case CHIP_MULLINS: chip_string = "AMD RADV MULLINS"; break;
case CHIP_TONGA: chip_string = "AMD RADV TONGA"; break;
case CHIP_ICELAND: chip_string = "AMD RADV ICELAND"; break;
case CHIP_CARRIZO: chip_string = "AMD RADV CARRIZO"; break;
case CHIP_FIJI: chip_string = "AMD RADV FIJI"; break;
case CHIP_POLARIS10: chip_string = "AMD RADV POLARIS10"; break;
case CHIP_POLARIS11: chip_string = "AMD RADV POLARIS11"; break;
case CHIP_POLARIS12: chip_string = "AMD RADV POLARIS12"; break;
case CHIP_STONEY: chip_string = "AMD RADV STONEY"; break;
case CHIP_VEGA10: chip_string = "AMD RADV VEGA"; break;
case CHIP_RAVEN: chip_string = "AMD RADV RAVEN"; break;
default: chip_string = "AMD RADV unknown"; break;
case CHIP_TAHITI: return "AMD RADV TAHITI";
case CHIP_PITCAIRN: return "AMD RADV PITCAIRN";
case CHIP_VERDE: return "AMD RADV CAPE VERDE";
case CHIP_OLAND: return "AMD RADV OLAND";
case CHIP_HAINAN: return "AMD RADV HAINAN";
case CHIP_BONAIRE: return "AMD RADV BONAIRE";
case CHIP_KAVERI: return "AMD RADV KAVERI";
case CHIP_KABINI: return "AMD RADV KABINI";
case CHIP_HAWAII: return "AMD RADV HAWAII";
case CHIP_MULLINS: return "AMD RADV MULLINS";
case CHIP_TONGA: return "AMD RADV TONGA";
case CHIP_ICELAND: return "AMD RADV ICELAND";
case CHIP_CARRIZO: return "AMD RADV CARRIZO";
case CHIP_FIJI: return "AMD RADV FIJI";
case CHIP_POLARIS10: return "AMD RADV POLARIS10";
case CHIP_POLARIS11: return "AMD RADV POLARIS11";
case CHIP_POLARIS12: return "AMD RADV POLARIS12";
case CHIP_STONEY: return "AMD RADV STONEY";
case CHIP_VEGA10: return "AMD RADV VEGA";
case CHIP_RAVEN: return "AMD RADV RAVEN";
default: return "AMD RADV unknown";
}
if (HAVE_LLVM > 0) {
snprintf(llvm_string, sizeof(llvm_string),
" (LLVM %i.%i.%i)", (HAVE_LLVM >> 8) & 0xff,
HAVE_LLVM & 0xff, MESA_LLVM_VERSION_PATCH);
}
snprintf(name, name_len, "%s%s", chip_string, llvm_string);
}
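The removed name-formatting lines above unpack `HAVE_LLVM` as a major version in bits 8..15 and a minor version in the low byte, which matches the `HAVE_LLVM < 0x0400` check used elsewhere in this file. A minimal standalone sketch of that decoding follows; `llvm_major`, `llvm_minor` and `format_llvm_suffix` are hypothetical names, and the patch level is passed in explicitly rather than read from `MESA_LLVM_VERSION_PATCH`.

```c
#include <assert.h>
#include <stdio.h>

/* Unpack a HAVE_LLVM-style value: major in bits 8..15, minor in bits 0..7. */
unsigned llvm_major(unsigned have_llvm) { return (have_llvm >> 8) & 0xff; }
unsigned llvm_minor(unsigned have_llvm) { return have_llvm & 0xff; }

/* Writes " (LLVM major.minor.patch)" into buf, or an empty string when
 * have_llvm is 0. Returns the value reported by snprintf (0 when empty). */
int
format_llvm_suffix(char *buf, size_t len, unsigned have_llvm, unsigned patch)
{
	assert(buf != NULL && len > 0);
	if (have_llvm == 0) {
		buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, len, " (LLVM %u.%u.%u)",
			llvm_major(have_llvm), llvm_minor(have_llvm), patch);
}
```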
static void
@@ -232,7 +221,7 @@ radv_physical_device_init(struct radv_physical_device *device,
goto fail;
}
radv_get_device_name(device->rad_info.family, device->name, sizeof(device->name));
device->name = get_chip_name(device->rad_info.family);
if (radv_device_get_cache_uuid(device->rad_info.family, device->cache_uuid)) {
radv_finish_wsi(device);
@@ -622,9 +611,9 @@ void radv_GetPhysicalDeviceProperties(
.maxPerStageResources = max_descriptor_set_size,
.maxDescriptorSetSamplers = max_descriptor_set_size,
.maxDescriptorSetUniformBuffers = max_descriptor_set_size,
.maxDescriptorSetUniformBuffersDynamic = MAX_DYNAMIC_UNIFORM_BUFFERS,
.maxDescriptorSetUniformBuffersDynamic = MAX_DYNAMIC_BUFFERS / 2,
.maxDescriptorSetStorageBuffers = max_descriptor_set_size,
.maxDescriptorSetStorageBuffersDynamic = MAX_DYNAMIC_STORAGE_BUFFERS,
.maxDescriptorSetStorageBuffersDynamic = MAX_DYNAMIC_BUFFERS / 2,
.maxDescriptorSetSampledImages = max_descriptor_set_size,
.maxDescriptorSetStorageImages = max_descriptor_set_size,
.maxDescriptorSetInputAttachments = max_descriptor_set_size,
@@ -1049,6 +1038,10 @@ VkResult radv_CreateDevice(
}
}
#ifdef ANDROID
device->always_use_syncobj = device->physical_device->rad_info.has_syncobj_wait_for_submit;
#endif
#if HAVE_LLVM < 0x0400
device->llvm_supports_spill = false;
#else
@@ -1119,15 +1112,13 @@ VkResult radv_CreateDevice(
result = radv_CreatePipelineCache(radv_device_to_handle(device),
&ci, NULL, &pc);
if (result != VK_SUCCESS)
goto fail_meta;
goto fail;
device->mem_cache = radv_pipeline_cache_from_handle(pc);
*pDevice = radv_device_to_handle(device);
return VK_SUCCESS;
fail_meta:
radv_device_finish_meta(device);
fail:
if (device->trace_bo)
device->ws->buffer_destroy(device->trace_bo);
@@ -1690,6 +1681,7 @@ radv_get_preamble_cs(struct radv_queue *queue,
if (i == 0) {
si_cs_emit_cache_flush(cs,
false,
queue->device->physical_device->rad_info.chip_class,
NULL, 0,
queue->queue_family_index == RING_COMPUTE &&
@@ -1701,6 +1693,7 @@ radv_get_preamble_cs(struct radv_queue *queue,
RADV_CMD_FLAG_INV_GLOBAL_L2);
} else if (i == 1) {
si_cs_emit_cache_flush(cs,
false,
queue->device->physical_device->rad_info.chip_class,
NULL, 0,
queue->queue_family_index == RING_COMPUTE &&
@@ -1805,12 +1798,14 @@ fail:
static VkResult radv_alloc_sem_counts(struct radv_winsys_sem_counts *counts,
int num_sems,
const VkSemaphore *sems,
VkFence _fence,
bool reset_temp)
{
int syncobj_idx = 0, sem_idx = 0;
if (num_sems == 0)
if (num_sems == 0 && _fence == VK_NULL_HANDLE)
return VK_SUCCESS;
for (uint32_t i = 0; i < num_sems; i++) {
RADV_FROM_HANDLE(radv_semaphore, sem, sems[i]);
@@ -1820,6 +1815,12 @@ static VkResult radv_alloc_sem_counts(struct radv_winsys_sem_counts *counts,
counts->sem_count++;
}
if (_fence != VK_NULL_HANDLE) {
RADV_FROM_HANDLE(radv_fence, fence, _fence);
if (fence->temp_syncobj || fence->syncobj)
counts->syncobj_count++;
}
if (counts->syncobj_count) {
counts->syncobj = (uint32_t *)malloc(sizeof(uint32_t) * counts->syncobj_count);
if (!counts->syncobj)
@@ -1848,6 +1849,14 @@ static VkResult radv_alloc_sem_counts(struct radv_winsys_sem_counts *counts,
}
}
if (_fence != VK_NULL_HANDLE) {
RADV_FROM_HANDLE(radv_fence, fence, _fence);
if (fence->temp_syncobj)
counts->syncobj[syncobj_idx++] = fence->temp_syncobj;
else if (fence->syncobj)
counts->syncobj[syncobj_idx++] = fence->syncobj;
}
return VK_SUCCESS;
}
@@ -1878,15 +1887,16 @@ VkResult radv_alloc_sem_info(struct radv_winsys_sem_info *sem_info,
int num_wait_sems,
const VkSemaphore *wait_sems,
int num_signal_sems,
const VkSemaphore *signal_sems)
const VkSemaphore *signal_sems,
VkFence fence)
{
VkResult ret;
memset(sem_info, 0, sizeof(*sem_info));
ret = radv_alloc_sem_counts(&sem_info->wait, num_wait_sems, wait_sems, true);
ret = radv_alloc_sem_counts(&sem_info->wait, num_wait_sems, wait_sems, VK_NULL_HANDLE, true);
if (ret)
return ret;
ret = radv_alloc_sem_counts(&sem_info->signal, num_signal_sems, signal_sems, false);
ret = radv_alloc_sem_counts(&sem_info->signal, num_signal_sems, signal_sems, fence, false);
if (ret)
radv_free_sem_info(sem_info);
@@ -1896,6 +1906,32 @@ VkResult radv_alloc_sem_info(struct radv_winsys_sem_info *sem_info,
return ret;
}
/* Signals fence as soon as all the work currently put on queue is done. */
static VkResult radv_signal_fence(struct radv_queue *queue,
struct radv_fence *fence)
{
int ret;
VkResult result;
struct radv_winsys_sem_info sem_info;
result = radv_alloc_sem_info(&sem_info, 0, NULL, 0, NULL,
radv_fence_to_handle(fence));
if (result != VK_SUCCESS)
return result;
ret = queue->device->ws->cs_submit(queue->hw_ctx, queue->queue_idx,
&queue->device->empty_cs[queue->queue_family_index],
1, NULL, NULL, &sem_info,
false, fence->fence);
radv_free_sem_info(&sem_info);
/* TODO: find a better error */
if (ret)
return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
return VK_SUCCESS;
}
VkResult radv_QueueSubmit(
VkQueue _queue,
uint32_t submitCount,
@@ -1952,7 +1988,8 @@ VkResult radv_QueueSubmit(
pSubmits[i].waitSemaphoreCount,
pSubmits[i].pWaitSemaphores,
pSubmits[i].signalSemaphoreCount,
pSubmits[i].pSignalSemaphores);
pSubmits[i].pSignalSemaphores,
_fence);
if (result != VK_SUCCESS)
return result;
@@ -2021,11 +2058,7 @@ VkResult radv_QueueSubmit(
if (fence) {
if (!fence_emitted) {
struct radv_winsys_sem_info sem_info = {0};
ret = queue->device->ws->cs_submit(ctx, queue->queue_idx,
&queue->device->empty_cs[queue->queue_family_index],
1, NULL, NULL, &sem_info,
false, base_fence);
radv_signal_fence(queue, fence);
}
fence->submitted = true;
}
@@ -2517,7 +2550,8 @@ radv_sparse_image_opaque_bind_memory(struct radv_device *device,
pBindInfo[i].waitSemaphoreCount,
pBindInfo[i].pWaitSemaphores,
pBindInfo[i].signalSemaphoreCount,
pBindInfo[i].pSignalSemaphores);
pBindInfo[i].pSignalSemaphores,
_fence);
if (result != VK_SUCCESS)
return result;
@@ -2536,8 +2570,11 @@ radv_sparse_image_opaque_bind_memory(struct radv_device *device,
}
if (fence && !fence_emitted) {
fence->signalled = true;
if (fence) {
if (!fence_emitted) {
radv_signal_fence(queue, fence);
}
fence->submitted = true;
}
return VK_SUCCESS;
@@ -2550,6 +2587,11 @@ VkResult radv_CreateFence(
VkFence* pFence)
{
RADV_FROM_HANDLE(radv_device, device, _device);
const VkExportFenceCreateInfoKHR *export =
vk_find_struct_const(pCreateInfo->pNext, EXPORT_FENCE_CREATE_INFO_KHR);
VkExternalFenceHandleTypeFlagsKHR handleTypes =
export ? export->handleTypes : 0;
struct radv_fence *fence = vk_alloc2(&device->alloc, pAllocator,
sizeof(*fence), 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
@@ -2560,10 +2602,24 @@ VkResult radv_CreateFence(
memset(fence, 0, sizeof(*fence));
fence->submitted = false;
fence->signalled = !!(pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT);
fence->fence = device->ws->create_fence();
if (!fence->fence) {
vk_free2(&device->alloc, pAllocator, fence);
return VK_ERROR_OUT_OF_HOST_MEMORY;
fence->temp_syncobj = 0;
if (device->always_use_syncobj || handleTypes) {
int ret = device->ws->create_syncobj(device->ws, &fence->syncobj);
if (ret) {
vk_free2(&device->alloc, pAllocator, fence);
return VK_ERROR_OUT_OF_HOST_MEMORY;
}
if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
device->ws->signal_syncobj(device->ws, fence->syncobj);
}
fence->fence = NULL;
} else {
fence->fence = device->ws->create_fence();
if (!fence->fence) {
vk_free2(&device->alloc, pAllocator, fence);
return VK_ERROR_OUT_OF_HOST_MEMORY;
}
fence->syncobj = 0;
}
*pFence = radv_fence_to_handle(fence);
@@ -2581,21 +2637,23 @@ void radv_DestroyFence(
if (!fence)
return;
device->ws->destroy_fence(fence->fence);
if (fence->temp_syncobj)
device->ws->destroy_syncobj(device->ws, fence->temp_syncobj);
if (fence->syncobj)
device->ws->destroy_syncobj(device->ws, fence->syncobj);
if (fence->fence)
device->ws->destroy_fence(fence->fence);
vk_free2(&device->alloc, pAllocator, fence);
}
static uint64_t radv_get_current_time()
{
struct timespec tv;
clock_gettime(CLOCK_MONOTONIC, &tv);
return tv.tv_nsec + tv.tv_sec*1000000000ull;
}
static uint64_t radv_get_absolute_timeout(uint64_t timeout)
{
uint64_t current_time = radv_get_current_time();
uint64_t current_time;
struct timespec tv;
clock_gettime(CLOCK_MONOTONIC, &tv);
current_time = tv.tv_nsec + tv.tv_sec*1000000000ull;
timeout = MIN2(UINT64_MAX - current_time, timeout);
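The `MIN2` clamp above keeps `current_time + timeout` from wrapping when a caller passes a very large relative timeout such as `UINT64_MAX`. A minimal sketch of the arithmetic, with the current time supplied by the caller; the final addition is an assumption, since the rest of the function is not visible in this hunk, and `absolute_timeout_from` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a relative timeout to an absolute deadline without overflow.
 * Both values are in nanoseconds. */
uint64_t
absolute_timeout_from(uint64_t now_ns, uint64_t timeout_ns)
{
	uint64_t room = UINT64_MAX - now_ns; /* largest timeout that still fits */
	if (timeout_ns > room)
		timeout_ns = room;
	/* Assumed final step: the hunk ends before the return statement. */
	return now_ns + timeout_ns;
}

/* Small self-check showing the saturating behaviour. */
void
absolute_timeout_selfcheck(void)
{
	assert(absolute_timeout_from(100, 50) == 150);
	assert(absolute_timeout_from(100, UINT64_MAX) == UINT64_MAX);
}
```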
@@ -2613,33 +2671,30 @@ VkResult radv_WaitForFences(
timeout = radv_get_absolute_timeout(timeout);
if (!waitAll && fenceCount > 1) {
while(radv_get_current_time() <= timeout) {
for (uint32_t i = 0; i < fenceCount; ++i) {
if (radv_GetFenceStatus(_device, pFences[i]) == VK_SUCCESS)
return VK_SUCCESS;
}
}
return VK_TIMEOUT;
fprintf(stderr, "radv: WaitForFences without waitAll not implemented yet\n");
}
for (uint32_t i = 0; i < fenceCount; ++i) {
RADV_FROM_HANDLE(radv_fence, fence, pFences[i]);
bool expired = false;
if (fence->temp_syncobj) {
if (!device->ws->wait_syncobj(device->ws, fence->temp_syncobj, timeout))
return VK_TIMEOUT;
continue;
}
if (fence->syncobj) {
if (!device->ws->wait_syncobj(device->ws, fence->syncobj, timeout))
return VK_TIMEOUT;
continue;
}
if (fence->signalled)
continue;
if (!fence->submitted) {
while(radv_get_current_time() <= timeout && !fence->submitted)
/* Do nothing */;
if (!fence->submitted)
return VK_TIMEOUT;
/* Recheck as it may have been set by submitting operations. */
if (fence->signalled)
continue;
}
if (!fence->submitted)
return VK_TIMEOUT;
expired = device->ws->fence_wait(device->ws, fence->fence, true, timeout);
if (!expired)
@@ -2651,13 +2706,26 @@ VkResult radv_WaitForFences(
return VK_SUCCESS;
}
VkResult radv_ResetFences(VkDevice device,
VkResult radv_ResetFences(VkDevice _device,
uint32_t fenceCount,
const VkFence *pFences)
{
RADV_FROM_HANDLE(radv_device, device, _device);
for (unsigned i = 0; i < fenceCount; ++i) {
RADV_FROM_HANDLE(radv_fence, fence, pFences[i]);
fence->submitted = fence->signalled = false;
/* Per spec, we first restore the permanent payload, and then reset, so
* having a temp syncobj should not skip resetting the permanent syncobj. */
if (fence->temp_syncobj) {
device->ws->destroy_syncobj(device->ws, fence->temp_syncobj);
fence->temp_syncobj = 0;
}
if (fence->syncobj) {
device->ws->reset_syncobj(device->ws, fence->syncobj);
}
}
return VK_SUCCESS;
@@ -2668,11 +2736,20 @@ VkResult radv_GetFenceStatus(VkDevice _device, VkFence _fence)
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_fence, fence, _fence);
if (fence->temp_syncobj) {
bool success = device->ws->wait_syncobj(device->ws, fence->temp_syncobj, 0);
return success ? VK_SUCCESS : VK_NOT_READY;
}
if (fence->syncobj) {
bool success = device->ws->wait_syncobj(device->ws, fence->syncobj, 0);
return success ? VK_SUCCESS : VK_NOT_READY;
}
if (fence->signalled)
return VK_SUCCESS;
if (!fence->submitted)
return VK_NOT_READY;
if (!device->ws->fence_wait(device->ws, fence->fence, false, 0))
return VK_NOT_READY;
@@ -2702,9 +2779,8 @@ VkResult radv_CreateSemaphore(
sem->temp_syncobj = 0;
/* create a syncobject if we are going to export this semaphore */
if (handleTypes) {
if (device->always_use_syncobj || handleTypes) {
assert (device->physical_device->rad_info.has_syncobj);
assert (handleTypes == VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR);
int ret = device->ws->create_syncobj(device->ws, &sem->syncobj);
if (ret) {
vk_free2(&device->alloc, pAllocator, sem);
@@ -3553,18 +3629,59 @@ VkResult radv_GetMemoryFdPropertiesKHR(VkDevice _device,
return VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR;
}
static VkResult radv_import_opaque_fd(struct radv_device *device,
int fd,
uint32_t *syncobj)
{
uint32_t syncobj_handle = 0;
int ret = device->ws->import_syncobj(device->ws, fd, &syncobj_handle);
if (ret != 0)
return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
if (*syncobj)
device->ws->destroy_syncobj(device->ws, *syncobj);
*syncobj = syncobj_handle;
close(fd);
return VK_SUCCESS;
}
static VkResult radv_import_sync_fd(struct radv_device *device,
int fd,
uint32_t *syncobj)
{
/* If we create a syncobj we do it locally so that if we have an error, we don't
* leave a syncobj in an undetermined state in the fence. */
uint32_t syncobj_handle = *syncobj;
if (!syncobj_handle) {
int ret = device->ws->create_syncobj(device->ws, &syncobj_handle);
if (ret) {
return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
}
}
if (fd == -1) {
device->ws->signal_syncobj(device->ws, syncobj_handle);
} else {
int ret = device->ws->import_syncobj_from_sync_file(device->ws, syncobj_handle, fd);
if (ret != 0)
return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
}
*syncobj = syncobj_handle;
if (fd != -1)
close(fd);
return VK_SUCCESS;
}
VkResult radv_ImportSemaphoreFdKHR(VkDevice _device,
const VkImportSemaphoreFdInfoKHR *pImportSemaphoreFdInfo)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_semaphore, sem, pImportSemaphoreFdInfo->semaphore);
uint32_t syncobj_handle = 0;
uint32_t *syncobj_dst = NULL;
assert(pImportSemaphoreFdInfo->handleType == VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR);
int ret = device->ws->import_syncobj(device->ws, pImportSemaphoreFdInfo->fd, &syncobj_handle);
if (ret != 0)
return VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR;
if (pImportSemaphoreFdInfo->flags & VK_SEMAPHORE_IMPORT_TEMPORARY_BIT_KHR) {
syncobj_dst = &sem->temp_syncobj;
@@ -3572,12 +3689,14 @@ VkResult radv_ImportSemaphoreFdKHR(VkDevice _device,
syncobj_dst = &sem->syncobj;
}
if (*syncobj_dst)
device->ws->destroy_syncobj(device->ws, *syncobj_dst);
*syncobj_dst = syncobj_handle;
close(pImportSemaphoreFdInfo->fd);
return VK_SUCCESS;
switch(pImportSemaphoreFdInfo->handleType) {
case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
return radv_import_opaque_fd(device, pImportSemaphoreFdInfo->fd, syncobj_dst);
case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
return radv_import_sync_fd(device, pImportSemaphoreFdInfo->fd, syncobj_dst);
default:
unreachable("Unhandled semaphore handle type");
}
}
VkResult radv_GetSemaphoreFdKHR(VkDevice _device,
@@ -3589,12 +3708,30 @@ VkResult radv_GetSemaphoreFdKHR(VkDevice _device,
int ret;
uint32_t syncobj_handle;
assert(pGetFdInfo->handleType == VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR);
if (sem->temp_syncobj)
syncobj_handle = sem->temp_syncobj;
else
syncobj_handle = sem->syncobj;
ret = device->ws->export_syncobj(device->ws, syncobj_handle, pFd);
switch(pGetFdInfo->handleType) {
case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
ret = device->ws->export_syncobj(device->ws, syncobj_handle, pFd);
break;
case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
ret = device->ws->export_syncobj_to_sync_file(device->ws, syncobj_handle, pFd);
if (!ret) {
if (sem->temp_syncobj) {
device->ws->destroy_syncobj(device->ws, sem->temp_syncobj);
sem->temp_syncobj = 0;
} else {
device->ws->reset_syncobj(device->ws, syncobj_handle);
}
}
break;
default:
unreachable("Unhandled semaphore handle type");
}
if (ret)
return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
return VK_SUCCESS;
@@ -3605,7 +3742,17 @@ void radv_GetPhysicalDeviceExternalSemaphorePropertiesKHR(
const VkPhysicalDeviceExternalSemaphoreInfoKHR* pExternalSemaphoreInfo,
VkExternalSemaphorePropertiesKHR* pExternalSemaphoreProperties)
{
if (pExternalSemaphoreInfo->handleType == VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR) {
RADV_FROM_HANDLE(radv_physical_device, pdevice, physicalDevice);
/* Require has_syncobj_wait_for_submit for the syncobj signal ioctl introduced at virtually the same time */
if (pdevice->rad_info.has_syncobj_wait_for_submit &&
(pExternalSemaphoreInfo->handleType == VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR ||
pExternalSemaphoreInfo->handleType == VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR)) {
pExternalSemaphoreProperties->exportFromImportedHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR | VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
pExternalSemaphoreProperties->compatibleHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR | VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
pExternalSemaphoreProperties->externalSemaphoreFeatures = VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT_KHR |
VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT_KHR;
} else if (pExternalSemaphoreInfo->handleType == VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR) {
pExternalSemaphoreProperties->exportFromImportedHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
pExternalSemaphoreProperties->compatibleHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
pExternalSemaphoreProperties->externalSemaphoreFeatures = VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT_KHR |
@@ -3616,3 +3763,86 @@ void radv_GetPhysicalDeviceExternalSemaphorePropertiesKHR(
pExternalSemaphoreProperties->externalSemaphoreFeatures = 0;
}
}
VkResult radv_ImportFenceFdKHR(VkDevice _device,
const VkImportFenceFdInfoKHR *pImportFenceFdInfo)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_fence, fence, pImportFenceFdInfo->fence);
uint32_t *syncobj_dst = NULL;
if (pImportFenceFdInfo->flags & VK_FENCE_IMPORT_TEMPORARY_BIT_KHR) {
syncobj_dst = &fence->temp_syncobj;
} else {
syncobj_dst = &fence->syncobj;
}
switch(pImportFenceFdInfo->handleType) {
case VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
return radv_import_opaque_fd(device, pImportFenceFdInfo->fd, syncobj_dst);
case VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
return radv_import_sync_fd(device, pImportFenceFdInfo->fd, syncobj_dst);
default:
unreachable("Unhandled fence handle type");
}
}
VkResult radv_GetFenceFdKHR(VkDevice _device,
const VkFenceGetFdInfoKHR *pGetFdInfo,
int *pFd)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_fence, fence, pGetFdInfo->fence);
int ret;
uint32_t syncobj_handle;
if (fence->temp_syncobj)
syncobj_handle = fence->temp_syncobj;
else
syncobj_handle = fence->syncobj;
switch(pGetFdInfo->handleType) {
case VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
ret = device->ws->export_syncobj(device->ws, syncobj_handle, pFd);
break;
case VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
ret = device->ws->export_syncobj_to_sync_file(device->ws, syncobj_handle, pFd);
if (!ret) {
if (fence->temp_syncobj) {
device->ws->destroy_syncobj(device->ws, fence->temp_syncobj);
fence->temp_syncobj = 0;
} else {
device->ws->reset_syncobj(device->ws, syncobj_handle);
}
}
break;
default:
unreachable("Unhandled fence handle type");
}
if (ret)
return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
return VK_SUCCESS;
}
void radv_GetPhysicalDeviceExternalFencePropertiesKHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceExternalFenceInfoKHR* pExternalFenceInfo,
VkExternalFencePropertiesKHR* pExternalFenceProperties)
{
RADV_FROM_HANDLE(radv_physical_device, pdevice, physicalDevice);
if (pdevice->rad_info.has_syncobj_wait_for_submit &&
(pExternalFenceInfo->handleType == VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR ||
pExternalFenceInfo->handleType == VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR)) {
pExternalFenceProperties->exportFromImportedHandleTypes = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR | VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
pExternalFenceProperties->compatibleHandleTypes = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR | VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
pExternalFenceProperties->externalFenceFeatures = VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT_KHR |
VK_EXTERNAL_FENCE_FEATURE_IMPORTABLE_BIT_KHR;
} else {
pExternalFenceProperties->exportFromImportedHandleTypes = 0;
pExternalFenceProperties->compatibleHandleTypes = 0;
pExternalFenceProperties->externalFenceFeatures = 0;
}
}


@@ -237,7 +237,9 @@ def get_entrypoints(doc, entrypoints_to_defines, start_index):
if extension.attrib['name'] not in supported:
continue
assert extension.attrib['supported'] == 'vulkan'
if extension.attrib['supported'] != 'vulkan':
continue
for command in extension.findall('./require/command'):
enabled_commands.add(command.attrib['name'])


@@ -50,9 +50,13 @@ class Extension:
# those extension strings, then tests dEQP-VK.api.info.instance.extensions
# and dEQP-VK.api.info.device fail due to the duplicated strings.
EXTENSIONS = [
Extension('VK_ANDROID_native_buffer', 5, 'ANDROID && device->rad_info.has_syncobj_wait_for_submit'),
Extension('VK_KHR_bind_memory2', 1, True),
Extension('VK_KHR_dedicated_allocation', 1, True),
Extension('VK_KHR_descriptor_update_template', 1, True),
Extension('VK_KHR_external_fence', 1, 'device->rad_info.has_syncobj_wait_for_submit'),
Extension('VK_KHR_external_fence_capabilities', 1, True),
Extension('VK_KHR_external_fence_fd', 1, 'device->rad_info.has_syncobj_wait_for_submit'),
Extension('VK_KHR_external_memory', 1, True),
Extension('VK_KHR_external_memory_capabilities', 1, True),
Extension('VK_KHR_external_memory_fd', 1, True),
@@ -76,7 +80,7 @@ EXTENSIONS = [
Extension('VK_KHR_wayland_surface', 6, 'VK_USE_PLATFORM_WAYLAND_KHR'),
Extension('VK_KHR_xcb_surface', 6, 'VK_USE_PLATFORM_XCB_KHR'),
Extension('VK_KHR_xlib_surface', 6, 'VK_USE_PLATFORM_XLIB_KHR'),
Extension('VK_KHX_multiview', 1, False),
Extension('VK_KHX_multiview', 1, True),
Extension('VK_EXT_global_priority', 1, 'device->rad_info.has_ctx_priority'),
Extension('VK_AMD_draw_indirect_count', 1, True),
Extension('VK_AMD_rasterization_order', 1, 'device->rad_info.chip_class >= VI && device->rad_info.max_se >= 2'),


@@ -1063,9 +1063,6 @@ static VkResult radv_get_image_format_properties(struct radv_physical_device *ph
if (format_feature_flags == 0)
goto unsupported;
if (info->type != VK_IMAGE_TYPE_2D && vk_format_is_depth_or_stencil(info->format))
goto unsupported;
switch (info->type) {
default:
unreachable("bad vkimage type\n");


@@ -116,8 +116,7 @@ radv_init_surface(struct radv_device *device,
pCreateInfo->mipLevels <= 1 &&
device->physical_device->rad_info.chip_class >= VI &&
((pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
/* for some reason TC compat with 2/4/8 samples breaks some cts tests - disable for now */
(pCreateInfo->samples < 2 && pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)) ||
pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT) ||
(device->physical_device->rad_info.chip_class >= GFX9 &&
pCreateInfo->format == VK_FORMAT_D16_UNORM)))
surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;
@@ -128,7 +127,7 @@ radv_init_surface(struct radv_device *device,
surface->flags |= RADEON_SURF_OPTIMIZE_FOR_SPACE;
bool dcc_compatible_formats = radv_is_colorbuffer_format_supported(pCreateInfo->format, &blendable);
bool dcc_compatible_formats = !radv_is_colorbuffer_format_supported(pCreateInfo->format, &blendable);
if (pCreateInfo->flags & VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT) {
const struct VkImageFormatListCreateInfoKHR *format_list =
(const struct VkImageFormatListCreateInfoKHR *)
@@ -345,7 +344,7 @@ static unsigned radv_tex_dim(VkImageType image_type, VkImageViewType view_type,
}
}
static unsigned gfx9_border_color_swizzle(const enum vk_swizzle swizzle[4])
static unsigned gfx9_border_color_swizzle(const unsigned char swizzle[4])
{
unsigned bc_swizzle = V_008F20_BC_SWIZZLE_XYZW;
@@ -450,7 +449,7 @@ si_make_texture_descriptor(struct radv_device *device,
state[7] = 0;
if (device->physical_device->rad_info.chip_class >= GFX9) {
unsigned bc_swizzle = gfx9_border_color_swizzle(swizzle);
unsigned bc_swizzle = gfx9_border_color_swizzle(desc->swizzle);
/* Depth is the last accessible layer on Gfx9.
* The hw doesn't need to know the total number of layers.
@@ -905,29 +904,34 @@ radv_image_create(VkDevice _device,
image->size = image->surface.surf_size;
image->alignment = image->surface.surf_alignment;
/* Try to enable DCC first. */
if (radv_image_can_enable_dcc(image)) {
radv_image_alloc_dcc(image);
} else {
/* When DCC cannot be enabled, try CMASK. */
image->surface.dcc_size = 0;
if (radv_image_can_enable_cmask(image)) {
radv_image_alloc_cmask(device, image);
}
}
/* Try to enable FMASK for multisampled images. */
if (radv_image_can_enable_fmask(image)) {
radv_image_alloc_fmask(device, image);
} else {
/* Otherwise, try to enable HTILE for depth surfaces. */
if (radv_image_can_enable_htile(image) &&
!(device->instance->debug_flags & RADV_DEBUG_NO_HIZ)) {
radv_image_alloc_htile(image);
image->tc_compatible_htile = image->surface.flags & RADEON_SURF_TC_COMPATIBLE_HTILE;
if (!create_info->no_metadata_planes) {
/* Try to enable DCC first. */
if (radv_image_can_enable_dcc(image)) {
radv_image_alloc_dcc(image);
} else {
image->surface.htile_size = 0;
/* When DCC cannot be enabled, try CMASK. */
image->surface.dcc_size = 0;
if (radv_image_can_enable_cmask(image)) {
radv_image_alloc_cmask(device, image);
}
}
/* Try to enable FMASK for multisampled images. */
if (radv_image_can_enable_fmask(image)) {
radv_image_alloc_fmask(device, image);
} else {
/* Otherwise, try to enable HTILE for depth surfaces. */
if (radv_image_can_enable_htile(image) &&
!(device->instance->debug_flags & RADV_DEBUG_NO_HIZ)) {
radv_image_alloc_htile(image);
image->tc_compatible_htile = image->surface.flags & RADEON_SURF_TC_COMPATIBLE_HTILE;
} else {
image->surface.htile_size = 0;
}
}
} else {
image->surface.dcc_size = 0;
image->surface.htile_size = 0;
}
if (pCreateInfo->flags & VK_IMAGE_CREATE_SPARSE_BINDING_BIT) {
@@ -1048,55 +1052,10 @@ radv_image_view_init(struct radv_image_view *iview,
}
if (iview->vk_format != image->vk_format) {
unsigned view_bw = vk_format_get_blockwidth(iview->vk_format);
unsigned view_bh = vk_format_get_blockheight(iview->vk_format);
unsigned img_bw = vk_format_get_blockwidth(image->vk_format);
unsigned img_bh = vk_format_get_blockheight(image->vk_format);
iview->extent.width = round_up_u32(iview->extent.width * view_bw, img_bw);
iview->extent.height = round_up_u32(iview->extent.height * view_bh, img_bh);
/* Comment ported from amdvlk -
* If we have the following image:
* Uncompressed pixels Compressed block sizes (4x4)
* mip0: 22 x 22 6 x 6
* mip1: 11 x 11 3 x 3
* mip2: 5 x 5 2 x 2
* mip3: 2 x 2 1 x 1
* mip4: 1 x 1 1 x 1
*
* On GFX9 the descriptor is always programmed with the WIDTH and HEIGHT of the base level and the HW is
* calculating the degradation of the block sizes down the mip-chain as follows (straight-up
* divide-by-two integer math):
* mip0: 6x6
* mip1: 3x3
* mip2: 1x1
* mip3: 1x1
*
* This means that mip2 will be missing texels.
*
* Fix this by calculating the base mip's width and height, then convert that, and round it
* back up to get the level 0 size.
* Clamp the converted size between the original values, and next power of two, which
* means we don't oversize the image.
*/
if (device->physical_device->rad_info.chip_class >= GFX9 &&
vk_format_is_compressed(image->vk_format) &&
!vk_format_is_compressed(iview->vk_format)) {
unsigned rounded_img_w = util_next_power_of_two(iview->extent.width);
unsigned rounded_img_h = util_next_power_of_two(iview->extent.height);
unsigned lvl_width = radv_minify(image->info.width , range->baseMipLevel);
unsigned lvl_height = radv_minify(image->info.height, range->baseMipLevel);
lvl_width = round_up_u32(lvl_width * view_bw, img_bw);
lvl_height = round_up_u32(lvl_height * view_bh, img_bh);
lvl_width <<= range->baseMipLevel;
lvl_height <<= range->baseMipLevel;
iview->extent.width = CLAMP(lvl_width, iview->extent.width, rounded_img_w);
iview->extent.height = CLAMP(lvl_height, iview->extent.height, rounded_img_h);
}
iview->extent.width = round_up_u32(iview->extent.width * vk_format_get_blockwidth(iview->vk_format),
vk_format_get_blockwidth(image->vk_format));
iview->extent.height = round_up_u32(iview->extent.height * vk_format_get_blockheight(iview->vk_format),
vk_format_get_blockheight(image->vk_format));
}
iview->base_layer = range->baseArrayLayer;
@@ -1160,6 +1119,15 @@ radv_CreateImage(VkDevice device,
const VkAllocationCallbacks *pAllocator,
VkImage *pImage)
{
#ifdef ANDROID
const VkNativeBufferANDROID *gralloc_info =
vk_find_struct_const(pCreateInfo->pNext, NATIVE_BUFFER_ANDROID);
if (gralloc_info)
return radv_image_from_gralloc(device, pCreateInfo, gralloc_info,
pAllocator, pImage);
#endif
return radv_image_create(device,
&(struct radv_image_create_info) {
.vk_info = pCreateInfo,
@@ -1182,6 +1150,9 @@ radv_DestroyImage(VkDevice _device, VkImage _image,
if (image->flags & VK_IMAGE_CREATE_SPARSE_BINDING_BIT)
device->ws->buffer_destroy(image->bo);
if (image->owned_memory != VK_NULL_HANDLE)
radv_FreeMemory(_device, image->owned_memory, pAllocator);
vk_free2(&device->alloc, pAllocator, image);
}


@@ -377,9 +377,9 @@ fail_resolve_fragment:
fail_resolve_compute:
radv_device_finish_meta_fast_clear_flush_state(device);
fail_fast_clear:
radv_device_finish_meta_query_state(device);
fail_query:
radv_device_finish_meta_buffer_state(device);
fail_query:
radv_device_finish_meta_query_state(device);
fail_buffer:
radv_device_finish_meta_depth_decomp_state(device);
fail_depth_decomp:
@@ -533,7 +533,7 @@ void radv_meta_build_resolve_shader_core(nir_builder *b,
nir_ssa_dest_init(&tex_all_same->instr, &tex_all_same->dest, 1, 32, "tex");
nir_builder_instr_insert(b, &tex_all_same->instr);
nir_ssa_def *all_same = nir_ieq(b, &tex_all_same->dest.ssa, nir_imm_int(b, 0));
nir_ssa_def *all_same = nir_ine(b, &tex_all_same->dest.ssa, nir_imm_int(b, 0));
nir_if *if_stmt = nir_if_create(b->shader);
if_stmt->condition = nir_src_for_ssa(all_same);
nir_cf_node_insert(b->cursor, &if_stmt->cf_node);


@@ -109,7 +109,6 @@ struct radv_meta_blit2d_surf {
unsigned level;
unsigned layer;
VkImageAspectFlags aspect_mask;
VkImageLayout current_layout;
};
struct radv_meta_blit2d_buffer {


@@ -265,14 +265,12 @@ static void
meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *src_image,
struct radv_image_view *src_iview,
VkImageLayout src_image_layout,
VkOffset3D src_offset_0,
VkOffset3D src_offset_1,
struct radv_image *dest_image,
struct radv_image_view *dest_iview,
VkImageLayout dest_image_layout,
VkOffset2D dest_offset_0,
VkOffset2D dest_offset_1,
VkOffset3D dest_offset_0,
VkOffset3D dest_offset_1,
VkRect2D dest_box,
VkFilter blit_filter)
{
@@ -353,12 +351,11 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
}
break;
}
case VK_IMAGE_ASPECT_DEPTH_BIT: {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dest_image_layout);
case VK_IMAGE_ASPECT_DEPTH_BIT:
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit.depth_only_rp[ds_layout],
.renderPass = device->meta_state.blit.depth_only_rp,
.framebuffer = fb,
.renderArea = {
.offset = { dest_box.offset.x, dest_box.offset.y },
@@ -381,13 +378,11 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
unreachable(!"bad VkImageType");
}
break;
}
case VK_IMAGE_ASPECT_STENCIL_BIT: {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dest_image_layout);
case VK_IMAGE_ASPECT_STENCIL_BIT:
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit.stencil_only_rp[ds_layout],
.renderPass = device->meta_state.blit.stencil_only_rp,
.framebuffer = fb,
.renderArea = {
.offset = { dest_box.offset.x, dest_box.offset.y },
@@ -410,7 +405,6 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
unreachable(!"bad VkImageType");
}
break;
}
default:
unreachable(!"bad VkImageType");
}
@@ -524,6 +518,21 @@ void radv_CmdBlitImage(
for (unsigned r = 0; r < regionCount; r++) {
const VkImageSubresourceLayers *src_res = &pRegions[r].srcSubresource;
const VkImageSubresourceLayers *dst_res = &pRegions[r].dstSubresource;
struct radv_image_view src_iview;
radv_image_view_init(&src_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = srcImage,
.viewType = radv_meta_get_view_type(src_image),
.format = src_image->vk_format,
.subresourceRange = {
.aspectMask = src_res->aspectMask,
.baseMipLevel = src_res->mipLevel,
.levelCount = 1,
.baseArrayLayer = src_res->baseArrayLayer,
.layerCount = 1
},
});
unsigned dst_start, dst_end;
if (dest_image->type == VK_IMAGE_TYPE_3D) {
@@ -570,17 +579,18 @@ void radv_CmdBlitImage(
dest_box.extent.width = abs(dst_x1 - dst_x0);
dest_box.extent.height = abs(dst_y1 - dst_y0);
struct radv_image_view dest_iview;
const unsigned num_layers = dst_end - dst_start;
for (unsigned i = 0; i < num_layers; i++) {
struct radv_image_view dest_iview, src_iview;
const VkOffset2D dest_offset_0 = {
const VkOffset3D dest_offset_0 = {
.x = dst_x0,
.y = dst_y0,
.z = dst_start + i ,
};
const VkOffset2D dest_offset_1 = {
const VkOffset3D dest_offset_1 = {
.x = dst_x1,
.y = dst_y1,
.z = dst_start + i ,
};
VkOffset3D src_offset_0 = {
.x = src_x0,
@@ -592,10 +602,9 @@ void radv_CmdBlitImage(
.y = src_y1,
.z = src_start + i * src_z_step,
};
const uint32_t dest_array_slice = dst_start + i;
/* 3D images have just 1 layer */
const uint32_t src_array_slice = src_image->type == VK_IMAGE_TYPE_3D ? 0 : src_start + i;
const uint32_t dest_array_slice =
radv_meta_get_iview_layer(dest_image, dst_res,
&dest_offset_0);
radv_image_view_init(&dest_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
@@ -611,24 +620,10 @@ void radv_CmdBlitImage(
.layerCount = 1
},
});
radv_image_view_init(&src_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = srcImage,
.viewType = radv_meta_get_view_type(src_image),
.format = src_image->vk_format,
.subresourceRange = {
.aspectMask = src_res->aspectMask,
.baseMipLevel = src_res->mipLevel,
.levelCount = 1,
.baseArrayLayer = src_array_slice,
.layerCount = 1
},
});
meta_emit_blit(cmd_buffer,
src_image, &src_iview, srcImageLayout,
src_image, &src_iview,
src_offset_0, src_offset_1,
dest_image, &dest_iview, destImageLayout,
dest_image, &dest_iview,
dest_offset_0, dest_offset_1,
dest_box,
filter);
@@ -658,13 +653,8 @@ radv_device_finish_meta_blit_state(struct radv_device *device)
&state->alloc);
}
for (enum radv_blit_ds_layout i = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; i < RADV_BLIT_DS_LAYOUT_COUNT; i++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit.depth_only_rp[i], &state->alloc);
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit.stencil_only_rp[i], &state->alloc);
}
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit.depth_only_rp, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit.depth_only_1d_pipeline, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
@@ -672,6 +662,8 @@ radv_device_finish_meta_blit_state(struct radv_device *device)
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit.depth_only_3d_pipeline, &state->alloc);
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit.stencil_only_rp, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit.stencil_only_1d_pipeline,
&state->alloc);
@@ -682,7 +674,6 @@ radv_device_finish_meta_blit_state(struct radv_device *device)
state->blit.stencil_only_3d_pipeline,
&state->alloc);
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->blit.pipeline_layout, &state->alloc);
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
@@ -876,38 +867,35 @@ radv_device_init_meta_blit_depth(struct radv_device *device,
fs_2d.nir = build_nir_copy_fragment_shader_depth(GLSL_SAMPLER_DIM_2D);
fs_3d.nir = build_nir_copy_fragment_shader_depth(GLSL_SAMPLER_DIM_3D);
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = &(VkAttachmentDescription) {
.format = VK_FORMAT_D32_SFLOAT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = layout,
.finalLayout = layout,
},
.format = VK_FORMAT_D32_SFLOAT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = VK_IMAGE_LAYOUT_GENERAL,
.finalLayout = VK_IMAGE_LAYOUT_GENERAL,
},
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = layout,
},
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit.depth_only_rp[ds_layout]);
if (result != VK_SUCCESS)
goto fail;
}
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = VK_IMAGE_LAYOUT_GENERAL,
},
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit.depth_only_rp);
if (result != VK_SUCCESS)
goto fail;
VkPipelineVertexInputStateCreateInfo vi_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
@@ -987,7 +975,7 @@ radv_device_init_meta_blit_depth(struct radv_device *device,
},
.flags = 0,
.layout = device->meta_state.blit.pipeline_layout,
.renderPass = device->meta_state.blit.depth_only_rp[0],
.renderPass = device->meta_state.blit.depth_only_rp,
.subpass = 0,
};
@@ -1037,36 +1025,33 @@ radv_device_init_meta_blit_stencil(struct radv_device *device,
fs_2d.nir = build_nir_copy_fragment_shader_stencil(GLSL_SAMPLER_DIM_2D);
fs_3d.nir = build_nir_copy_fragment_shader_stencil(GLSL_SAMPLER_DIM_3D);
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = &(VkAttachmentDescription) {
.format = VK_FORMAT_S8_UINT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = layout,
.finalLayout = layout,
},
.format = VK_FORMAT_S8_UINT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = VK_IMAGE_LAYOUT_GENERAL,
.finalLayout = VK_IMAGE_LAYOUT_GENERAL,
},
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = layout,
},
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = VK_IMAGE_LAYOUT_GENERAL,
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit.stencil_only_rp[ds_layout]);
}
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit.stencil_only_rp);
if (result != VK_SUCCESS)
goto fail;
@@ -1150,6 +1135,7 @@ radv_device_init_meta_blit_stencil(struct radv_device *device,
},
.depthCompareOp = VK_COMPARE_OP_ALWAYS,
},
.pDynamicState = &(VkPipelineDynamicStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
.dynamicStateCount = 6,
@@ -1164,7 +1150,7 @@ radv_device_init_meta_blit_stencil(struct radv_device *device,
},
.flags = 0,
.layout = device->meta_state.blit.pipeline_layout,
.renderPass = device->meta_state.blit.stencil_only_rp[0],
.renderPass = device->meta_state.blit.stencil_only_rp,
.subpass = 0,
};
@@ -1196,7 +1182,6 @@ radv_device_init_meta_blit_stencil(struct radv_device *device,
if (result != VK_SUCCESS)
goto fail;
fail:
ralloc_free(fs_1d.nir);
ralloc_free(fs_2d.nir);


@@ -30,7 +30,6 @@
enum blit2d_src_type {
BLIT2D_SRC_TYPE_IMAGE,
BLIT2D_SRC_TYPE_IMAGE_3D,
BLIT2D_SRC_TYPE_BUFFER,
BLIT2D_NUM_SRC_TYPES,
};
@@ -42,8 +41,6 @@ create_iview(struct radv_cmd_buffer *cmd_buffer,
VkImageAspectFlagBits aspects)
{
VkFormat format;
VkImageViewType view_type = cmd_buffer->device->physical_device->rad_info.chip_class < GFX9 ? VK_IMAGE_VIEW_TYPE_2D :
radv_meta_get_view_type(surf->image);
if (depth_format)
format = depth_format;
@@ -54,7 +51,7 @@ create_iview(struct radv_cmd_buffer *cmd_buffer,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(surf->image),
.viewType = view_type,
.viewType = VK_IMAGE_VIEW_TYPE_2D,
.format = format,
.subresourceRange = {
.aspectMask = aspects,
@@ -129,12 +126,6 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
} else {
create_iview(cmd_buffer, src_img, &tmp->iview, depth_format, aspects);
if (src_type == BLIT2D_SRC_TYPE_IMAGE_3D)
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4,
&src_img->layer);
radv_meta_push_descriptor_set(cmd_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.blit2d.p_layouts[src_type],
0, /* set */
@@ -278,11 +269,10 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
bind_pipeline(cmd_buffer, src_type, fs_key);
} else if (aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT) {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dst->current_layout);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.depth_only_rp[ds_layout],
.renderPass = device->meta_state.blit2d.depth_only_rp,
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -296,11 +286,10 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
bind_depth_pipeline(cmd_buffer, src_type);
} else if (aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dst->current_layout);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.stencil_only_rp[ds_layout],
.renderPass = device->meta_state.blit2d.stencil_only_rp,
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -352,10 +341,8 @@ radv_meta_blit2d(struct radv_cmd_buffer *cmd_buffer,
unsigned num_rects,
struct radv_meta_blit2d_rect *rects)
{
bool use_3d = cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9 &&
(src_img && src_img->image->type == VK_IMAGE_TYPE_3D);
enum blit2d_src_type src_type = src_buf ? BLIT2D_SRC_TYPE_BUFFER :
use_3d ? BLIT2D_SRC_TYPE_IMAGE_3D : BLIT2D_SRC_TYPE_IMAGE;
BLIT2D_SRC_TYPE_IMAGE;
radv_meta_blit2d_normal_dst(cmd_buffer, src_img, src_buf, dst,
num_rects, rects, src_type);
}
@@ -420,46 +407,29 @@ build_nir_vertex_shader(void)
typedef nir_ssa_def* (*texel_fetch_build_func)(struct nir_builder *,
struct radv_device *,
nir_ssa_def *, bool);
nir_ssa_def *);
static nir_ssa_def *
build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
nir_ssa_def *tex_pos, bool is_3d)
nir_ssa_def *tex_pos)
{
enum glsl_sampler_dim dim = is_3d ? GLSL_SAMPLER_DIM_3D : GLSL_SAMPLER_DIM_2D;
const struct glsl_type *sampler_type =
glsl_sampler_type(dim, false, false, GLSL_TYPE_UINT);
glsl_sampler_type(GLSL_SAMPLER_DIM_2D, false, false, GLSL_TYPE_UINT);
nir_variable *sampler = nir_variable_create(b->shader, nir_var_uniform,
sampler_type, "s_tex");
sampler->data.descriptor_set = 0;
sampler->data.binding = 0;
nir_ssa_def *tex_pos_3d = NULL;
if (is_3d) {
nir_intrinsic_instr *layer = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(layer, 16);
nir_intrinsic_set_range(layer, 4);
layer->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
layer->num_components = 1;
nir_ssa_dest_init(&layer->instr, &layer->dest, 1, 32, "layer");
nir_builder_instr_insert(b, &layer->instr);
nir_ssa_def *chans[3];
chans[0] = nir_channel(b, tex_pos, 0);
chans[1] = nir_channel(b, tex_pos, 1);
chans[2] = &layer->dest.ssa;
tex_pos_3d = nir_vec(b, chans, 3);
}
nir_tex_instr *tex = nir_tex_instr_create(b->shader, 2);
tex->sampler_dim = dim;
tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
tex->op = nir_texop_txf;
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(is_3d ? tex_pos_3d : tex_pos);
tex->src[0].src = nir_src_for_ssa(tex_pos);
tex->src[1].src_type = nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(nir_imm_int(b, 0));
tex->dest_type = nir_type_uint;
tex->is_array = false;
tex->coord_components = is_3d ? 3 : 2;
tex->coord_components = 2;
tex->texture = nir_deref_var_create(tex, sampler);
tex->sampler = NULL;
@@ -472,7 +442,7 @@ build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
static nir_ssa_def *
build_nir_buffer_fetch(struct nir_builder *b, struct radv_device *device,
nir_ssa_def *tex_pos, bool is_3d)
nir_ssa_def *tex_pos)
{
const struct glsl_type *sampler_type =
glsl_sampler_type(GLSL_SAMPLER_DIM_BUF, false, false, GLSL_TYPE_UINT);
@@ -520,7 +490,7 @@ static const VkPipelineVertexInputStateCreateInfo normal_vi_create_info = {
static nir_shader *
build_nir_copy_fragment_shader(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -541,7 +511,7 @@ build_nir_copy_fragment_shader(struct radv_device *device,
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos);
nir_store_var(&b, color_out, color, 0xf);
return b.shader;
@@ -549,7 +519,7 @@ build_nir_copy_fragment_shader(struct radv_device *device,
static nir_shader *
build_nir_copy_fragment_shader_depth(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -570,7 +540,7 @@ build_nir_copy_fragment_shader_depth(struct radv_device *device,
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos);
nir_store_var(&b, color_out, color, 0x1);
return b.shader;
@@ -578,7 +548,7 @@ build_nir_copy_fragment_shader_depth(struct radv_device *device,
static nir_shader *
build_nir_copy_fragment_shader_stencil(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -599,7 +569,7 @@ build_nir_copy_fragment_shader_stencil(struct radv_device *device,
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos);
nir_store_var(&b, color_out, color, 0x1);
return b.shader;
@@ -616,12 +586,10 @@ radv_device_finish_meta_blit2d_state(struct radv_device *device)
&state->alloc);
}
for (enum radv_blit_ds_layout j = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; j < RADV_BLIT_DS_LAYOUT_COUNT; j++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.depth_only_rp[j], &state->alloc);
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.stencil_only_rp[j], &state->alloc);
}
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.depth_only_rp, &state->alloc);
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.stencil_only_rp, &state->alloc);
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
radv_DestroyPipelineLayout(radv_device_to_handle(device),
@@ -661,10 +629,6 @@ blit2d_init_color_pipeline(struct radv_device *device,
src_func = build_nir_texel_fetch;
name = "meta_blit2d_image_fs";
break;
case BLIT2D_SRC_TYPE_IMAGE_3D:
src_func = build_nir_texel_fetch;
name = "meta_blit3d_image_fs";
break;
case BLIT2D_SRC_TYPE_BUFFER:
src_func = build_nir_buffer_fetch;
name = "meta_blit2d_buffer_fs";
@@ -678,7 +642,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader(device, src_func, name);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -824,10 +788,6 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
src_func = build_nir_texel_fetch;
name = "meta_blit2d_depth_image_fs";
break;
case BLIT2D_SRC_TYPE_IMAGE_3D:
src_func = build_nir_texel_fetch;
name = "meta_blit3d_depth_image_fs";
break;
case BLIT2D_SRC_TYPE_BUFFER:
src_func = build_nir_buffer_fetch;
name = "meta_blit2d_depth_buffer_fs";
@@ -840,7 +800,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
const VkPipelineVertexInputStateCreateInfo *vi_create_info;
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader_depth(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader_depth(device, src_func, name);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -863,37 +823,34 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
},
};
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
if (!device->meta_state.blit2d.depth_only_rp[ds_layout]) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
if (!device->meta_state.blit2d.depth_only_rp) {
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = &(VkAttachmentDescription) {
.format = VK_FORMAT_D32_SFLOAT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = layout,
.finalLayout = layout,
},
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = layout,
},
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.depth_only_rp[ds_layout]);
}
.format = VK_FORMAT_D32_SFLOAT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
.finalLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
},
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
},
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.depth_only_rp);
}
const VkGraphicsPipelineCreateInfo vk_pipeline_info = {
@@ -952,7 +909,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.depth_only_rp[0],
.renderPass = device->meta_state.blit2d.depth_only_rp,
.subpass = 0,
};
@@ -986,10 +943,6 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
src_func = build_nir_texel_fetch;
name = "meta_blit2d_stencil_image_fs";
break;
case BLIT2D_SRC_TYPE_IMAGE_3D:
src_func = build_nir_texel_fetch;
name = "meta_blit3d_stencil_image_fs";
break;
case BLIT2D_SRC_TYPE_BUFFER:
src_func = build_nir_buffer_fetch;
name = "meta_blit2d_stencil_buffer_fs";
@@ -1002,7 +955,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
const VkPipelineVertexInputStateCreateInfo *vi_create_info;
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader_stencil(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader_stencil(device, src_func, name);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -1025,37 +978,34 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
},
};
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
if (!device->meta_state.blit2d.stencil_only_rp[ds_layout]) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
if (!device->meta_state.blit2d.stencil_only_rp) {
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = &(VkAttachmentDescription) {
.format = VK_FORMAT_S8_UINT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = layout,
.finalLayout = layout,
},
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = layout,
},
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.stencil_only_rp[ds_layout]);
}
.format = VK_FORMAT_S8_UINT,
.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.storeOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
.finalLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
},
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
},
.preserveAttachmentCount = 1,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.stencil_only_rp);
}
const VkGraphicsPipelineCreateInfo vk_pipeline_info = {
@@ -1130,7 +1080,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.stencil_only_rp[0],
.renderPass = device->meta_state.blit2d.stencil_only_rp,
.subpass = 0,
};
@@ -1170,7 +1120,6 @@ VkResult
radv_device_init_meta_blit2d_state(struct radv_device *device)
{
VkResult result;
bool create_3d = device->physical_device->rad_info.chip_class >= GFX9;
const VkPushConstantRange push_constant_ranges[] = {
{VK_SHADER_STAGE_VERTEX_BIT, 0, 16},
@@ -1206,37 +1155,6 @@ radv_device_init_meta_blit2d_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (create_3d) {
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&(VkDescriptorSetLayoutCreateInfo) {
.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
.flags = VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR,
.bindingCount = 1,
.pBindings = (VkDescriptorSetLayoutBinding[]) {
{
.binding = 0,
.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT,
.pImmutableSamplers = NULL
},
}
}, &device->meta_state.alloc, &device->meta_state.blit2d.ds_layouts[BLIT2D_SRC_TYPE_IMAGE_3D]);
if (result != VK_SUCCESS)
goto fail;
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&(VkPipelineLayoutCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.blit2d.ds_layouts[BLIT2D_SRC_TYPE_IMAGE_3D],
.pushConstantRangeCount = 2,
.pPushConstantRanges = push_constant_ranges,
},
&device->meta_state.alloc, &device->meta_state.blit2d.p_layouts[BLIT2D_SRC_TYPE_IMAGE_3D]);
if (result != VK_SUCCESS)
goto fail;
}
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&(VkDescriptorSetLayoutCreateInfo) {
.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
@@ -1269,8 +1187,6 @@ radv_device_init_meta_blit2d_state(struct radv_device *device)
goto fail;
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
if (src == BLIT2D_SRC_TYPE_IMAGE_3D && !create_3d)
continue;
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
result = blit2d_init_color_pipeline(device, src, pipeline_formats[j]);
if (result != VK_SUCCESS)

@@ -29,15 +29,11 @@
* Compute queue: implementation also of buffer->image, image->image, and image clear.
*/
/* GFX9 needs to use a 3D sampler to access 3D resources, so the shader has the options
* for that.
*/
static nir_shader *
build_nir_itob_compute_shader(struct radv_device *dev, bool is_3d)
build_nir_itob_compute_shader(struct radv_device *dev)
{
nir_builder b;
enum glsl_sampler_dim dim = is_3d ? GLSL_SAMPLER_DIM_3D : GLSL_SAMPLER_DIM_2D;
const struct glsl_type *sampler_type = glsl_sampler_type(dim,
const struct glsl_type *sampler_type = glsl_sampler_type(GLSL_SAMPLER_DIM_2D,
false,
false,
GLSL_TYPE_FLOAT);
@@ -46,7 +42,7 @@ build_nir_itob_compute_shader(struct radv_device *dev, bool is_3d)
false,
GLSL_TYPE_FLOAT);
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, is_3d ? "meta_itob_cs_3d" : "meta_itob_cs");
b.shader->info.name = ralloc_strdup(b.shader, "meta_itob_cs");
b.shader->info.cs.local_size[0] = 16;
b.shader->info.cs.local_size[1] = 16;
b.shader->info.cs.local_size[2] = 1;
@@ -73,31 +69,32 @@ build_nir_itob_compute_shader(struct radv_device *dev, bool is_3d)
nir_intrinsic_instr *offset = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(offset, 0);
nir_intrinsic_set_range(offset, 16);
nir_intrinsic_set_range(offset, 12);
offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
offset->num_components = is_3d ? 3 : 2;
nir_ssa_dest_init(&offset->instr, &offset->dest, is_3d ? 3 : 2, 32, "offset");
offset->num_components = 2;
nir_ssa_dest_init(&offset->instr, &offset->dest, 2, 32, "offset");
nir_builder_instr_insert(&b, &offset->instr);
nir_intrinsic_instr *stride = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(stride, 0);
nir_intrinsic_set_range(stride, 16);
stride->src[0] = nir_src_for_ssa(nir_imm_int(&b, 12));
nir_intrinsic_set_range(stride, 12);
stride->src[0] = nir_src_for_ssa(nir_imm_int(&b, 8));
stride->num_components = 1;
nir_ssa_dest_init(&stride->instr, &stride->dest, 1, 32, "stride");
nir_builder_instr_insert(&b, &stride->instr);
nir_ssa_def *img_coord = nir_iadd(&b, global_id, &offset->dest.ssa);
nir_tex_instr *tex = nir_tex_instr_create(b.shader, 2);
tex->sampler_dim = dim;
tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
tex->op = nir_texop_txf;
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(nir_channels(&b, img_coord, is_3d ? 0x7 : 0x3));
tex->src[0].src = nir_src_for_ssa(nir_channels(&b, img_coord, 0x3));
tex->src[1].src_type = nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(nir_imm_int(&b, 0));
tex->dest_type = nir_type_float;
tex->is_array = false;
tex->coord_components = is_3d ? 3 : 2;
tex->coord_components = 2;
tex->texture = nir_deref_var_create(tex, input_img);
tex->sampler = NULL;
@@ -129,11 +126,8 @@ radv_device_init_meta_itob_state(struct radv_device *device)
{
VkResult result;
struct radv_shader_module cs = { .nir = NULL };
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_itob_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_itob_compute_shader(device, true);
cs.nir = build_nir_itob_compute_shader(device);
/*
* two descriptors one for the image being sampled
@@ -174,7 +168,7 @@ radv_device_init_meta_itob_state(struct radv_device *device)
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.itob.img_ds_layout,
.pushConstantRangeCount = 1,
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 16},
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 12},
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
@@ -208,36 +202,10 @@ radv_device_init_meta_itob_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (device->physical_device->rad_info.chip_class >= GFX9) {
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&cs_3d),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo vk_pipeline_info_3d = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = pipeline_shader_stage_3d,
.flags = 0,
.layout = device->meta_state.itob.img_p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &vk_pipeline_info_3d, NULL,
&device->meta_state.itob.pipeline_3d);
if (result != VK_SUCCESS)
goto fail;
ralloc_free(cs_3d.nir);
}
ralloc_free(cs.nir);
return VK_SUCCESS;
fail:
ralloc_free(cs.nir);
ralloc_free(cs_3d.nir);
return result;
}
@@ -253,26 +221,22 @@ radv_device_finish_meta_itob_state(struct radv_device *device)
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->itob.pipeline, &state->alloc);
if (device->physical_device->rad_info.chip_class >= GFX9)
radv_DestroyPipeline(radv_device_to_handle(device),
state->itob.pipeline_3d, &state->alloc);
}
static nir_shader *
build_nir_btoi_compute_shader(struct radv_device *dev, bool is_3d)
build_nir_btoi_compute_shader(struct radv_device *dev)
{
nir_builder b;
enum glsl_sampler_dim dim = is_3d ? GLSL_SAMPLER_DIM_3D : GLSL_SAMPLER_DIM_2D;
const struct glsl_type *buf_type = glsl_sampler_type(GLSL_SAMPLER_DIM_BUF,
false,
false,
GLSL_TYPE_FLOAT);
const struct glsl_type *img_type = glsl_sampler_type(dim,
const struct glsl_type *img_type = glsl_sampler_type(GLSL_SAMPLER_DIM_2D,
false,
false,
GLSL_TYPE_FLOAT);
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, is_3d ? "meta_btoi_cs_3d" : "meta_btoi_cs");
b.shader->info.name = ralloc_strdup(b.shader, "meta_btoi_cs");
b.shader->info.cs.local_size[0] = 16;
b.shader->info.cs.local_size[1] = 16;
b.shader->info.cs.local_size[2] = 1;
@@ -297,16 +261,16 @@ build_nir_btoi_compute_shader(struct radv_device *dev, bool is_3d)
nir_intrinsic_instr *offset = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(offset, 0);
nir_intrinsic_set_range(offset, 16);
nir_intrinsic_set_range(offset, 12);
offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
offset->num_components = is_3d ? 3 : 2;
nir_ssa_dest_init(&offset->instr, &offset->dest, is_3d ? 3 : 2, 32, "offset");
offset->num_components = 2;
nir_ssa_dest_init(&offset->instr, &offset->dest, 2, 32, "offset");
nir_builder_instr_insert(&b, &offset->instr);
nir_intrinsic_instr *stride = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(stride, 0);
nir_intrinsic_set_range(stride, 16);
stride->src[0] = nir_src_for_ssa(nir_imm_int(&b, 12));
nir_intrinsic_set_range(stride, 12);
stride->src[0] = nir_src_for_ssa(nir_imm_int(&b, 8));
stride->num_components = 1;
nir_ssa_dest_init(&stride->instr, &stride->dest, 1, 32, "stride");
nir_builder_instr_insert(&b, &stride->instr);
@@ -354,10 +318,9 @@ radv_device_init_meta_btoi_state(struct radv_device *device)
{
VkResult result;
struct radv_shader_module cs = { .nir = NULL };
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_btoi_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_btoi_compute_shader(device, true);
cs.nir = build_nir_btoi_compute_shader(device);
/*
* two descriptors one for the image being sampled
* one for the buffer being written.
@@ -397,7 +360,7 @@ radv_device_init_meta_btoi_state(struct radv_device *device)
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.btoi.img_ds_layout,
.pushConstantRangeCount = 1,
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 16},
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 12},
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
@@ -431,33 +394,9 @@ radv_device_init_meta_btoi_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (device->physical_device->rad_info.chip_class >= GFX9) {
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&cs_3d),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo vk_pipeline_info_3d = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = pipeline_shader_stage_3d,
.flags = 0,
.layout = device->meta_state.btoi.img_p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &vk_pipeline_info_3d, NULL,
&device->meta_state.btoi.pipeline_3d);
ralloc_free(cs_3d.nir);
}
ralloc_free(cs.nir);
return VK_SUCCESS;
fail:
ralloc_free(cs_3d.nir);
ralloc_free(cs.nir);
return result;
}
@@ -474,25 +413,22 @@ radv_device_finish_meta_btoi_state(struct radv_device *device)
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->btoi.pipeline, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->btoi.pipeline_3d, &state->alloc);
}
static nir_shader *
build_nir_itoi_compute_shader(struct radv_device *dev, bool is_3d)
build_nir_itoi_compute_shader(struct radv_device *dev)
{
nir_builder b;
enum glsl_sampler_dim dim = is_3d ? GLSL_SAMPLER_DIM_3D : GLSL_SAMPLER_DIM_2D;
const struct glsl_type *buf_type = glsl_sampler_type(dim,
const struct glsl_type *buf_type = glsl_sampler_type(GLSL_SAMPLER_DIM_2D,
false,
false,
GLSL_TYPE_FLOAT);
const struct glsl_type *img_type = glsl_sampler_type(dim,
const struct glsl_type *img_type = glsl_sampler_type(GLSL_SAMPLER_DIM_2D,
false,
false,
GLSL_TYPE_FLOAT);
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, is_3d ? "meta_itoi_cs_3d" : "meta_itoi_cs");
b.shader->info.name = ralloc_strdup(b.shader, "meta_itoi_cs");
b.shader->info.cs.local_size[0] = 16;
b.shader->info.cs.local_size[1] = 16;
b.shader->info.cs.local_size[2] = 1;
@@ -517,18 +453,18 @@ build_nir_itoi_compute_shader(struct radv_device *dev, bool is_3d)
nir_intrinsic_instr *src_offset = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(src_offset, 0);
nir_intrinsic_set_range(src_offset, 24);
nir_intrinsic_set_range(src_offset, 16);
src_offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
src_offset->num_components = is_3d ? 3 : 2;
nir_ssa_dest_init(&src_offset->instr, &src_offset->dest, is_3d ? 3 : 2, 32, "src_offset");
src_offset->num_components = 2;
nir_ssa_dest_init(&src_offset->instr, &src_offset->dest, 2, 32, "src_offset");
nir_builder_instr_insert(&b, &src_offset->instr);
nir_intrinsic_instr *dst_offset = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(dst_offset, 0);
nir_intrinsic_set_range(dst_offset, 24);
dst_offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 12));
dst_offset->num_components = is_3d ? 3 : 2;
nir_ssa_dest_init(&dst_offset->instr, &dst_offset->dest, is_3d ? 3 : 2, 32, "dst_offset");
nir_intrinsic_set_range(dst_offset, 16);
dst_offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 8));
dst_offset->num_components = 2;
nir_ssa_dest_init(&dst_offset->instr, &dst_offset->dest, 2, 32, "dst_offset");
nir_builder_instr_insert(&b, &dst_offset->instr);
nir_ssa_def *src_coord = nir_iadd(&b, global_id, &src_offset->dest.ssa);
@@ -536,15 +472,15 @@ build_nir_itoi_compute_shader(struct radv_device *dev, bool is_3d)
nir_ssa_def *dst_coord = nir_iadd(&b, global_id, &dst_offset->dest.ssa);
nir_tex_instr *tex = nir_tex_instr_create(b.shader, 2);
tex->sampler_dim = dim;
tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
tex->op = nir_texop_txf;
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(nir_channels(&b, src_coord, is_3d ? 0x7 : 0x3));
tex->src[0].src = nir_src_for_ssa(nir_channels(&b, src_coord, 3));
tex->src[1].src_type = nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(nir_imm_int(&b, 0));
tex->dest_type = nir_type_float;
tex->is_array = false;
tex->coord_components = is_3d ? 3 : 2;
tex->coord_components = 2;
tex->texture = nir_deref_var_create(tex, input_img);
tex->sampler = NULL;
@@ -568,10 +504,9 @@ radv_device_init_meta_itoi_state(struct radv_device *device)
{
VkResult result;
struct radv_shader_module cs = { .nir = NULL };
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_itoi_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_itoi_compute_shader(device, true);
cs.nir = build_nir_itoi_compute_shader(device);
/*
* two descriptors one for the image being sampled
* one for the buffer being written.
@@ -611,7 +546,7 @@ radv_device_init_meta_itoi_state(struct radv_device *device)
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.itoi.img_ds_layout,
.pushConstantRangeCount = 1,
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 24},
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 16},
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
@@ -645,35 +580,10 @@ radv_device_init_meta_itoi_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (device->physical_device->rad_info.chip_class >= GFX9) {
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&cs_3d),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo vk_pipeline_info_3d = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = pipeline_shader_stage_3d,
.flags = 0,
.layout = device->meta_state.itoi.img_p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &vk_pipeline_info_3d, NULL,
&device->meta_state.itoi.pipeline_3d);
ralloc_free(cs_3d.nir);
}
ralloc_free(cs.nir);
return VK_SUCCESS;
fail:
ralloc_free(cs.nir);
ralloc_free(cs_3d.nir);
return result;
}
@@ -689,22 +599,18 @@ radv_device_finish_meta_itoi_state(struct radv_device *device)
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->itoi.pipeline, &state->alloc);
if (device->physical_device->rad_info.chip_class >= GFX9)
radv_DestroyPipeline(radv_device_to_handle(device),
state->itoi.pipeline_3d, &state->alloc);
}
static nir_shader *
build_nir_cleari_compute_shader(struct radv_device *dev, bool is_3d)
build_nir_cleari_compute_shader(struct radv_device *dev)
{
nir_builder b;
enum glsl_sampler_dim dim = is_3d ? GLSL_SAMPLER_DIM_3D : GLSL_SAMPLER_DIM_2D;
const struct glsl_type *img_type = glsl_sampler_type(dim,
const struct glsl_type *img_type = glsl_sampler_type(GLSL_SAMPLER_DIM_2D,
false,
false,
GLSL_TYPE_FLOAT);
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, is_3d ? "meta_cleari_cs_3d" : "meta_cleari_cs");
b.shader->info.name = ralloc_strdup(b.shader, "meta_cleari_cs");
b.shader->info.cs.local_size[0] = 16;
b.shader->info.cs.local_size[1] = 16;
b.shader->info.cs.local_size[2] = 1;
@@ -725,29 +631,12 @@ build_nir_cleari_compute_shader(struct radv_device *dev, bool is_3d)
nir_intrinsic_instr *clear_val = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(clear_val, 0);
nir_intrinsic_set_range(clear_val, 20);
nir_intrinsic_set_range(clear_val, 16);
clear_val->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
clear_val->num_components = 4;
nir_ssa_dest_init(&clear_val->instr, &clear_val->dest, 4, 32, "clear_value");
nir_builder_instr_insert(&b, &clear_val->instr);
nir_intrinsic_instr *layer = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(layer, 0);
nir_intrinsic_set_range(layer, 20);
layer->src[0] = nir_src_for_ssa(nir_imm_int(&b, 16));
layer->num_components = 1;
nir_ssa_dest_init(&layer->instr, &layer->dest, 1, 32, "layer");
nir_builder_instr_insert(&b, &layer->instr);
nir_ssa_def *global_z = nir_iadd(&b, nir_channel(&b, global_id, 2), &layer->dest.ssa);
nir_ssa_def *comps[4];
comps[0] = nir_channel(&b, global_id, 0);
comps[1] = nir_channel(&b, global_id, 1);
comps[2] = global_z;
comps[3] = nir_imm_int(&b, 0);
global_id = nir_vec(&b, comps, 4);
nir_intrinsic_instr *store = nir_intrinsic_instr_create(b.shader, nir_intrinsic_image_store);
store->src[0] = nir_src_for_ssa(global_id);
store->src[1] = nir_src_for_ssa(nir_ssa_undef(&b, 1, 32));
@@ -763,10 +652,8 @@ radv_device_init_meta_cleari_state(struct radv_device *device)
{
VkResult result;
struct radv_shader_module cs = { .nir = NULL };
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_cleari_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_cleari_compute_shader(device, true);
cs.nir = build_nir_cleari_compute_shader(device);
/*
* two descriptors one for the image being sampled
@@ -800,7 +687,7 @@ radv_device_init_meta_cleari_state(struct radv_device *device)
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.cleari.img_ds_layout,
.pushConstantRangeCount = 1,
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 20},
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 16},
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
@@ -834,38 +721,10 @@ radv_device_init_meta_cleari_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (device->physical_device->rad_info.chip_class >= GFX9) {
/* compute shader */
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&cs_3d),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo vk_pipeline_info_3d = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = pipeline_shader_stage_3d,
.flags = 0,
.layout = device->meta_state.cleari.img_p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &vk_pipeline_info_3d, NULL,
&device->meta_state.cleari.pipeline_3d);
if (result != VK_SUCCESS)
goto fail;
ralloc_free(cs_3d.nir);
}
ralloc_free(cs.nir);
return VK_SUCCESS;
fail:
ralloc_free(cs.nir);
ralloc_free(cs_3d.nir);
return result;
}
@@ -881,8 +740,6 @@ radv_device_finish_meta_cleari_state(struct radv_device *device)
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->cleari.pipeline, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->cleari.pipeline_3d, &state->alloc);
}
void
@@ -901,23 +758,21 @@ radv_device_init_meta_bufimage_state(struct radv_device *device)
result = radv_device_init_meta_itob_state(device);
if (result != VK_SUCCESS)
goto fail_itob;
return result;
result = radv_device_init_meta_btoi_state(device);
if (result != VK_SUCCESS)
goto fail_btoi;
goto fail_itob;
result = radv_device_init_meta_itoi_state(device);
if (result != VK_SUCCESS)
goto fail_itoi;
goto fail_btoi;
result = radv_device_init_meta_cleari_state(device);
if (result != VK_SUCCESS)
goto fail_cleari;
goto fail_itoi;
return VK_SUCCESS;
fail_cleari:
radv_device_finish_meta_cleari_state(device);
fail_itoi:
radv_device_finish_meta_itoi_state(device);
fail_btoi:
@@ -932,13 +787,12 @@ create_iview(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *surf,
struct radv_image_view *iview)
{
VkImageViewType view_type = cmd_buffer->device->physical_device->rad_info.chip_class < GFX9 ? VK_IMAGE_VIEW_TYPE_2D :
radv_meta_get_view_type(surf->image);
radv_image_view_init(iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(surf->image),
.viewType = view_type,
.viewType = VK_IMAGE_VIEW_TYPE_2D,
.format = surf->format,
.subresourceRange = {
.aspectMask = surf->aspect_mask,
@@ -1023,23 +877,19 @@ radv_meta_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
create_bview(cmd_buffer, dst->buffer, dst->offset, dst->format, &dst_view);
itob_bind_descriptors(cmd_buffer, &src_view, &dst_view);
if (device->physical_device->rad_info.chip_class >= GFX9 &&
src->image->type == VK_IMAGE_TYPE_3D)
pipeline = cmd_buffer->device->meta_state.itob.pipeline_3d;
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
for (unsigned r = 0; r < num_rects; ++r) {
unsigned push_constants[4] = {
unsigned push_constants[3] = {
rects[r].src_x,
rects[r].src_y,
src->layer,
dst->pitch
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.itob.img_p_layout,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 16,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 12,
push_constants);
radv_unaligned_dispatch(cmd_buffer, rects[r].width, rects[r].height, 1);
@@ -1100,22 +950,18 @@ radv_meta_buffer_to_image_cs(struct radv_cmd_buffer *cmd_buffer,
create_iview(cmd_buffer, dst, &dst_view);
btoi_bind_descriptors(cmd_buffer, &src_view, &dst_view);
if (device->physical_device->rad_info.chip_class >= GFX9 &&
dst->image->type == VK_IMAGE_TYPE_3D)
pipeline = cmd_buffer->device->meta_state.btoi.pipeline_3d;
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
for (unsigned r = 0; r < num_rects; ++r) {
unsigned push_constants[4] = {
unsigned push_constants[3] = {
rects[r].dst_x,
rects[r].dst_y,
dst->layer,
src->pitch,
src->pitch
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.btoi.img_p_layout,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 16,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 12,
push_constants);
radv_unaligned_dispatch(cmd_buffer, rects[r].width, rects[r].height, 1);
@@ -1182,24 +1028,19 @@ radv_meta_image_to_image_cs(struct radv_cmd_buffer *cmd_buffer,
itoi_bind_descriptors(cmd_buffer, &src_view, &dst_view);
if (device->physical_device->rad_info.chip_class >= GFX9 &&
src->image->type == VK_IMAGE_TYPE_3D)
pipeline = cmd_buffer->device->meta_state.itoi.pipeline_3d;
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
for (unsigned r = 0; r < num_rects; ++r) {
unsigned push_constants[6] = {
unsigned push_constants[4] = {
rects[r].src_x,
rects[r].src_y,
src->layer,
rects[r].dst_x,
rects[r].dst_y,
dst->layer,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.itoi.img_p_layout,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 24,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 16,
push_constants);
radv_unaligned_dispatch(cmd_buffer, rects[r].width, rects[r].height, 1);
@@ -1247,24 +1088,19 @@ radv_meta_clear_image_cs(struct radv_cmd_buffer *cmd_buffer,
create_iview(cmd_buffer, dst, &dst_iview);
cleari_bind_descriptors(cmd_buffer, &dst_iview);
if (device->physical_device->rad_info.chip_class >= GFX9 &&
dst->image->type == VK_IMAGE_TYPE_3D)
pipeline = cmd_buffer->device->meta_state.cleari.pipeline_3d;
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
unsigned push_constants[5] = {
unsigned push_constants[4] = {
clear_color->uint32[0],
clear_color->uint32[1],
clear_color->uint32[2],
clear_color->uint32[3],
dst->layer,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.cleari.img_p_layout,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 20,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 16,
push_constants);
radv_unaligned_dispatch(cmd_buffer, dst->image->info.width, dst->image->info.height, 1);
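Review note: in each hunk above, the size argument to `radv_CmdPushConstants` changes together with the length of the `push_constants` array (3 words and 12 bytes, 4 and 16, 5 and 20, 6 and 24). Below is a minimal sketch of that relationship, assuming 32-bit push-constant words. `push_constant_bytes` is a hypothetical helper for illustration and is not part of radv.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in radv): byte size of a push-constant
 * block made of `count` 32-bit words. */
static size_t
push_constant_bytes(size_t count)
{
	assert(count > 0);
	return count * sizeof(uint32_t);
}
```

Passing `sizeof(push_constants)` at the call site would be one way to keep the array length and the size argument from drifting apart. The diff itself uses literal byte counts.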


@@ -628,7 +628,6 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer,
VK_SHADER_STAGE_VERTEX_BIT, 0, 4,
&clear_value.depth);
uint32_t prev_reference = cmd_buffer->state.dynamic.stencil_reference.front;
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
radv_CmdSetStencilReference(cmd_buffer_h, VK_STENCIL_FACE_FRONT_BIT,
clear_value.stencil);
@@ -663,11 +662,6 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer,
radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &clear_rect->rect);
radv_CmdDraw(cmd_buffer_h, 3, clear_rect->layerCount, 0, clear_rect->baseArrayLayer);
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
radv_CmdSetStencilReference(cmd_buffer_h, VK_STENCIL_FACE_FRONT_BIT,
prev_reference);
}
}
static bool


@@ -37,11 +37,10 @@ meta_image_block_size(const struct radv_image *image)
*/
static struct VkExtent3D
meta_region_extent_el(const struct radv_image *image,
const VkImageType imageType,
const struct VkExtent3D *extent)
{
const VkExtent3D block = meta_image_block_size(image);
return radv_sanitize_image_extent(imageType, (VkExtent3D) {
return radv_sanitize_image_extent(image->type, (VkExtent3D) {
.width = DIV_ROUND_UP(extent->width , block.width),
.height = DIV_ROUND_UP(extent->height, block.height),
.depth = DIV_ROUND_UP(extent->depth , block.depth),
@@ -80,7 +79,6 @@ vk_format_for_size(int bs)
static struct radv_meta_blit2d_surf
blit_surf_for_image_level_layer(struct radv_image *image,
VkImageLayout layout,
const VkImageSubresourceLayers *subres)
{
VkFormat format = image->vk_format;
@@ -89,8 +87,7 @@ blit_surf_for_image_level_layer(struct radv_image *image,
else if (subres->aspectMask & VK_IMAGE_ASPECT_STENCIL_BIT)
format = vk_format_stencil_only(format);
if (!image->surface.dcc_size &&
!(image->surface.htile_size && image->tc_compatible_htile))
if (!image->surface.dcc_size)
format = vk_format_for_size(vk_format_get_blocksize(format));
return (struct radv_meta_blit2d_surf) {
@@ -100,7 +97,6 @@ blit_surf_for_image_level_layer(struct radv_image *image,
.layer = subres->baseArrayLayer,
.image = image,
.aspect_mask = subres->aspectMask,
.current_layout = layout,
};
}
@@ -108,7 +104,6 @@ static void
meta_copy_buffer_to_image(struct radv_cmd_buffer *cmd_buffer,
struct radv_buffer* buffer,
struct radv_image* image,
VkImageLayout layout,
uint32_t regionCount,
const VkBufferImageCopy* pRegions)
{
@@ -147,11 +142,11 @@ meta_copy_buffer_to_image(struct radv_cmd_buffer *cmd_buffer,
pRegions[r].bufferImageHeight : pRegions[r].imageExtent.height,
};
const VkExtent3D buf_extent_el =
meta_region_extent_el(image, image->type, &bufferExtent);
meta_region_extent_el(image, &bufferExtent);
/* Start creating blit rect */
const VkExtent3D img_extent_el =
meta_region_extent_el(image, image->type, &pRegions[r].imageExtent);
meta_region_extent_el(image, &pRegions[r].imageExtent);
struct radv_meta_blit2d_rect rect = {
.width = img_extent_el.width,
.height = img_extent_el.height,
@@ -160,7 +155,6 @@ meta_copy_buffer_to_image(struct radv_cmd_buffer *cmd_buffer,
/* Create blit surfaces */
struct radv_meta_blit2d_surf img_bsurf =
blit_surf_for_image_level_layer(image,
layout,
&pRegions[r].imageSubresource);
struct radv_meta_blit2d_buffer buf_bsurf = {
@@ -220,7 +214,7 @@ void radv_CmdCopyBufferToImage(
RADV_FROM_HANDLE(radv_image, dest_image, destImage);
RADV_FROM_HANDLE(radv_buffer, src_buffer, srcBuffer);
meta_copy_buffer_to_image(cmd_buffer, src_buffer, dest_image, destImageLayout,
meta_copy_buffer_to_image(cmd_buffer, src_buffer, dest_image,
regionCount, pRegions);
}
@@ -228,7 +222,6 @@ static void
meta_copy_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
struct radv_buffer* buffer,
struct radv_image* image,
VkImageLayout layout,
uint32_t regionCount,
const VkBufferImageCopy* pRegions)
{
@@ -260,11 +253,11 @@ meta_copy_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
pRegions[r].bufferImageHeight : pRegions[r].imageExtent.height,
};
const VkExtent3D buf_extent_el =
meta_region_extent_el(image, image->type, &bufferExtent);
meta_region_extent_el(image, &bufferExtent);
/* Start creating blit rect */
const VkExtent3D img_extent_el =
meta_region_extent_el(image, image->type, &pRegions[r].imageExtent);
meta_region_extent_el(image, &pRegions[r].imageExtent);
struct radv_meta_blit2d_rect rect = {
.width = img_extent_el.width,
.height = img_extent_el.height,
@@ -273,7 +266,6 @@ meta_copy_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
/* Create blit surfaces */
struct radv_meta_blit2d_surf img_info =
blit_surf_for_image_level_layer(image,
layout,
&pRegions[r].imageSubresource);
struct radv_meta_blit2d_buffer buf_info = {
@@ -326,16 +318,13 @@ void radv_CmdCopyImageToBuffer(
RADV_FROM_HANDLE(radv_buffer, dst_buffer, destBuffer);
meta_copy_image_to_buffer(cmd_buffer, dst_buffer, src_image,
srcImageLayout,
regionCount, pRegions);
}
static void
meta_copy_image(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *src_image,
VkImageLayout src_image_layout,
struct radv_image *dest_image,
VkImageLayout dest_image_layout,
uint32_t regionCount,
const VkImageCopy *pRegions)
{
@@ -362,12 +351,10 @@ meta_copy_image(struct radv_cmd_buffer *cmd_buffer,
/* Create blit surfaces */
struct radv_meta_blit2d_surf b_src =
blit_surf_for_image_level_layer(src_image,
src_image_layout,
&pRegions[r].srcSubresource);
struct radv_meta_blit2d_surf b_dst =
blit_surf_for_image_level_layer(dest_image,
dest_image_layout,
&pRegions[r].dstSubresource);
/* for DCC */
@@ -386,18 +373,8 @@ meta_copy_image(struct radv_cmd_buffer *cmd_buffer,
meta_region_offset_el(dest_image, &pRegions[r].dstOffset);
const VkOffset3D src_offset_el =
meta_region_offset_el(src_image, &pRegions[r].srcOffset);
/*
* From Vulkan 1.0.68, "Copying Data Between Images":
* "When copying between compressed and uncompressed formats
* the extent members represent the texel dimensions of the
* source image and not the destination."
* However, we must use the destination image type to avoid
* clamping depth when copying multiple layers of a 2D image to
* a 3D image.
*/
const VkExtent3D img_extent_el =
meta_region_extent_el(src_image, dest_image->type, &pRegions[r].extent);
meta_region_extent_el(dest_image, &pRegions[r].extent);
/* Start creating blit rect */
struct radv_meta_blit2d_rect rect = {
@@ -405,9 +382,6 @@ meta_copy_image(struct radv_cmd_buffer *cmd_buffer,
.height = img_extent_el.height,
};
if (src_image->type == VK_IMAGE_TYPE_3D)
b_src.layer = src_offset_el.z;
if (dest_image->type == VK_IMAGE_TYPE_3D)
b_dst.layer = dst_offset_el.z;
@@ -455,9 +429,7 @@ void radv_CmdCopyImage(
RADV_FROM_HANDLE(radv_image, src_image, srcImage);
RADV_FROM_HANDLE(radv_image, dest_image, destImage);
meta_copy_image(cmd_buffer,
src_image, srcImageLayout,
dest_image, destImageLayout,
meta_copy_image(cmd_buffer, src_image, dest_image,
regionCount, pRegions);
}
@@ -477,7 +449,6 @@ void radv_blit_to_prime_linear(struct radv_cmd_buffer *cmd_buffer,
image_copy.extent.height = image->info.height;
image_copy.extent.depth = 1;
meta_copy_image(cmd_buffer, image, VK_IMAGE_LAYOUT_GENERAL, linear_image,
VK_IMAGE_LAYOUT_GENERAL,
meta_copy_image(cmd_buffer, image, linear_image,
1, &image_copy);
}
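Review note: `meta_region_extent_el()` in this file converts an extent given in texels into an extent given in format blocks by rounding up each dimension. Below is a minimal sketch of that conversion. `struct extent3d`, `div_round_up` and `region_extent_el` are hypothetical stand-ins for `VkExtent3D`, `DIV_ROUND_UP` and the radv function, and the sketch omits the `radv_sanitize_image_extent()` step that the real function applies.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for VkExtent3D so the sketch builds without Vulkan headers. */
struct extent3d {
	uint32_t width, height, depth;
};

/* Round-up integer division, the same arithmetic as DIV_ROUND_UP. */
static uint32_t
div_round_up(uint32_t n, uint32_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Hypothetical simplified meta_region_extent_el(): convert a texel
 * extent into an extent measured in format blocks. The sanitize/clamp
 * step by image type is intentionally left out. */
static struct extent3d
region_extent_el(struct extent3d block, struct extent3d extent)
{
	return (struct extent3d) {
		.width  = div_round_up(extent.width,  block.width),
		.height = div_round_up(extent.height, block.height),
		.depth  = div_round_up(extent.depth,  block.depth),
	};
}
```

For a 4x4x1 block format, a 10x9x1 texel region covers 3x3x1 blocks. For an uncompressed format with a 1x1x1 block, the extent is unchanged.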


@@ -75,29 +75,11 @@ create_pass(struct radv_device *device,
return result;
}
static VkResult
create_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout)
{
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 0,
.pSetLayouts = NULL,
.pushConstantRangeCount = 0,
.pPushConstantRanges = NULL,
};
return radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
layout);
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h,
uint32_t samples,
VkRenderPass pass,
VkPipelineLayout layout,
VkPipeline *decompress_pipeline,
VkPipeline *resummarize_pipeline)
{
@@ -183,7 +165,6 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.layout = layout,
.renderPass = pass,
.subpass = 0,
};
@@ -231,9 +212,6 @@ radv_device_finish_meta_depth_decomp_state(struct radv_device *device)
radv_DestroyRenderPass(radv_device_to_handle(device),
state->depth_decomp[i].pass,
&state->alloc);
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->depth_decomp[i].p_layout,
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->depth_decomp[i].decompress_pipeline,
&state->alloc);
@@ -265,14 +243,8 @@ radv_device_init_meta_depth_decomp_state(struct radv_device *device)
if (res != VK_SUCCESS)
goto fail;
res = create_pipeline_layout(device,
&state->depth_decomp[i].p_layout);
if (res != VK_SUCCESS)
goto fail;
res = create_pipeline(device, vs_module_h, samples,
state->depth_decomp[i].pass,
state->depth_decomp[i].p_layout,
&state->depth_decomp[i].decompress_pipeline,
&state->depth_decomp[i].resummarize_pipeline);
if (res != VK_SUCCESS)


@@ -74,27 +74,9 @@ create_pass(struct radv_device *device)
return result;
}
static VkResult
create_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout)
{
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 0,
.pSetLayouts = NULL,
.pushConstantRangeCount = 0,
.pPushConstantRanges = NULL,
};
return radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
layout);
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h,
VkPipelineLayout layout)
VkShaderModule vs_module_h)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -191,7 +173,6 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.layout = layout,
.renderPass = device->meta_state.fast_clear_flush.pass,
.subpass = 0,
},
@@ -237,7 +218,6 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.layout = layout,
.renderPass = device->meta_state.fast_clear_flush.pass,
.subpass = 0,
},
@@ -265,9 +245,6 @@ radv_device_finish_meta_fast_clear_flush_state(struct radv_device *device)
radv_DestroyRenderPass(radv_device_to_handle(device),
state->fast_clear_flush.pass, &state->alloc);
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->fast_clear_flush.p_layout,
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->fast_clear_flush.cmask_eliminate_pipeline,
&state->alloc);
@@ -292,14 +269,8 @@ radv_device_init_meta_fast_clear_flush_state(struct radv_device *device)
if (res != VK_SUCCESS)
goto fail;
res = create_pipeline_layout(device,
&device->meta_state.fast_clear_flush.p_layout);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h,
device->meta_state.fast_clear_flush.p_layout);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;


@@ -26,7 +26,6 @@
#include "radv_meta.h"
#include "radv_private.h"
#include "vk_format.h"
#include "nir/nir_builder.h"
#include "sid.h"
@@ -51,7 +50,7 @@ build_nir_fs(void)
}
static VkResult
create_pass(struct radv_device *device, VkFormat vk_format, VkRenderPass *pass)
create_pass(struct radv_device *device)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -60,7 +59,7 @@ create_pass(struct radv_device *device, VkFormat vk_format, VkRenderPass *pass)
int i;
for (i = 0; i < 2; i++) {
attachments[i].format = vk_format;
attachments[i].format = VK_FORMAT_UNDEFINED;
attachments[i].samples = 1;
attachments[i].loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachments[i].storeOp = VK_ATTACHMENT_STORE_OP_STORE;
@@ -100,16 +99,14 @@ create_pass(struct radv_device *device, VkFormat vk_format, VkRenderPass *pass)
.dependencyCount = 0,
},
alloc,
pass);
&device->meta_state.resolve.pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h,
VkPipeline *pipeline,
VkRenderPass pass)
VkShaderModule vs_module_h)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -124,23 +121,6 @@ create_pipeline(struct radv_device *device,
goto cleanup;
}
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 0,
.pSetLayouts = NULL,
.pushConstantRangeCount = 0,
.pPushConstantRanges = NULL,
};
if (!device->meta_state.resolve.p_layout) {
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve.p_layout);
if (result != VK_SUCCESS)
goto cleanup;
}
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&(VkGraphicsPipelineCreateInfo) {
@@ -216,15 +196,15 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.layout = device->meta_state.resolve.p_layout,
.renderPass = pass,
.renderPass = device->meta_state.resolve.pass,
.subpass = 0,
},
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.custom_blend_mode = V_028808_CB_RESOLVE,
},
&device->meta_state.alloc, pipeline);
&device->meta_state.alloc,
&device->meta_state.resolve.pipeline);
if (result != VK_SUCCESS)
goto cleanup;
@@ -240,37 +220,17 @@ radv_device_finish_meta_resolve_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
for (uint32_t j = 0; j < NUM_META_FS_KEYS; j++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass[j], &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline[j], &state->alloc);
}
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->resolve.p_layout, &state->alloc);
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline, &state->alloc);
}
static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,
VK_FORMAT_R16G16B16A16_SINT,
VK_FORMAT_R32_SFLOAT,
VK_FORMAT_R32G32_SFLOAT,
VK_FORMAT_R32G32B32A32_SFLOAT
};
VkResult
radv_device_init_meta_resolve_state(struct radv_device *device)
{
VkResult res = VK_SUCCESS;
struct radv_meta_state *state = &device->meta_state;
struct radv_shader_module vs_module = { .nir = radv_meta_build_nir_vs_generate_vertices() };
if (!vs_module.nir) {
/* XXX: Need more accurate error */
@@ -278,19 +238,14 @@ radv_device_init_meta_resolve_state(struct radv_device *device)
goto fail;
}
for (uint32_t i = 0; i < ARRAY_SIZE(pipeline_formats); ++i) {
VkFormat format = pipeline_formats[i];
unsigned fs_key = radv_format_meta_fs_key(format);
res = create_pass(device, format, &state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h,
&state->resolve.pipeline[fs_key], state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
}
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
goto cleanup;
@@ -305,18 +260,16 @@ cleanup:
static void
emit_resolve(struct radv_cmd_buffer *cmd_buffer,
VkFormat vk_format,
const VkOffset2D *dest_offset,
const VkExtent2D *resolve_extent)
{
struct radv_device *device = cmd_buffer->device;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
unsigned fs_key = radv_format_meta_fs_key(vk_format);
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.resolve.pipeline[fs_key]);
device->meta_state.resolve.pipeline);
radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkViewport) {
.x = dest_offset->x,
@@ -347,16 +300,11 @@ static void radv_pick_resolve_method_images(struct radv_image *src_image,
enum radv_resolve_method *method)
{
if (src_image->vk_format == VK_FORMAT_R16G16_UNORM ||
src_image->vk_format == VK_FORMAT_R16G16_SNORM)
*method = RESOLVE_COMPUTE;
else if (vk_format_is_int(src_image->vk_format))
*method = RESOLVE_COMPUTE;
if (dest_image->surface.num_dcc_levels > 0) {
*method = RESOLVE_FRAGMENT;
} else if (dest_image->surface.micro_tile_mode != src_image->surface.micro_tile_mode) {
*method = RESOLVE_COMPUTE;
if (dest_image->surface.micro_tile_mode != src_image->surface.micro_tile_mode) {
if (dest_image->surface.num_dcc_levels > 0)
*method = RESOLVE_FRAGMENT;
else
*method = RESOLVE_COMPUTE;
}
}
@@ -442,7 +390,6 @@ void radv_CmdResolveImage(
if (dest_image->surface.dcc_size) {
radv_initialize_dcc(cmd_buffer, dest_image, 0xffffffff);
}
unsigned fs_key = radv_format_meta_fs_key(dest_image->vk_format);
for (uint32_t r = 0; r < region_count; ++r) {
const VkImageResolve *region = &regions[r];
@@ -542,7 +489,7 @@ void radv_CmdResolveImage(
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.resolve.pass[fs_key],
.renderPass = device->meta_state.resolve.pass,
.framebuffer = fb_h,
.renderArea = {
.offset = {
@@ -560,7 +507,6 @@ void radv_CmdResolveImage(
VK_SUBPASS_CONTENTS_INLINE);
emit_resolve(cmd_buffer,
dest_iview.vk_format,
&(VkOffset2D) {
.x = dstOffset.x,
.y = dstOffset.y,
@@ -614,7 +560,7 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
struct radv_image *dst_img = cmd_buffer->state.framebuffer->attachments[dest_att.attachment].attachment->image;
struct radv_image *src_img = cmd_buffer->state.framebuffer->attachments[src_att.attachment].attachment->image;
radv_pick_resolve_method_images(src_img, dst_img, &resolve_method);
radv_pick_resolve_method_images(dst_img, src_img, &resolve_method);
if (resolve_method == RESOLVE_FRAGMENT) {
break;
}
@@ -655,7 +601,6 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
emit_resolve(cmd_buffer,
dst_img->vk_format,
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });
}


@@ -253,31 +253,22 @@ radv_device_init_meta_resolve_compute_state(struct radv_device *device)
res = create_layout(device);
if (res != VK_SUCCESS)
goto fail;
return res;
for (uint32_t i = 0; i < MAX_SAMPLES_LOG2; ++i) {
uint32_t samples = 1 << i;
res = create_resolve_pipeline(device, samples, false, false,
&state->resolve_compute.rc[i].pipeline);
if (res != VK_SUCCESS)
goto fail;
res = create_resolve_pipeline(device, samples, true, false,
&state->resolve_compute.rc[i].i_pipeline);
if (res != VK_SUCCESS)
goto fail;
res = create_resolve_pipeline(device, samples, false, true,
&state->resolve_compute.rc[i].srgb_pipeline);
if (res != VK_SUCCESS)
goto fail;
}
return VK_SUCCESS;
fail:
radv_device_finish_meta_resolve_compute_state(device);
return res;
}
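Review note: the init loop above indexes the `rc[]` pipeline slots by the log2 of the sample count and derives the count as `1 << i`. Below is a minimal sketch of that mapping. `samples_for_index` is a hypothetical helper, and the value 4 for `MAX_SAMPLES_LOG2` is taken from the `radv_private.h` portion of this diff.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SAMPLES_LOG2 4 /* value as defined in radv_private.h in this diff */

/* Hypothetical helper: sample count served by pipeline slot `i`. */
static uint32_t
samples_for_index(uint32_t i)
{
	assert(i < MAX_SAMPLES_LOG2);
	return 1u << i;
}
```

Under that definition, the four slots correspond to 1, 2, 4 and 8 samples.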
@@ -496,14 +487,6 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer)
if (!subpass->has_resolve)
return;
/* Resolves happen before the end-of-subpass barriers get executed,
* so we have to make the attachment shader-readable */
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
RADV_CMD_FLAG_FLUSH_AND_INV_CB |
RADV_CMD_FLAG_FLUSH_AND_INV_CB_META |
RADV_CMD_FLAG_INV_GLOBAL_L2 |
RADV_CMD_FLAG_INV_VMEM_L1;
for (uint32_t i = 0; i < subpass->color_count; ++i) {
VkAttachmentReference src_att = subpass->color_attachments[i];
VkAttachmentReference dest_att = subpass->resolve_attachments[i];


@@ -316,9 +316,16 @@ create_resolve_pipeline(struct radv_device *device,
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
pipeline);
ralloc_free(vs.nir);
ralloc_free(fs.nir);
if (result != VK_SUCCESS)
goto fail;
return VK_SUCCESS;
fail:
ralloc_free(vs.nir);
ralloc_free(fs.nir);
return result;
}
@@ -329,19 +336,14 @@ radv_device_init_meta_resolve_fragment_state(struct radv_device *device)
res = create_layout(device);
if (res != VK_SUCCESS)
goto fail;
return res;
for (uint32_t i = 0; i < MAX_SAMPLES_LOG2; ++i) {
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
res = create_resolve_pipeline(device, i, pipeline_formats[j]);
if (res != VK_SUCCESS)
goto fail;
}
}
return VK_SUCCESS;
fail:
radv_device_finish_meta_resolve_fragment_state(device);
return res;
}
@@ -405,8 +407,8 @@ emit_resolve(struct radv_cmd_buffer *cmd_buffer,
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
unsigned push_constants[2] = {
src_offset->x - dest_offset->x,
src_offset->y - dest_offset->y,
src_offset->x,
src_offset->y,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.resolve_fragment.p_layout,
@@ -538,8 +540,8 @@ void radv_meta_resolve_fragment_image(struct radv_cmd_buffer *cmd_buffer,
.pAttachments = (VkImageView[]) {
radv_image_view_to_handle(&dest_iview),
},
.width = extent.width + dstOffset.x,
.height = extent.height + dstOffset.y,
.width = extent.width,
.height = extent.height,
.layers = 1
}, &cmd_buffer->pool->alloc, &fb);
@@ -602,16 +604,6 @@ radv_cmd_buffer_resolve_subpass_fs(struct radv_cmd_buffer *cmd_buffer)
RADV_META_SAVE_CONSTANTS |
RADV_META_SAVE_DESCRIPTORS);
/* Resolves happen before the end-of-subpass barriers get executed,
* so we have to make the attachment shader-readable */
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
RADV_CMD_FLAG_FLUSH_AND_INV_CB |
RADV_CMD_FLAG_FLUSH_AND_INV_CB_META |
RADV_CMD_FLAG_FLUSH_AND_INV_DB |
RADV_CMD_FLAG_FLUSH_AND_INV_DB_META |
RADV_CMD_FLAG_INV_GLOBAL_L2 |
RADV_CMD_FLAG_INV_VMEM_L1;
for (uint32_t i = 0; i < subpass->color_count; ++i) {
VkAttachmentReference src_att = subpass->color_attachments[i];
VkAttachmentReference dest_att = subpass->resolve_attachments[i];


@@ -879,8 +879,6 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
S_028BE0_MAX_SAMPLE_DIST(radv_cayman_get_maxdist(log_samples)) |
S_028BE0_MSAA_EXPOSED_SAMPLES(log_samples); /* CM_R_028BE0_PA_SC_AA_CONFIG */
ms->pa_sc_mode_cntl_1 |= S_028A4C_PS_ITER_SAMPLE(ps_iter_samples > 1);
if (ps_iter_samples > 1)
pipeline->graphics.spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
}
const struct VkPipelineRasterizationStateRasterizationOrderAMD *raster_order =
@@ -1177,7 +1175,7 @@ static void calculate_gfx9_gs_info(const VkGraphicsPipelineCreateInfo *pCreateIn
case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY:
case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY:
case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY:
uses_adjacency = true;
uses_adjacency = false;
break;
default:
uses_adjacency = false;
@@ -1699,60 +1697,14 @@ radv_link_shaders(struct radv_pipeline *pipeline, nir_shader **shaders)
ordered_shaders[i - 1]);
if (progress) {
if (nir_lower_global_vars_to_local(ordered_shaders[i])) {
radv_lower_indirect_derefs(ordered_shaders[i],
pipeline->device->physical_device);
}
nir_lower_global_vars_to_local(ordered_shaders[i]);
radv_optimize_nir(ordered_shaders[i]);
if (nir_lower_global_vars_to_local(ordered_shaders[i - 1])) {
radv_lower_indirect_derefs(ordered_shaders[i - 1],
pipeline->device->physical_device);
}
nir_lower_global_vars_to_local(ordered_shaders[i - 1]);
radv_optimize_nir(ordered_shaders[i - 1]);
}
}
}
static void
merge_tess_info(struct shader_info *tes_info,
const struct shader_info *tcs_info)
{
/* The Vulkan 1.0.38 spec, section 21.1 Tessellator says:
*
* "PointMode. Controls generation of points rather than triangles
* or lines. This functionality defaults to disabled, and is
* enabled if either shader stage includes the execution mode.
*
* and about Triangles, Quads, IsoLines, VertexOrderCw, VertexOrderCcw,
* PointMode, SpacingEqual, SpacingFractionalEven, SpacingFractionalOdd,
* and OutputVertices, it says:
*
* "One mode must be set in at least one of the tessellation
* shader stages."
*
* So, the fields can be set in either the TCS or TES, but they must
* agree if set in both. Our backend looks at TES, so bitwise-or in
* the values from the TCS.
*/
assert(tcs_info->tess.tcs_vertices_out == 0 ||
tes_info->tess.tcs_vertices_out == 0 ||
tcs_info->tess.tcs_vertices_out == tes_info->tess.tcs_vertices_out);
tes_info->tess.tcs_vertices_out |= tcs_info->tess.tcs_vertices_out;
assert(tcs_info->tess.spacing == TESS_SPACING_UNSPECIFIED ||
tes_info->tess.spacing == TESS_SPACING_UNSPECIFIED ||
tcs_info->tess.spacing == tes_info->tess.spacing);
tes_info->tess.spacing |= tcs_info->tess.spacing;
assert(tcs_info->tess.primitive_mode == 0 ||
tes_info->tess.primitive_mode == 0 ||
tcs_info->tess.primitive_mode == tes_info->tess.primitive_mode);
tes_info->tess.primitive_mode |= tcs_info->tess.primitive_mode;
tes_info->tess.ccw |= tcs_info->tess.ccw;
tes_info->tess.point_mode |= tcs_info->tess.point_mode;
}
static
void radv_create_shaders(struct radv_pipeline *pipeline,
struct radv_device *device,
@@ -1830,7 +1782,6 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
keys[MESA_SHADER_TESS_CTRL].tcs.tes_reads_tess_factors = !!(nir[MESA_SHADER_TESS_EVAL]->info.inputs_read & (VARYING_BIT_TESS_LEVEL_INNER | VARYING_BIT_TESS_LEVEL_OUTER));
nir_lower_tes_patch_vertices(nir[MESA_SHADER_TESS_EVAL], nir[MESA_SHADER_TESS_CTRL]->info.tess.tcs_vertices_out);
merge_tess_info(&nir[MESA_SHADER_TESS_EVAL]->info, &nir[MESA_SHADER_TESS_CTRL]->info);
}
radv_link_shaders(pipeline, nir);
@@ -2004,7 +1955,6 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
radv_create_shaders(pipeline, device, cache, keys, pStages);
pipeline->graphics.spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
radv_pipeline_init_depth_stencil_state(pipeline, pCreateInfo, extra);
radv_pipeline_init_raster_state(pipeline, pCreateInfo);
radv_pipeline_init_multisample_state(pipeline, pCreateInfo);


@@ -375,7 +375,6 @@ radv_pipeline_cache_insert_shaders(struct radv_device *device,
char* p = entry->code;
struct cache_entry_variant_info info;
memset(&info, 0, sizeof(info));
for (int i = 0; i < MESA_SHADER_STAGES; ++i) {
if (!variants[i])


@@ -69,6 +69,7 @@ typedef uint32_t xcb_window_t;
#include <vulkan/vulkan.h>
#include <vulkan/vulkan_intel.h>
#include <vulkan/vk_icd.h>
#include <vulkan/vk_android_native_buffer.h>
#include "radv_entrypoints.h"
@@ -83,9 +84,7 @@ typedef uint32_t xcb_window_t;
#define MAX_SCISSORS 16
#define MAX_PUSH_CONSTANTS_SIZE 128
#define MAX_PUSH_DESCRIPTORS 32
#define MAX_DYNAMIC_UNIFORM_BUFFERS 16
#define MAX_DYNAMIC_STORAGE_BUFFERS 8
#define MAX_DYNAMIC_BUFFERS (MAX_DYNAMIC_UNIFORM_BUFFERS + MAX_DYNAMIC_STORAGE_BUFFERS)
#define MAX_DYNAMIC_BUFFERS 16
#define MAX_SAMPLES_LOG2 4
#define NUM_META_FS_KEYS 13
#define RADV_MAX_DRM_DEVICES 8
@@ -268,7 +267,7 @@ struct radv_physical_device {
struct radeon_winsys *ws;
struct radeon_info rad_info;
char path[20];
char name[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE];
const char * name;
uint8_t driver_uuid[VK_UUID_SIZE];
uint8_t device_uuid[VK_UUID_SIZE];
uint8_t cache_uuid[VK_UUID_SIZE];
@@ -350,22 +349,6 @@ radv_pipeline_cache_insert_shaders(struct radv_device *device,
const void *const *codes,
const unsigned *code_sizes);
enum radv_blit_ds_layout {
RADV_BLIT_DS_LAYOUT_TILE_ENABLE,
RADV_BLIT_DS_LAYOUT_TILE_DISABLE,
RADV_BLIT_DS_LAYOUT_COUNT,
};
static inline enum radv_blit_ds_layout radv_meta_blit_ds_to_type(VkImageLayout layout)
{
return (layout == VK_IMAGE_LAYOUT_GENERAL) ? RADV_BLIT_DS_LAYOUT_TILE_DISABLE : RADV_BLIT_DS_LAYOUT_TILE_ENABLE;
}
static inline VkImageLayout radv_meta_blit_ds_to_layout(enum radv_blit_ds_layout ds_layout)
{
return ds_layout == RADV_BLIT_DS_LAYOUT_TILE_ENABLE ? VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL : VK_IMAGE_LAYOUT_GENERAL;
}
struct radv_meta_state {
VkAllocationCallbacks alloc;
@@ -398,12 +381,12 @@ struct radv_meta_state {
/** Pipeline that blits from a 3D image. */
VkPipeline pipeline_3d_src[NUM_META_FS_KEYS];
VkRenderPass depth_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkRenderPass depth_only_rp;
VkPipeline depth_only_1d_pipeline;
VkPipeline depth_only_2d_pipeline;
VkPipeline depth_only_3d_pipeline;
VkRenderPass stencil_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkRenderPass stencil_only_rp;
VkPipeline stencil_only_1d_pipeline;
VkPipeline stencil_only_2d_pipeline;
VkPipeline stencil_only_3d_pipeline;
@@ -414,46 +397,41 @@ struct radv_meta_state {
struct {
VkRenderPass render_passes[NUM_META_FS_KEYS];
VkPipelineLayout p_layouts[3];
VkDescriptorSetLayout ds_layouts[3];
VkPipeline pipelines[3][NUM_META_FS_KEYS];
VkPipelineLayout p_layouts[2];
VkDescriptorSetLayout ds_layouts[2];
VkPipeline pipelines[2][NUM_META_FS_KEYS];
VkRenderPass depth_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkPipeline depth_only_pipeline[3];
VkRenderPass depth_only_rp;
VkPipeline depth_only_pipeline[2];
VkRenderPass stencil_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkPipeline stencil_only_pipeline[3];
VkRenderPass stencil_only_rp;
VkPipeline stencil_only_pipeline[2];
} blit2d;
struct {
VkPipelineLayout img_p_layout;
VkDescriptorSetLayout img_ds_layout;
VkPipeline pipeline;
VkPipeline pipeline_3d;
} itob;
struct {
VkPipelineLayout img_p_layout;
VkDescriptorSetLayout img_ds_layout;
VkPipeline pipeline;
VkPipeline pipeline_3d;
} btoi;
struct {
VkPipelineLayout img_p_layout;
VkDescriptorSetLayout img_ds_layout;
VkPipeline pipeline;
VkPipeline pipeline_3d;
} itoi;
struct {
VkPipelineLayout img_p_layout;
VkDescriptorSetLayout img_ds_layout;
VkPipeline pipeline;
VkPipeline pipeline_3d;
} cleari;
struct {
VkPipelineLayout p_layout;
VkPipeline pipeline[NUM_META_FS_KEYS];
VkRenderPass pass[NUM_META_FS_KEYS];
VkPipeline pipeline;
VkRenderPass pass;
} resolve;
struct {
@@ -477,14 +455,12 @@ struct radv_meta_state {
} resolve_fragment;
struct {
VkPipelineLayout p_layout;
VkPipeline decompress_pipeline;
VkPipeline resummarize_pipeline;
VkRenderPass pass;
} depth_decomp[1 + MAX_SAMPLES_LOG2];
struct {
VkPipelineLayout p_layout;
VkPipeline cmask_eliminate_pipeline;
VkPipeline fmask_decompress_pipeline;
VkRenderPass pass;
@@ -557,6 +533,7 @@ struct radv_device {
int queue_count[RADV_MAX_QUEUE_FAMILIES];
struct radeon_winsys_cs *empty_cs[RADV_MAX_QUEUE_FAMILIES];
bool always_use_syncobj;
bool llvm_supports_spill;
bool has_distributed_tess;
uint32_t tess_offchip_block_dw_size;
@@ -920,6 +897,7 @@ void si_emit_wait_fence(struct radeon_winsys_cs *cs,
uint64_t va, uint32_t ref,
uint32_t mask);
void si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
bool predicated,
enum chip_class chip_class,
uint32_t *fence_ptr, uint64_t va,
bool is_mec,
@@ -1133,7 +1111,6 @@ struct radv_pipeline {
struct radv_gs_state gs;
uint32_t db_shader_control;
uint32_t shader_z_format;
uint32_t spi_baryc_cntl;
unsigned prim;
unsigned gs_out;
uint32_t vgt_gs_mode;
@@ -1274,6 +1251,9 @@ struct radv_image {
struct radv_cmask_info cmask;
uint64_t clear_value_offset;
uint64_t dcc_pred_offset;
/* For VK_ANDROID_native_buffer, the WSI image owns the memory. */
VkDeviceMemory owned_memory;
};
/* Whether the image has a htile that is known consistent with the contents of
@@ -1358,6 +1338,7 @@ struct radv_image_view {
struct radv_image_create_info {
const VkImageCreateInfo *vk_info;
bool scanout;
bool no_metadata_planes;
};
VkResult radv_image_create(VkDevice _device,
@@ -1365,6 +1346,13 @@ VkResult radv_image_create(VkDevice _device,
const VkAllocationCallbacks* alloc,
VkImage *pImage);
VkResult
radv_image_from_gralloc(VkDevice device_h,
const VkImageCreateInfo *base_info,
const VkNativeBufferANDROID *gralloc_info,
const VkAllocationCallbacks *alloc,
VkImage *out_image_h);
void radv_image_view_init(struct radv_image_view *view,
struct radv_device *device,
const VkImageViewCreateInfo* pCreateInfo);
@@ -1546,7 +1534,8 @@ VkResult radv_alloc_sem_info(struct radv_winsys_sem_info *sem_info,
int num_wait_sems,
const VkSemaphore *wait_sems,
int num_signal_sems,
const VkSemaphore *signal_sems);
const VkSemaphore *signal_sems,
VkFence fence);
void radv_free_sem_info(struct radv_winsys_sem_info *sem_info);
void
@@ -1581,6 +1570,9 @@ struct radv_fence {
struct radeon_winsys_fence *fence;
bool submitted;
bool signalled;
uint32_t syncobj;
uint32_t temp_syncobj;
};
struct radeon_winsys_sem;


@@ -1152,7 +1152,7 @@ void radv_CmdEndQuery(
si_cs_emit_write_event_eop(cs,
false,
cmd_buffer->device->physical_device->rad_info.chip_class,
radv_cmd_buffer_uses_mec(cmd_buffer),
false,
V_028A90_BOTTOM_OF_PIPE_TS, 0,
1, avail_va, 0, 1);
break;


@@ -256,9 +256,18 @@ struct radeon_winsys {
int (*create_syncobj)(struct radeon_winsys *ws, uint32_t *handle);
void (*destroy_syncobj)(struct radeon_winsys *ws, uint32_t handle);
void (*reset_syncobj)(struct radeon_winsys *ws, uint32_t handle);
void (*signal_syncobj)(struct radeon_winsys *ws, uint32_t handle);
bool (*wait_syncobj)(struct radeon_winsys *ws, uint32_t handle, uint64_t timeout);
int (*export_syncobj)(struct radeon_winsys *ws, uint32_t syncobj, int *fd);
int (*import_syncobj)(struct radeon_winsys *ws, int fd, uint32_t *syncobj);
int (*export_syncobj_to_sync_file)(struct radeon_winsys *ws, uint32_t syncobj, int *fd);
/* Note that this, unlike the normal import, uses an existing syncobj. */
int (*import_syncobj_from_sync_file)(struct radeon_winsys *ws, uint32_t syncobj, int fd);
};
static inline void radeon_emit(struct radeon_winsys_cs *cs, uint32_t value)


@@ -110,45 +110,6 @@ void radv_DestroyShaderModule(
vk_free2(&device->alloc, pAllocator, module);
}
bool
radv_lower_indirect_derefs(struct nir_shader *nir,
struct radv_physical_device *device)
{
/* While it would be nice not to have this flag, we are constrained
* by the reality that LLVM 5.0 doesn't have working VGPR indexing
* on GFX9.
*/
bool llvm_has_working_vgpr_indexing =
device->rad_info.chip_class <= VI;
/* TODO: Indirect indexing of GS inputs is unimplemented.
*
* TCS and TES load inputs directly from LDS or offchip memory, so
* indirect indexing is trivial.
*/
nir_variable_mode indirect_mask = 0;
if (nir->info.stage == MESA_SHADER_GEOMETRY ||
(nir->info.stage != MESA_SHADER_TESS_CTRL &&
nir->info.stage != MESA_SHADER_TESS_EVAL &&
!llvm_has_working_vgpr_indexing)) {
indirect_mask |= nir_var_shader_in;
}
if (!llvm_has_working_vgpr_indexing &&
nir->info.stage != MESA_SHADER_TESS_CTRL)
indirect_mask |= nir_var_shader_out;
/* TODO: We shouldn't need to do this, however LLVM isn't currently
* smart enough to handle indirects without causing excess spilling
* causing the gpu to hang.
*
* See the following thread for more details of the problem:
* https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html
*/
indirect_mask |= nir_var_local;
return nir_lower_indirect_derefs(nir, indirect_mask);
}
void
radv_optimize_nir(struct nir_shader *shader)
{
@@ -233,19 +194,22 @@ radv_shader_compile_to_nir(struct radv_device *device,
spec_entries[i].data32 = *(const uint32_t *)data;
}
}
const struct nir_spirv_supported_extensions supported_ext = {
.draw_parameters = true,
.float64 = true,
.image_read_without_format = true,
.image_write_without_format = true,
.tessellation = true,
.int64 = true,
.multiview = true,
.variable_pointers = true,
const struct spirv_to_nir_options spirv_options = {
.caps = {
.draw_parameters = true,
.float64 = true,
.image_read_without_format = true,
.image_write_without_format = true,
.tessellation = true,
.int64 = true,
.multiview = true,
.variable_pointers = true,
},
};
entry_point = spirv_to_nir(spirv, module->size / 4,
spec_entries, num_spec_entries,
stage, entrypoint_name, &supported_ext, &nir_options);
stage, entrypoint_name,
&spirv_options, &nir_options);
nir = entry_point->shader;
assert(nir->info.stage == stage);
nir_validate_shader(nir);
@@ -284,6 +248,40 @@ radv_shader_compile_to_nir(struct radv_device *device,
nir_shader_gather_info(nir, entry_point->impl);
/* While it would be nice not to have this flag, we are constrained
* by the reality that LLVM 5.0 doesn't have working VGPR indexing
* on GFX9.
*/
bool llvm_has_working_vgpr_indexing =
device->physical_device->rad_info.chip_class <= VI;
/* TODO: Indirect indexing of GS inputs is unimplemented.
*
* TCS and TES load inputs directly from LDS or offchip memory, so
* indirect indexing is trivial.
*/
nir_variable_mode indirect_mask = 0;
if (nir->info.stage == MESA_SHADER_GEOMETRY ||
(nir->info.stage != MESA_SHADER_TESS_CTRL &&
nir->info.stage != MESA_SHADER_TESS_EVAL &&
!llvm_has_working_vgpr_indexing)) {
indirect_mask |= nir_var_shader_in;
}
if (!llvm_has_working_vgpr_indexing &&
nir->info.stage != MESA_SHADER_TESS_CTRL)
indirect_mask |= nir_var_shader_out;
/* TODO: We shouldn't need to do this, however LLVM isn't currently
* smart enough to handle indirects without causing excess spilling
* causing the gpu to hang.
*
* See the following thread for more details of the problem:
* https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html
*/
indirect_mask |= nir_var_local;
nir_lower_indirect_derefs(nir, indirect_mask);
static const nir_lower_tex_options tex_options = {
.lower_txp = ~0,
};
@@ -294,7 +292,6 @@ radv_shader_compile_to_nir(struct radv_device *device,
nir_lower_var_copies(nir);
nir_lower_global_vars_to_local(nir);
nir_remove_dead_variables(nir, nir_var_local);
radv_lower_indirect_derefs(nir, device->physical_device);
radv_optimize_nir(nir);
return nir;
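The indirect-lowering mask that this hunk moves into radv_shader_compile_to_nir can be read as a pure function of the shader stage and the VGPR-indexing flag. The following is a minimal sketch of that decision only; the enum names and bit values are hypothetical stand-ins for gl_shader_stage and nir_variable_mode, not the real Mesa definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for gl_shader_stage; illustrative only. */
enum sketch_stage {
    SKETCH_VERTEX,
    SKETCH_TESS_CTRL,
    SKETCH_TESS_EVAL,
    SKETCH_GEOMETRY,
    SKETCH_FRAGMENT,
};

/* Hypothetical stand-ins for the nir_variable_mode bits used in the diff. */
enum {
    SKETCH_VAR_SHADER_IN  = 1 << 0,
    SKETCH_VAR_SHADER_OUT = 1 << 1,
    SKETCH_VAR_LOCAL      = 1 << 2,
};

/* Mirrors the conditions in the hunk: which variable modes get their
 * indirect dereferences lowered for a given stage. */
static unsigned
sketch_indirect_mask(enum sketch_stage stage, bool has_working_vgpr_indexing)
{
    unsigned mask = 0;

    /* GS inputs are always lowered; other non-tessellation stages only
     * when VGPR indexing is not usable. */
    if (stage == SKETCH_GEOMETRY ||
        (stage != SKETCH_TESS_CTRL &&
         stage != SKETCH_TESS_EVAL &&
         !has_working_vgpr_indexing))
        mask |= SKETCH_VAR_SHADER_IN;

    /* Outputs are lowered everywhere except TCS when indexing is broken. */
    if (!has_working_vgpr_indexing && stage != SKETCH_TESS_CTRL)
        mask |= SKETCH_VAR_SHADER_OUT;

    /* Locals are always lowered (the spilling workaround in the diff). */
    mask |= SKETCH_VAR_LOCAL;

    return mask;
}
```

Under these assumptions, a tessellation control shader only ever has its locals lowered, while a vertex shader on hardware without working VGPR indexing has inputs, outputs, and locals lowered.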


@@ -103,10 +103,6 @@ void
radv_shader_variant_destroy(struct radv_device *device,
struct radv_shader_variant *variant);
bool
radv_lower_indirect_derefs(struct nir_shader *nir,
struct radv_physical_device *device);
uint32_t
radv_shader_stage_to_user_data_0(gl_shader_stage stage, enum chip_class chip_class,
bool has_gs, bool has_tess);


@@ -445,13 +445,14 @@ VkResult radv_GetSwapchainImagesKHR(
}
VkResult radv_AcquireNextImageKHR(
VkDevice device,
VkDevice _device,
VkSwapchainKHR _swapchain,
uint64_t timeout,
VkSemaphore semaphore,
VkFence _fence,
uint32_t* pImageIndex)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain);
RADV_FROM_HANDLE(radv_fence, fence, _fence);
@@ -461,6 +462,11 @@ VkResult radv_AcquireNextImageKHR(
if (fence && (result == VK_SUCCESS || result == VK_SUBOPTIMAL_KHR)) {
fence->submitted = true;
fence->signalled = true;
if (fence->temp_syncobj) {
device->ws->signal_syncobj(device->ws, fence->temp_syncobj);
} else if (fence->syncobj) {
device->ws->signal_syncobj(device->ws, fence->syncobj);
}
}
return result;
}
@@ -479,20 +485,6 @@ VkResult radv_QueuePresentKHR(
struct radeon_winsys_cs *cs;
const VkPresentRegionKHR *region = NULL;
VkResult item_result;
struct radv_winsys_sem_info sem_info;
item_result = radv_alloc_sem_info(&sem_info,
pPresentInfo->waitSemaphoreCount,
pPresentInfo->pWaitSemaphores,
0,
NULL);
if (pPresentInfo->pResults != NULL)
pPresentInfo->pResults[i] = item_result;
result = result == VK_SUCCESS ? item_result : result;
if (item_result != VK_SUCCESS) {
radv_free_sem_info(&sem_info);
continue;
}
assert(radv_device_from_handle(swapchain->device) == queue->device);
if (swapchain->fences[0] == VK_NULL_HANDLE) {
@@ -505,7 +497,6 @@ VkResult radv_QueuePresentKHR(
pPresentInfo->pResults[i] = item_result;
result = result == VK_SUCCESS ? item_result : result;
if (item_result != VK_SUCCESS) {
radv_free_sem_info(&sem_info);
continue;
}
} else {
@@ -513,6 +504,22 @@ VkResult radv_QueuePresentKHR(
1, &swapchain->fences[0]);
}
struct radv_winsys_sem_info sem_info;
item_result = radv_alloc_sem_info(&sem_info,
pPresentInfo->waitSemaphoreCount,
pPresentInfo->pWaitSemaphores,
0,
NULL,
swapchain->fences[0]);
if (pPresentInfo->pResults != NULL)
pPresentInfo->pResults[i] = item_result;
result = result == VK_SUCCESS ? item_result : result;
if (item_result != VK_SUCCESS) {
radv_free_sem_info(&sem_info);
continue;
}
if (swapchain->needs_linear_copy) {
int idx = (queue->queue_family_index * swapchain->image_count) + pPresentInfo->pImageIndices[i];
cs = radv_cmd_buffer_from_handle(swapchain->cmd_buffers[idx])->cs;


@@ -676,8 +676,7 @@ si_write_scissors(struct radeon_winsys_cs *cs, int first,
int i;
float scale[3], translate[3], guardband_x = INFINITY, guardband_y = INFINITY;
const float max_range = 32767.0f;
if (!count)
return;
assert(count);
radeon_set_context_reg_seq(cs, R_028250_PA_SC_VPORT_SCISSOR_0_TL + first * 4 * 2, count * 2);
for (i = 0; i < count; i++) {
@@ -919,6 +918,7 @@ si_emit_acquire_mem(struct radeon_winsys_cs *cs,
void
si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
bool predicated,
enum chip_class chip_class,
uint32_t *flush_cnt,
uint64_t flush_va,
@@ -949,7 +949,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
/* Necessary for DCC */
if (chip_class >= VI) {
si_cs_emit_write_event_eop(cs,
false,
predicated,
chip_class,
is_mec,
V_028A90_FLUSH_AND_INV_CB_DATA_TS,
@@ -963,12 +963,12 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_FLUSH_AND_INV_CB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_CB_META) | EVENT_INDEX(0));
}
if (flush_bits & RADV_CMD_FLAG_FLUSH_AND_INV_DB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_DB_META) | EVENT_INDEX(0));
}
@@ -981,18 +981,13 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_CS_PARTIAL_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, EVENT_TYPE(V_028A90_CS_PARTIAL_FLUSH) | EVENT_INDEX(4));
}
if (chip_class >= GFX9 && flush_cb_db) {
unsigned cb_db_event, tc_flags;
#if 0
/* This breaks a bunch of:
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.input*.
use the big hammer always.
*/
/* Set the CB/DB flush event. */
switch (flush_cb_db) {
case RADV_CMD_FLAG_FLUSH_AND_INV_CB:
@@ -1005,9 +1000,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
/* both CB & DB */
cb_db_event = V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT;
}
#else
cb_db_event = V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT;
#endif
/* TC | TC_WB = invalidate L2 data
* TC_MD | TC_WB = invalidate L2 metadata
* TC | TC_WB | TC_MD = invalidate L2 data & metadata
@@ -1035,14 +1028,14 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
assert(flush_cnt);
uint32_t old_fence = (*flush_cnt)++;
si_cs_emit_write_event_eop(cs, false, chip_class, false, cb_db_event, tc_flags, 1,
si_cs_emit_write_event_eop(cs, predicated, chip_class, false, cb_db_event, tc_flags, 1,
flush_va, old_fence, *flush_cnt);
si_emit_wait_fence(cs, false, flush_va, *flush_cnt, 0xffffffff);
si_emit_wait_fence(cs, predicated, flush_va, *flush_cnt, 0xffffffff);
}
/* VGT state sync */
if (flush_bits & RADV_CMD_FLAG_VGT_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | EVENT_INDEX(0));
}
@@ -1055,13 +1048,13 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
RADV_CMD_FLAG_INV_GLOBAL_L2 |
RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2))) &&
!is_mec) {
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, predicated));
radeon_emit(cs, 0);
}
if ((flush_bits & RADV_CMD_FLAG_INV_GLOBAL_L2) ||
(chip_class <= CIK && (flush_bits & RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2))) {
si_emit_acquire_mem(cs, is_mec, false, chip_class >= GFX9,
si_emit_acquire_mem(cs, is_mec, predicated, chip_class >= GFX9,
cp_coher_cntl |
S_0085F0_TC_ACTION_ENA(1) |
S_0085F0_TCL1_ACTION_ENA(1) |
@@ -1075,7 +1068,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
*
* WB doesn't work without NC.
*/
si_emit_acquire_mem(cs, is_mec, false,
si_emit_acquire_mem(cs, is_mec, predicated,
chip_class >= GFX9,
cp_coher_cntl |
S_0301F0_TC_WB_ACTION_ENA(1) |
@@ -1084,7 +1077,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_INV_VMEM_L1) {
si_emit_acquire_mem(cs, is_mec,
false, chip_class >= GFX9,
predicated, chip_class >= GFX9,
cp_coher_cntl |
S_0085F0_TCL1_ACTION_ENA(1));
cp_coher_cntl = 0;
@@ -1095,7 +1088,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
* Therefore, it should be last. Done in PFP.
*/
if (cp_coher_cntl)
si_emit_acquire_mem(cs, is_mec, false, chip_class >= GFX9, cp_coher_cntl);
si_emit_acquire_mem(cs, is_mec, predicated, chip_class >= GFX9, cp_coher_cntl);
}
void
@@ -1125,6 +1118,7 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
ptr = &cmd_buffer->gfx9_fence_idx;
}
si_cs_emit_cache_flush(cmd_buffer->cs,
cmd_buffer->state.predicating,
cmd_buffer->device->physical_device->rad_info.chip_class,
ptr, va,
radv_cmd_buffer_uses_mec(cmd_buffer),


@@ -1257,6 +1257,43 @@ static void radv_amdgpu_destroy_syncobj(struct radeon_winsys *_ws,
amdgpu_cs_destroy_syncobj(ws->dev, handle);
}
static void radv_amdgpu_reset_syncobj(struct radeon_winsys *_ws,
uint32_t handle)
{
struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws);
amdgpu_cs_syncobj_reset(ws->dev, &handle, 1);
}
static void radv_amdgpu_signal_syncobj(struct radeon_winsys *_ws,
uint32_t handle)
{
struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws);
amdgpu_cs_syncobj_signal(ws->dev, &handle, 1);
}
static bool radv_amdgpu_wait_syncobj(struct radeon_winsys *_ws,
uint32_t handle, uint64_t timeout)
{
struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws);
uint32_t tmp;
/* The timeouts are signed, while vulkan timeouts are unsigned. */
timeout = MIN2(timeout, INT64_MAX);
int ret = amdgpu_cs_syncobj_wait(ws->dev, &handle, 1, timeout,
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT |
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL,
&tmp);
if (ret == 0) {
return true;
} else if (ret == -1 && errno == ETIME) {
return false;
} else {
fprintf(stderr, "amdgpu: radv_amdgpu_wait_syncobj failed!\nerrno: %d\n", errno);
return false;
}
}
static int radv_amdgpu_export_syncobj(struct radeon_winsys *_ws,
uint32_t syncobj,
int *fd)
@@ -1275,6 +1312,25 @@ static int radv_amdgpu_import_syncobj(struct radeon_winsys *_ws,
return amdgpu_cs_import_syncobj(ws->dev, fd, syncobj);
}
static int radv_amdgpu_export_syncobj_to_sync_file(struct radeon_winsys *_ws,
uint32_t syncobj,
int *fd)
{
struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws);
return amdgpu_cs_syncobj_export_sync_file(ws->dev, syncobj, fd);
}
static int radv_amdgpu_import_syncobj_from_sync_file(struct radeon_winsys *_ws,
uint32_t syncobj,
int fd)
{
struct radv_amdgpu_winsys *ws = radv_amdgpu_winsys(_ws);
return amdgpu_cs_syncobj_import_sync_file(ws->dev, syncobj, fd);
}
void radv_amdgpu_cs_init_functions(struct radv_amdgpu_winsys *ws)
{
ws->base.ctx_create = radv_amdgpu_ctx_create;
@@ -1295,7 +1351,12 @@ void radv_amdgpu_cs_init_functions(struct radv_amdgpu_winsys *ws)
ws->base.destroy_sem = radv_amdgpu_destroy_sem;
ws->base.create_syncobj = radv_amdgpu_create_syncobj;
ws->base.destroy_syncobj = radv_amdgpu_destroy_syncobj;
ws->base.reset_syncobj = radv_amdgpu_reset_syncobj;
ws->base.signal_syncobj = radv_amdgpu_signal_syncobj;
ws->base.wait_syncobj = radv_amdgpu_wait_syncobj;
ws->base.export_syncobj = radv_amdgpu_export_syncobj;
ws->base.import_syncobj = radv_amdgpu_import_syncobj;
ws->base.export_syncobj_to_sync_file = radv_amdgpu_export_syncobj_to_sync_file;
ws->base.import_syncobj_from_sync_file = radv_amdgpu_import_syncobj_from_sync_file;
ws->base.fence_wait = radv_amdgpu_fence_wait;
}
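The timeout handling in radv_amdgpu_wait_syncobj relies on clamping an unsigned Vulkan timeout into a signed range, as the comment in the hunk notes. A minimal standalone sketch of that clamp follows; the helper name is hypothetical and is not part of the patch, which uses MIN2(timeout, INT64_MAX) inline.

```c
#include <stdint.h>

/* Hypothetical helper, not in the patch: clamp an unsigned Vulkan timeout
 * (nanoseconds) so it can be passed to an interface that takes a signed
 * 64-bit timeout. UINT64_MAX ("wait forever" in Vulkan) becomes INT64_MAX
 * instead of wrapping to a negative value. */
static int64_t
sketch_clamp_vk_timeout(uint64_t timeout_ns)
{
    const uint64_t max_signed = (uint64_t)INT64_MAX;

    return (int64_t)(timeout_ns < max_signed ? timeout_ns : max_signed);
}
```

Without the clamp, a caller passing UINT64_MAX would hand a negative timeout to the signed interface, which is presumably why the patch clamps before calling amdgpu_cs_syncobj_wait.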


@@ -227,28 +227,19 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
val = ((ir_swizzle *)val)->val;
}
for (;;) {
if (val->ir_type == ir_type_dereference_array) {
val = ((ir_dereference_array *)val)->array;
} else if (val->ir_type == ir_type_dereference_record &&
!state->es_shader) {
val = ((ir_dereference_record *)val)->record;
} else
break;
while (val->ir_type == ir_type_dereference_array) {
val = ((ir_dereference_array *)val)->array;
}
ir_variable *var = NULL;
if (const ir_dereference_variable *deref_var = val->as_dereference_variable())
var = deref_var->variable_referenced();
if (!var || var->data.mode != ir_var_shader_in) {
if (!val->as_dereference_variable() ||
val->variable_referenced()->data.mode != ir_var_shader_in) {
_mesa_glsl_error(&loc, state,
"parameter `%s` must be a shader input",
formal->name);
return false;
}
var->data.must_be_shader_input = 1;
val->variable_referenced()->data.must_be_shader_input = 1;
}
/* Verify that 'out' and 'inout' actual parameters are lvalues. */
@@ -676,13 +667,8 @@ generate_array_index(void *mem_ctx, exec_list *instructions,
ir_variable *sub_var = NULL;
*function_name = array->primary_expression.identifier;
if (!match_subroutine_by_name(*function_name, actual_parameters,
state, &sub_var)) {
_mesa_glsl_error(&loc, state, "Unknown subroutine `%s'",
*function_name);
*function_name = NULL; /* indicate error condition to caller */
return NULL;
}
match_subroutine_by_name(*function_name, actual_parameters,
state, &sub_var);
ir_rvalue *outer_array_idx = idx->hir(instructions, state);
return new(mem_ctx) ir_dereference_array(sub_var, outer_array_idx);


@@ -90,9 +90,9 @@ static const struct gl_builtin_uniform_element gl_LightSource_elements[] = {
SWIZZLE_Y,
SWIZZLE_Z,
SWIZZLE_Z)},
{"spotExponent", {STATE_LIGHT, 0, STATE_ATTENUATION}, SWIZZLE_WWWW},
{"spotCutoff", {STATE_LIGHT, 0, STATE_SPOT_CUTOFF}, SWIZZLE_XXXX},
{"spotCosCutoff", {STATE_LIGHT, 0, STATE_SPOT_DIRECTION}, SWIZZLE_WWWW},
{"spotCutoff", {STATE_LIGHT, 0, STATE_SPOT_CUTOFF}, SWIZZLE_XXXX},
{"spotExponent", {STATE_LIGHT, 0, STATE_ATTENUATION}, SWIZZLE_WWWW},
{"constantAttenuation", {STATE_LIGHT, 0, STATE_ATTENUATION}, SWIZZLE_XXXX},
{"linearAttenuation", {STATE_LIGHT, 0, STATE_ATTENUATION}, SWIZZLE_YYYY},
{"quadraticAttenuation", {STATE_LIGHT, 0, STATE_ATTENUATION}, SWIZZLE_ZZZZ},


@@ -225,10 +225,12 @@ expanded_line:
glcpp_error(& @1, parser, "undefined macro %s in expression (illegal in GLES)", $2.undefined_macro);
_glcpp_parser_skip_stack_change_if (parser, & @1, "elif", $2.value);
}
| LINE_EXPANDED integer_constant NEWLINE {
| LINE_EXPANDED expression NEWLINE {
if (parser->is_gles && $2.undefined_macro)
glcpp_error(& @1, parser, "undefined macro %s in expression (illegal in GLES)", $2.undefined_macro);
parser->has_new_line_number = 1;
parser->new_line_number = $2;
_mesa_string_buffer_printf(parser->output, "#line %" PRIiMAX "\n", $2);
parser->new_line_number = $2.value;
_mesa_string_buffer_printf(parser->output, "#line %" PRIiMAX "\n", $2.value);
}
| LINE_EXPANDED integer_constant integer_constant NEWLINE {
parser->has_new_line_number = 1;
@@ -239,6 +241,19 @@ expanded_line:
"#line %" PRIiMAX " %" PRIiMAX "\n",
$2, $3);
}
| LINE_EXPANDED '(' expression ')' '(' expression ')' NEWLINE {
if (parser->is_gles && $3.undefined_macro)
glcpp_error(& @1, parser, "undefined macro %s in expression (illegal in GLES)", $3.undefined_macro);
if (parser->is_gles && $6.undefined_macro)
glcpp_error(& @1, parser, "undefined macro %s in expression (illegal in GLES)", $6.undefined_macro);
parser->has_new_line_number = 1;
parser->new_line_number = $3.value;
parser->has_new_source_number = 1;
parser->new_source_number = $6.value;
_mesa_string_buffer_printf(parser->output,
"#line %" PRIiMAX " %" PRIiMAX "\n",
$3.value, $6.value);
}
;
define:


@@ -1863,49 +1863,6 @@ set_shader_inout_layout(struct gl_shader *shader,
shader->bound_image = state->bound_image_specified;
}
/* src can be NULL if only the symbols found in the exec_list should be
* copied
*/
void
_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir,
struct glsl_symbol_table *src,
struct glsl_symbol_table *dest)
{
foreach_in_list (ir_instruction, ir, shader_ir) {
switch (ir->ir_type) {
case ir_type_function:
dest->add_function((ir_function *) ir);
break;
case ir_type_variable: {
ir_variable *const var = (ir_variable *) ir;
if (var->data.mode != ir_var_temporary)
dest->add_variable(var);
break;
}
default:
break;
}
}
if (src != NULL) {
/* Explicitly copy the gl_PerVertex interface definitions because these
* are needed to check they are the same during the interstage link.
* They can't necessarily be found via the exec_list because the members
* might not be referenced. The GL spec still requires that they match
* in that case.
*/
const glsl_type *iface =
src->get_interface("gl_PerVertex", ir_var_shader_in);
if (iface)
dest->add_interface(iface->name, iface, ir_var_shader_in);
iface = src->get_interface("gl_PerVertex", ir_var_shader_out);
if (iface)
dest->add_interface(iface->name, iface, ir_var_shader_out);
}
}
extern "C" {
static void
@@ -1979,7 +1936,6 @@ do_late_parsing_checks(struct _mesa_glsl_parse_state *state)
static void
opt_shader_and_create_symbol_table(struct gl_context *ctx,
struct glsl_symbol_table *source_symbols,
struct gl_shader *shader)
{
assert(shader->CompileStatus != compile_failure &&
@@ -2037,8 +1993,22 @@ opt_shader_and_create_symbol_table(struct gl_context *ctx,
* We don't have to worry about types or interface-types here because those
* are fly-weights that are looked up by glsl_type.
*/
_mesa_glsl_copy_symbols_from_table(shader->ir, source_symbols,
shader->symbols);
foreach_in_list (ir_instruction, ir, shader->ir) {
switch (ir->ir_type) {
case ir_type_function:
shader->symbols->add_function((ir_function *) ir);
break;
case ir_type_variable: {
ir_variable *const var = (ir_variable *) ir;
if (var->data.mode != ir_var_temporary)
shader->symbols->add_variable(var);
break;
}
default:
break;
}
}
}
void
@@ -2075,9 +2045,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
return;
if (shader->CompileStatus == compiled_no_opts) {
opt_shader_and_create_symbol_table(ctx,
NULL, /* source_symbols */
shader);
opt_shader_and_create_symbol_table(ctx, shader);
shader->CompileStatus = compile_success;
return;
}
@@ -2138,7 +2106,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
lower_subroutine(shader->ir, state);
if (!ctx->Cache || force_recompile)
opt_shader_and_create_symbol_table(ctx, state->symbols, shader);
opt_shader_and_create_symbol_table(ctx, shader);
else {
reparent_ir(shader->ir, shader->ir);
shader->CompileStatus = compiled_no_opts;
@@ -2251,24 +2219,6 @@ do_common_optimization(exec_list *ir, bool linked,
loop_progress = false;
loop_progress |= do_constant_propagation(ir);
loop_progress |= do_if_simplification(ir);
/* Some drivers only call do_common_optimization() once rather
* than in a loop. So we must call do_lower_jumps() after
* unrolling a loop because for drivers that use LLVM validation
* will fail if a jump is not the last instruction in the block.
* For example the following will fail LLVM validation:
*
* (loop (
* ...
* break
* (assign (x) (var_ref v124) (expression int + (var_ref v124)
* (constant int (1)) ) )
* ))
*/
loop_progress |= do_lower_jumps(ir, true, true,
options->EmitNoMainReturn,
options->EmitNoCont,
options->EmitNoLoops);
}
progress |= loop_progress;
}


@@ -948,11 +948,6 @@ extern int glcpp_preprocess(void *ctx, const char **shader, char **info_log,
extern void _mesa_destroy_shader_compiler(void);
extern void _mesa_destroy_shader_compiler_caches(void);
extern void
_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir,
struct glsl_symbol_table *src,
struct glsl_symbol_table *dest);
#ifdef __cplusplus
}
#endif


@@ -364,35 +364,6 @@ validate_interstage_inout_blocks(struct gl_shader_program *prog,
consumer->Stage != MESA_SHADER_FRAGMENT) ||
consumer->Stage == MESA_SHADER_GEOMETRY;
/* Check that block re-declarations of gl_PerVertex are compatible
* across shaders: From OpenGL Shading Language 4.5, section
* "7.1 Built-In Language Variables", page 130 of the PDF:
*
* "If multiple shaders using members of a built-in block belonging
* to the same interface are linked together in the same program,
* they must all redeclare the built-in block in the same way, as
* described in section 4.3.9 “Interface Blocks” for interface-block
* matching, or a link-time error will result."
*
* This is done explicitly outside of iterating the member variable
* declarations because it is possible that the variables are not used and
* so they would have been optimised out.
*/
const glsl_type *consumer_iface =
consumer->symbols->get_interface("gl_PerVertex",
ir_var_shader_in);
const glsl_type *producer_iface =
producer->symbols->get_interface("gl_PerVertex",
ir_var_shader_out);
if (producer_iface && consumer_iface &&
interstage_member_mismatch(prog, consumer_iface, producer_iface)) {
linker_error(prog, "Incompatible or missing gl_PerVertex re-declaration "
"in consecutive shaders");
return;
}
/* Add output interfaces from the producer to the symbol table. */
foreach_in_list(ir_instruction, node, producer->ir) {
ir_variable *var = node->as_variable();


@@ -637,6 +637,9 @@ private:
this->record_next_sampler))
return;
/* Avoid overflowing the sampler array. (crbug.com/141901) */
this->next_sampler = MIN2(this->next_sampler, MAX_SAMPLERS);
for (unsigned i = uniform->opaque[shader_type].index;
i < MIN2(this->next_sampler, MAX_SAMPLERS);
i++) {
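The sampler-array guard in this hunk clamps the running sampler cursor to the array capacity before iterating, so the loop can never write past the end. A minimal sketch of that pattern follows, assuming a hypothetical fixed-size table; the struct, function, and capacity below are illustrative and are not the linker's actual types.

```c
#include <stddef.h>

/* Hypothetical capacity; the real MAX_SAMPLERS value is defined by Mesa. */
#define SKETCH_MAX_SAMPLERS 32u

/* Hypothetical stand-in for the per-stage sampler target array. */
struct sketch_sampler_table {
    unsigned targets[SKETCH_MAX_SAMPLERS];
};

/* Record `target` for samplers [first, next_sampler), clamping the upper
 * bound to the table capacity. Returns how many slots were written. */
static unsigned
sketch_mark_samplers(struct sketch_sampler_table *table,
                     unsigned first, unsigned next_sampler, unsigned target)
{
    const unsigned end = next_sampler < SKETCH_MAX_SAMPLERS
                             ? next_sampler
                             : SKETCH_MAX_SAMPLERS;
    unsigned written = 0;

    for (unsigned i = first; i < end; i++) {
        table->targets[i] = target;
        written++;
    }
    return written;
}
```

When the requested range runs past the capacity, only the in-bounds slots are written; when it starts beyond the capacity, nothing is written at all.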


@@ -165,12 +165,10 @@ process_xfb_layout_qualifiers(void *mem_ctx, const gl_linked_shader *sh,
if (var->data.from_named_ifc_block) {
type = var->get_interface_type();
/* Find the member type before it was altered by lowering */
const glsl_type *type_wa = type->without_array();
member_type =
type_wa->fields.structure[type_wa->field_index(var->name)].type;
name = ralloc_strdup(NULL, type_wa->name);
type->fields.structure[type->field_index(var->name)].type;
name = ralloc_strdup(NULL, type->without_array()->name);
} else {
type = var->type;
member_type = NULL;
@@ -191,8 +189,7 @@ process_xfb_layout_qualifiers(void *mem_ctx, const gl_linked_shader *sh,
* matching input to another stage.
*/
static void
cross_validate_types_and_qualifiers(struct gl_context *ctx,
struct gl_shader_program *prog,
cross_validate_types_and_qualifiers(struct gl_shader_program *prog,
const ir_variable *input,
const ir_variable *output,
gl_shader_stage consumer_stage,
@@ -346,30 +343,17 @@ cross_validate_types_and_qualifiers(struct gl_context *ctx,
}
if (input_interpolation != output_interpolation &&
prog->data->Version < 440) {
if (!ctx->Const.AllowGLSLCrossStageInterpolationMismatch) {
linker_error(prog,
"%s shader output `%s' specifies %s "
"interpolation qualifier, "
"but %s shader input specifies %s "
"interpolation qualifier\n",
_mesa_shader_stage_to_string(producer_stage),
output->name,
interpolation_string(output->data.interpolation),
_mesa_shader_stage_to_string(consumer_stage),
interpolation_string(input->data.interpolation));
return;
} else {
linker_warning(prog,
"%s shader output `%s' specifies %s "
"interpolation qualifier, "
"but %s shader input specifies %s "
"interpolation qualifier\n",
_mesa_shader_stage_to_string(producer_stage),
output->name,
interpolation_string(output->data.interpolation),
_mesa_shader_stage_to_string(consumer_stage),
interpolation_string(input->data.interpolation));
}
linker_error(prog,
"%s shader output `%s' specifies %s "
"interpolation qualifier, "
"but %s shader input specifies %s "
"interpolation qualifier\n",
_mesa_shader_stage_to_string(producer_stage),
output->name,
interpolation_string(output->data.interpolation),
_mesa_shader_stage_to_string(consumer_stage),
interpolation_string(input->data.interpolation));
return;
}
}
@@ -377,8 +361,7 @@ cross_validate_types_and_qualifiers(struct gl_context *ctx,
* Validate front and back color outputs against single color input
*/
static void
cross_validate_front_and_back_color(struct gl_context *ctx,
struct gl_shader_program *prog,
cross_validate_front_and_back_color(struct gl_shader_program *prog,
const ir_variable *input,
const ir_variable *front_color,
const ir_variable *back_color,
@@ -386,11 +369,11 @@ cross_validate_front_and_back_color(struct gl_context *ctx,
gl_shader_stage producer_stage)
{
if (front_color != NULL && front_color->data.assigned)
cross_validate_types_and_qualifiers(ctx, prog, input, front_color,
cross_validate_types_and_qualifiers(prog, input, front_color,
consumer_stage, producer_stage);
if (back_color != NULL && back_color->data.assigned)
cross_validate_types_and_qualifiers(ctx, prog, input, back_color,
cross_validate_types_and_qualifiers(prog, input, back_color,
consumer_stage, producer_stage);
}
@@ -543,7 +526,7 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
const ir_variable *const back_color =
parameters.get_variable("gl_BackColor");
cross_validate_front_and_back_color(ctx, prog, input,
cross_validate_front_and_back_color(prog, input,
front_color, back_color,
consumer->Stage, producer->Stage);
} else if (strcmp(input->name, "gl_SecondaryColor") == 0 && input->data.used) {
@@ -553,7 +536,7 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
const ir_variable *const back_color =
parameters.get_variable("gl_BackSecondaryColor");
cross_validate_front_and_back_color(ctx, prog, input,
cross_validate_front_and_back_color(prog, input,
front_color, back_color,
consumer->Stage, producer->Stage);
} else {
@@ -596,7 +579,7 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
*/
if (!(input->get_interface_type() &&
output->get_interface_type()))
cross_validate_types_and_qualifiers(ctx, prog, input, output,
cross_validate_types_and_qualifiers(prog, input, output,
consumer->Stage,
producer->Stage);
} else {

View File

@@ -1111,10 +1111,15 @@ cross_validate_globals(struct gl_shader_program *prog,
return;
}
/* Check the precision qualifier matches for uniform variables on
* GLSL ES.
/* Only in GLSL ES 3.10, the precision qualifier need not match
* between block members defined in matched block names within a
* shader interface.
*
* In GLSL ES 3.00 and ES 3.20, the precision qualifier for each block
* member must match.
*/
if (prog->IsES && !var->get_interface_type() &&
if (prog->IsES && (prog->data->Version != 310 ||
!var->get_interface_type()) &&
existing->data.precision != var->data.precision) {
if ((existing->data.used && var->data.used) || prog->data->Version >= 300) {
linker_error(prog, "declarations for %s `%s` have "
@@ -1256,11 +1261,21 @@ interstage_cross_validate_uniform_blocks(struct gl_shader_program *prog,
* Populates a shader's symbol table with all global declarations
*/
static void
populate_symbol_table(gl_linked_shader *sh, glsl_symbol_table *symbols)
populate_symbol_table(gl_linked_shader *sh)
{
sh->symbols = new(sh) glsl_symbol_table;
_mesa_glsl_copy_symbols_from_table(sh->ir, symbols, sh->symbols);
foreach_in_list(ir_instruction, inst, sh->ir) {
ir_variable *var;
ir_function *func;
if ((func = inst->as_function()) != NULL) {
sh->symbols->add_function(func);
} else if ((var = inst->as_variable()) != NULL) {
if (var->data.mode != ir_var_temporary)
sh->symbols->add_variable(var);
}
}
}
@@ -2278,7 +2293,7 @@ link_intrastage_shaders(void *mem_ctx,
link_bindless_layout_qualifiers(prog, shader_list, num_shaders);
populate_symbol_table(linked, shader_list[0]->symbols);
populate_symbol_table(linked);
/* The pointer to the main function in the final linked shader (i.e., the
* copy of the original shader that contained the main function).

View File

@@ -519,7 +519,7 @@ loop_unroll_visitor::visit_leave(ir_loop *ir)
* isn't any additional unknown terminators, or any other jumps nested
* inside further ifs.
*/
if (ls->num_loop_jumps != 2 || ls->terminators.length() != 2)
if (ls->num_loop_jumps != 2)
return visit_continue;
ir_instruction *first_ir =
@@ -528,6 +528,8 @@ loop_unroll_visitor::visit_leave(ir_loop *ir)
unsigned term_count = 0;
bool first_term_then_continue = false;
foreach_in_list(loop_terminator, t, &ls->terminators) {
assert(term_count < 2);
ir_if *ir_if = t->ir->as_if();
assert(ir_if != NULL);

View File

@@ -72,22 +72,16 @@ lower_buffer_access::emit_access(void *mem_ctx,
new(mem_ctx) ir_dereference_record(deref->clone(mem_ctx, NULL),
field->name);
unsigned field_align;
if (packing == GLSL_INTERFACE_PACKING_STD430)
field_align = field->type->std430_base_alignment(row_major);
else
field_align = field->type->std140_base_alignment(row_major);
field_offset = glsl_align(field_offset, field_align);
field_offset =
glsl_align(field_offset,
field->type->std140_base_alignment(row_major));
emit_access(mem_ctx, is_write, field_deref, base_offset,
deref_offset + field_offset,
row_major, 1, packing,
writemask_for_size(field_deref->type->vector_elements));
if (packing == GLSL_INTERFACE_PACKING_STD430)
field_offset += field->type->std430_size(row_major);
else
field_offset += field->type->std140_size(row_major);
field_offset += field->type->std140_size(row_major);
}
return;
}

View File

@@ -115,7 +115,6 @@ public:
void run(exec_list *instructions);
virtual ir_visitor_status visit_leave(ir_assignment *);
virtual ir_visitor_status visit_leave(ir_expression *);
virtual void handle_rvalue(ir_rvalue **rvalue);
};
@@ -239,23 +238,6 @@ flatten_named_interface_blocks_declarations::visit_leave(ir_assignment *ir)
return rvalue_visit(ir);
}
ir_visitor_status
flatten_named_interface_blocks_declarations::visit_leave(ir_expression *ir)
{
ir_visitor_status status = rvalue_visit(ir);
if (ir->operation == ir_unop_interpolate_at_centroid ||
ir->operation == ir_binop_interpolate_at_offset ||
ir->operation == ir_binop_interpolate_at_sample) {
const ir_rvalue *val = ir->operands[0];
/* This disables varying packing for this input. */
val->variable_referenced()->data.must_be_shader_input = 1;
}
return status;
}
void
flatten_named_interface_blocks_declarations::handle_rvalue(ir_rvalue **rvalue)
{

View File

@@ -128,36 +128,7 @@ ir_vec_index_to_cond_assign_visitor::convert_vector_extract_to_cond_assign(ir_rv
{
ir_expression *const expr = ir->as_expression();
if (expr == NULL)
return ir;
if (expr->operation == ir_unop_interpolate_at_centroid ||
expr->operation == ir_binop_interpolate_at_offset ||
expr->operation == ir_binop_interpolate_at_sample) {
/* Lower interpolateAtXxx(some_vec[idx], ...) to
* interpolateAtXxx(some_vec, ...)[idx] before lowering to conditional
* assignments, to maintain the rule that the interpolant is an l-value
* referring to a (part of a) shader input.
*
* This is required when idx is dynamic (otherwise it gets lowered to
* a swizzle).
*/
ir_expression *const interpolant = expr->operands[0]->as_expression();
if (!interpolant || interpolant->operation != ir_binop_vector_extract)
return ir;
ir_rvalue *vec_input = interpolant->operands[0];
ir_expression *const vec_interpolate =
new(base_ir) ir_expression(expr->operation, vec_input->type,
vec_input, expr->operands[1]);
return convert_vec_index_to_cond_assign(ralloc_parent(ir),
vec_interpolate,
interpolant->operands[1],
ir->type);
}
if (expr->operation != ir_binop_vector_extract)
if (expr == NULL || expr->operation != ir_binop_vector_extract)
return ir;
return convert_vec_index_to_cond_assign(ralloc_parent(ir),

View File

@@ -628,7 +628,7 @@ TEST_F(array_refcount_test, visit_array_indexing_an_array)
ir_array_refcount_entry *const entry_c = v.get_variable_entry(var_c);
for (int i = 0; i < var_c->type->array_size(); i++) {
for (unsigned i = 0; i < var_c->type->array_size(); i++) {
EXPECT_EQ(true, entry_c->is_linearized_index_referenced(i)) <<
"array c, i = " << i;
}

View File

@@ -198,22 +198,6 @@ generate_array_data(void *mem_ctx, enum glsl_base_type base_type,
val = new(mem_ctx) ir_constant(array_type, &values_for_array);
}
static uint64_t
uint64_storage(union gl_constant_value *storage)
{
uint64_t val;
memcpy(&val, &storage->i, sizeof(uint64_t));
return val;
}
static uint64_t
double_storage(union gl_constant_value *storage)
{
double val;
memcpy(&val, &storage->i, sizeof(double));
return val;
}
/**
* Verify that the data stored for the uniform matches the initializer
*
@@ -262,13 +246,13 @@ verify_data(gl_constant_value *storage, unsigned storage_array_size,
EXPECT_EQ(val->value.b[i] ? boolean_true : 0, storage[i].i);
break;
case GLSL_TYPE_DOUBLE:
EXPECT_EQ(val->value.d[i], double_storage(&storage[i*2]));
EXPECT_EQ(val->value.d[i], *(double *)&storage[i*2].i);
break;
case GLSL_TYPE_UINT64:
EXPECT_EQ(val->value.u64[i], uint64_storage(&storage[i*2]));
EXPECT_EQ(val->value.u64[i], *(uint64_t *)&storage[i*2].i);
break;
case GLSL_TYPE_INT64:
EXPECT_EQ(val->value.i64[i], uint64_storage(&storage[i*2]));
EXPECT_EQ(val->value.i64[i], *(int64_t *)&storage[i*2].i);
break;
case GLSL_TYPE_ATOMIC_UINT:
case GLSL_TYPE_STRUCT:
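
The hunk above switches between two ways of reading a 64-bit value out of consecutive 32-bit storage slots: a `memcpy`-based helper and a direct pointer cast. A minimal C sketch of the `memcpy` approach follows. The `example_slot` union and both `example_load_*` functions are hypothetical stand-ins written for illustration; they are not Mesa's `gl_constant_value` or the helpers in the diff. Unlike the `double_storage` helper shown in the hunk, the sketch's double loader returns `double`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit storage slot (assumption, not Mesa's union). */
union example_slot {
   int32_t i;
   uint32_t u;
   float f;
};

/* Read a 64-bit unsigned value spanning two consecutive slots.
 * memcpy avoids dereferencing a pointer of an incompatible type. */
static uint64_t
example_load_u64(const union example_slot *slots)
{
   uint64_t val;
   memcpy(&val, slots, sizeof(val));
   return val;
}

/* Read a double spanning two consecutive slots. */
static double
example_load_double(const union example_slot *slots)
{
   double val;
   memcpy(&val, slots, sizeof(val));
   return val;
}
```

With this shape, a test comparison would read `example_load_double(&storage[i * 2])` rather than casting the address of the slot's `i` member to `double *`.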

View File

@@ -41,9 +41,9 @@
#include "compiler/shader_info.h"
#include <stdio.h>
#ifndef NDEBUG
#ifdef DEBUG
#include "util/debug.h"
#endif /* NDEBUG */
#endif /* DEBUG */
#include "nir_opcodes.h"
@@ -2341,7 +2341,7 @@ static inline void nir_metadata_set_validation_flag(nir_shader *shader) { (void)
static inline void nir_metadata_check_validation_flag(nir_shader *shader) { (void) shader; }
static inline bool should_clone_nir(void) { return false; }
static inline bool should_print_nir(void) { return false; }
#endif /* NDEBUG */
#endif /* DEBUG */
#define _PASS(nir, do_pass) do { \
do_pass \

View File

@@ -436,7 +436,7 @@ LOAD(ssbo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/* src[] = { offset }. const_index[] = { base, component } */
LOAD(output, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/* src[] = { vertex, offset }. const_index[] = { base, component } */
LOAD(per_vertex_output, 2, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
LOAD(per_vertex_output, 2, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/* src[] = { offset }. const_index[] = { base } */
LOAD(shared, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/* src[] = { offset }. const_index[] = { base, range } */

View File

@@ -95,15 +95,9 @@ emit_load_store(nir_builder *b, nir_intrinsic_instr *orig_instr,
if (src == NULL) {
/* This is a load instruction */
nir_intrinsic_instr *load =
nir_intrinsic_instr_create(b->shader, orig_instr->intrinsic);
nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_var);
load->num_components = orig_instr->num_components;
load->variables[0] = nir_deref_var_clone(deref, load);
/* Copy over any sources. This is needed for interp_var_at */
for (unsigned i = 0;
i < nir_intrinsic_infos[orig_instr->intrinsic].num_srcs; i++)
nir_src_copy(&load->src[i], &orig_instr->src[i], load);
unsigned bit_size = orig_instr->dest.ssa.bit_size;
nir_ssa_dest_init(&load->instr, &load->dest,
load->num_components, bit_size, NULL);
@@ -148,9 +142,6 @@ lower_indirect_block(nir_block *block, nir_builder *b,
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
if (intrin->intrinsic != nir_intrinsic_load_var &&
intrin->intrinsic != nir_intrinsic_interp_var_at_centroid &&
intrin->intrinsic != nir_intrinsic_interp_var_at_sample &&
intrin->intrinsic != nir_intrinsic_interp_var_at_offset &&
intrin->intrinsic != nir_intrinsic_store_var)
continue;
@@ -167,7 +158,7 @@ lower_indirect_block(nir_block *block, nir_builder *b,
b->cursor = nir_before_instr(&intrin->instr);
if (intrin->intrinsic != nir_intrinsic_store_var) {
if (intrin->intrinsic == nir_intrinsic_load_var) {
nir_ssa_def *result;
emit_load_store(b, intrin, intrin->variables[0],
&intrin->variables[0]->deref, &result, NULL);

View File

@@ -464,7 +464,7 @@ lower_copies_to_load_store(struct deref_node *node,
struct set_entry *arg_entry = _mesa_set_search(arg_node->copies, copy);
assert(arg_entry);
_mesa_set_remove(arg_node->copies, arg_entry);
_mesa_set_remove(node->copies, arg_entry);
}
nir_instr_remove(&copy->instr);

View File

@@ -230,7 +230,6 @@ lower_vec_to_movs_block(nir_block *block, nir_function_impl *impl)
continue; /* The loop */
}
bool vec_had_ssa_dest = vec->dest.dest.is_ssa;
if (vec->dest.dest.is_ssa) {
/* Since we insert multiple MOVs, we have a register destination. */
nir_register *reg = nir_local_reg_create(impl);
@@ -264,11 +263,7 @@ lower_vec_to_movs_block(nir_block *block, nir_function_impl *impl)
if (!(vec->dest.write_mask & (1 << i)))
continue;
/* Coalescing moves the register writes from the vec up to the ALU
* instruction in the source. We can only do this if the original
* vecN had an SSA destination.
*/
if (vec_had_ssa_dest && !(finished_write_mask & (1 << i)))
if (!(finished_write_mask & (1 << i)))
finished_write_mask |= try_coalesce(vec, i);
if (!(finished_write_mask & (1 << i)))

View File

@@ -59,7 +59,7 @@ nir_metadata_preserve(nir_function_impl *impl, nir_metadata preserved)
impl->valid_metadata &= preserved;
}
#ifndef NDEBUG
#ifdef DEBUG
/**
* Make sure passes properly invalidate metadata (part 1).
*

View File

@@ -397,8 +397,8 @@ binop("umul_high", tuint32, commutative,
"(uint32_t)(((uint64_t) src0 * (uint64_t) src1) >> 32)")
binop("fdiv", tfloat, "", "src0 / src1")
binop("idiv", tint, "", "src1 == 0 ? 0 : (src0 / src1)")
binop("udiv", tuint, "", "src1 == 0 ? 0 : (src0 / src1)")
binop("idiv", tint, "", "src0 / src1")
binop("udiv", tuint, "", "src0 / src1")
# returns a boolean representing the carry resulting from the addition of
# the two unsigned arguments.
@@ -717,12 +717,12 @@ opcode("bitfield_insert", 0, tuint32, [0, 0, 0, 0],
unsigned base = src0, insert = src1;
int offset = src2, bits = src3;
if (bits == 0) {
dst = base;
dst = 0;
} else if (offset < 0 || bits < 0 || bits + offset > 32) {
dst = 0;
} else {
unsigned mask = ((1ull << bits) - 1) << offset;
dst = (base & ~mask) | ((insert << offset) & mask);
dst = (base & ~mask) | ((insert << bits) & mask);
}
""")
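
The `bitfield_insert` hunk above shows two variants of the zero-width case (`dst = base` and `dst = 0`) and two of the shift amount (`offset` and `bits`). For comparison, here is a minimal C sketch of the GLSL `bitfieldInsert()` semantics, under which a zero `bits` returns `base` unchanged and `insert` is shifted by `offset`. The function name `example_bitfield_insert` is hypothetical; this is an illustration, not the NIR constant-folding code.

```c
#include <stdint.h>

/* Illustrative reference for 32-bit bitfieldInsert(); not Mesa code. */
static uint32_t
example_bitfield_insert(uint32_t base, uint32_t insert, int offset, int bits)
{
   if (bits == 0)
      return base;   /* nothing to insert */

   /* Out-of-range arguments give an undefined result in GLSL; this
    * sketch returns 0. The comparison is written to avoid signed
    * overflow in bits + offset. */
   if (offset < 0 || bits < 0 || bits > 32 - offset)
      return 0;

   /* 64-bit intermediate so that bits == 32 does not overflow the shift. */
   uint32_t mask = (uint32_t)(((1ull << bits) - 1) << offset);

   /* Bits [offset, offset + bits - 1] come from the low bits of insert;
    * every other bit is taken from base. */
   return (base & ~mask) | ((insert << offset) & mask);
}
```

For example, inserting the low 8 bits of `0xAB` at offset 4 into `0xFFFF0000` yields `0xFFFF0AB0`.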

View File

@@ -39,10 +39,10 @@
#define LOOP_UNROLL_LIMIT 96
/* Prepare this loop for unrolling by first converting to lcssa and then
* converting the phis from the top level of the loop body to regs.
* Partially converting out of SSA allows us to unroll the loop without having
* to keep track of and update phis along the way which gets tricky and
* doesn't add much value over converting to regs.
* converting the phis from the loop's first block and the block that follows
* the loop into regs. Partially converting out of SSA allows us to unroll
* the loop without having to keep track of and update phis along the way
* which gets tricky and doesn't add much value over converting to regs.
*
* The loop may have a continue instruction at the end of the loop which does
* nothing. Once we're out of SSA, we can safely delete it so we don't have
@@ -53,20 +53,13 @@ loop_prepare_for_unroll(nir_loop *loop)
{
nir_convert_loop_to_lcssa(loop);
/* Lower phis at the top level of the loop body */
foreach_list_typed_safe(nir_cf_node, node, node, &loop->body) {
if (nir_cf_node_block == node->type) {
nir_lower_phis_to_regs_block(nir_cf_node_as_block(node));
}
}
nir_lower_phis_to_regs_block(nir_loop_first_block(loop));
/* Lower phis after the loop */
nir_block *block_after_loop =
nir_cf_node_as_block(nir_cf_node_next(&loop->cf_node));
nir_lower_phis_to_regs_block(block_after_loop);
/* Remove continue if it's the last instruction in the loop */
nir_instr *last_instr = nir_block_last_instr(nir_loop_last_block(loop));
if (last_instr && last_instr->type == nir_instr_type_jump) {
assert(nir_instr_as_jump(last_instr)->type == nir_jump_continue);

View File

@@ -35,7 +35,7 @@
/* Since this file is just a pile of asserts, don't bother compiling it if
* we're not building a debug build.
*/
#ifndef NDEBUG
#ifdef DEBUG
/*
* Per-register validation state.

View File

@@ -42,24 +42,34 @@ struct nir_spirv_specialization {
};
};
struct nir_spirv_supported_extensions {
bool float64;
bool image_ms_array;
bool tessellation;
bool draw_parameters;
bool image_read_without_format;
bool image_write_without_format;
bool int64;
bool multiview;
bool variable_pointers;
struct spirv_to_nir_options {
/* Whether or not to lower all workgroup variable access to offsets
* up-front. This means you will get _shared intrinsics instead of _var
* for workgroup data access.
*
* This is currently required for full variable pointers support.
*/
bool lower_workgroup_access_to_offsets;
struct {
bool float64;
bool image_ms_array;
bool tessellation;
bool draw_parameters;
bool image_read_without_format;
bool image_write_without_format;
bool int64;
bool multiview;
bool variable_pointers;
} caps;
};
nir_function *spirv_to_nir(const uint32_t *words, size_t word_count,
struct nir_spirv_specialization *specializations,
unsigned num_specializations,
gl_shader_stage stage, const char *entry_point_name,
const struct nir_spirv_supported_extensions *ext,
const nir_shader_compiler_options *options);
const struct spirv_to_nir_options *options,
const nir_shader_compiler_options *nir_options);
#ifdef __cplusplus
}

View File

@@ -117,7 +117,7 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant *constant,
load->value = constant->values[0];
nir_instr_insert_before_cf_list(&b->impl->body, &load->instr);
nir_instr_insert_before_cf_list(&b->nb.impl->body, &load->instr);
val->def = &load->def;
} else {
assert(glsl_type_is_matrix(type));
@@ -133,7 +133,7 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant *constant,
load->value = constant->values[i];
nir_instr_insert_before_cf_list(&b->impl->body, &load->instr);
nir_instr_insert_before_cf_list(&b->nb.impl->body, &load->instr);
col_val->def = &load->def;
val->elems[i] = col_val;
@@ -729,6 +729,64 @@ translate_image_format(SpvImageFormat format)
}
}
static struct vtn_type *
vtn_type_layout_std430(struct vtn_builder *b, struct vtn_type *type,
uint32_t *size_out, uint32_t *align_out)
{
switch (type->base_type) {
case vtn_base_type_scalar: {
uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
*size_out = comp_size;
*align_out = comp_size;
return type;
}
case vtn_base_type_vector: {
uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
assert(type->length > 0 && type->length <= 4);
unsigned align_comps = type->length == 3 ? 4 : type->length;
*size_out = comp_size * type->length,
*align_out = comp_size * align_comps;
return type;
}
case vtn_base_type_matrix:
case vtn_base_type_array: {
/* We're going to add an array stride */
type = vtn_type_copy(b, type);
uint32_t elem_size, elem_align;
type->array_element = vtn_type_layout_std430(b, type->array_element,
&elem_size, &elem_align);
type->stride = vtn_align_u32(elem_size, elem_align);
*size_out = type->stride * type->length;
*align_out = elem_align;
return type;
}
case vtn_base_type_struct: {
/* We're going to add member offsets */
type = vtn_type_copy(b, type);
uint32_t offset = 0;
uint32_t align = 0;
for (unsigned i = 0; i < type->length; i++) {
uint32_t mem_size, mem_align;
type->members[i] = vtn_type_layout_std430(b, type->members[i],
&mem_size, &mem_align);
offset = vtn_align_u32(offset, mem_align);
type->offsets[i] = offset;
offset += mem_size;
align = MAX2(align, mem_align);
}
*size_out = offset;
*align_out = align;
return type;
}
default:
unreachable("Invalid SPIR-V type for std430");
}
}
static void
vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
const uint32_t *w, unsigned count)
@@ -878,6 +936,19 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
*/
val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
}
if (storage_class == SpvStorageClassWorkgroup &&
b->options->lower_workgroup_access_to_offsets) {
uint32_t size, align;
val->type->deref = vtn_type_layout_std430(b, val->type->deref,
&size, &align);
val->type->length = size;
val->type->align = align;
/* These can actually be stored to nir_variables and used as SSA
* values so they need a real glsl_type.
*/
val->type->type = glsl_uint_type();
}
break;
}
@@ -1394,8 +1465,11 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode,
const uint32_t *w, unsigned count)
{
struct vtn_type *res_type = vtn_value(b, w[1], vtn_value_type_type)->type;
struct nir_function *callee =
vtn_value(b, w[3], vtn_value_type_function)->func->impl->function;
struct vtn_function *vtn_callee =
vtn_value(b, w[3], vtn_value_type_function)->func;
struct nir_function *callee = vtn_callee->impl->function;
vtn_callee->referenced = true;
nir_call_instr *call = nir_call_instr_create(b->nb.shader, callee);
for (unsigned i = 0; i < call->num_params; i++) {
@@ -1410,7 +1484,7 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode,
/* Make a temporary to store the argument in */
nir_variable *tmp =
nir_local_variable_create(b->impl, arg_ssa->type, "arg_tmp");
nir_local_variable_create(b->nb.impl, arg_ssa->type, "arg_tmp");
call->params[i] = nir_deref_var_create(call, tmp);
vtn_local_store(b, arg_ssa, call->params[i]);
@@ -1420,7 +1494,7 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode,
nir_variable *out_tmp = NULL;
assert(res_type->type == callee->return_type);
if (!glsl_type_is_void(callee->return_type)) {
out_tmp = nir_local_variable_create(b->impl, callee->return_type,
out_tmp = nir_local_variable_create(b->nb.impl, callee->return_type,
"out_tmp");
call->return_deref = nir_deref_var_create(call, out_tmp);
}
@@ -1526,6 +1600,7 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
const struct glsl_type *image_type = sampled.type->type;
const enum glsl_sampler_dim sampler_dim = glsl_get_sampler_dim(image_type);
const bool is_array = glsl_sampler_type_is_array(image_type);
const bool is_shadow = glsl_sampler_type_is_shadow(image_type);
/* Figure out the base texture operation */
nir_texop texop;
@@ -1649,7 +1724,6 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
break;
}
bool is_shadow = false;
unsigned gather_component = 0;
switch (opcode) {
case SpvOpImageSampleDrefImplicitLod:
@@ -1658,7 +1732,6 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
case SpvOpImageSampleProjDrefExplicitLod:
case SpvOpImageDrefGather:
/* These all have an explicit depth value as their next source */
is_shadow = true;
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_comparator);
break;
@@ -2099,6 +2172,32 @@ get_ssbo_nir_atomic_op(SpvOp opcode)
static nir_intrinsic_op
get_shared_nir_atomic_op(SpvOp opcode)
{
switch (opcode) {
case SpvOpAtomicLoad: return nir_intrinsic_load_shared;
case SpvOpAtomicStore: return nir_intrinsic_store_shared;
#define OP(S, N) case SpvOp##S: return nir_intrinsic_shared_##N;
OP(AtomicExchange, atomic_exchange)
OP(AtomicCompareExchange, atomic_comp_swap)
OP(AtomicIIncrement, atomic_add)
OP(AtomicIDecrement, atomic_add)
OP(AtomicIAdd, atomic_add)
OP(AtomicISub, atomic_add)
OP(AtomicSMin, atomic_imin)
OP(AtomicUMin, atomic_umin)
OP(AtomicSMax, atomic_imax)
OP(AtomicUMax, atomic_umax)
OP(AtomicAnd, atomic_and)
OP(AtomicOr, atomic_or)
OP(AtomicXor, atomic_xor)
#undef OP
default:
unreachable("Invalid shared atomic");
}
}
static nir_intrinsic_op
get_var_nir_atomic_op(SpvOp opcode)
{
switch (opcode) {
case SpvOpAtomicLoad: return nir_intrinsic_load_var;
@@ -2162,10 +2261,11 @@ vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, SpvOp opcode,
SpvMemorySemanticsMask semantics = w[5];
*/
if (ptr->mode == vtn_variable_mode_workgroup) {
if (ptr->mode == vtn_variable_mode_workgroup &&
!b->options->lower_workgroup_access_to_offsets) {
nir_deref_var *deref = vtn_pointer_to_deref(b, ptr);
const struct glsl_type *deref_type = nir_deref_tail(&deref->deref)->type;
nir_intrinsic_op op = get_shared_nir_atomic_op(opcode);
nir_intrinsic_op op = get_var_nir_atomic_op(opcode);
atomic = nir_intrinsic_instr_create(b->nb.shader, op);
atomic->variables[0] = nir_deref_var_clone(deref, atomic);
@@ -2202,27 +2302,36 @@ vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, SpvOp opcode,
}
} else {
assert(ptr->mode == vtn_variable_mode_ssbo);
nir_ssa_def *offset, *index;
offset = vtn_pointer_to_offset(b, ptr, &index, NULL);
nir_intrinsic_op op = get_ssbo_nir_atomic_op(opcode);
nir_intrinsic_op op;
if (ptr->mode == vtn_variable_mode_ssbo) {
op = get_ssbo_nir_atomic_op(opcode);
} else {
assert(ptr->mode == vtn_variable_mode_workgroup &&
b->options->lower_workgroup_access_to_offsets);
op = get_shared_nir_atomic_op(opcode);
}
atomic = nir_intrinsic_instr_create(b->nb.shader, op);
int src = 0;
switch (opcode) {
case SpvOpAtomicLoad:
atomic->num_components = glsl_get_vector_elements(ptr->type->type);
atomic->src[0] = nir_src_for_ssa(index);
atomic->src[1] = nir_src_for_ssa(offset);
if (ptr->mode == vtn_variable_mode_ssbo)
atomic->src[src++] = nir_src_for_ssa(index);
atomic->src[src++] = nir_src_for_ssa(offset);
break;
case SpvOpAtomicStore:
atomic->num_components = glsl_get_vector_elements(ptr->type->type);
nir_intrinsic_set_write_mask(atomic, (1 << atomic->num_components) - 1);
atomic->src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[4])->def);
atomic->src[1] = nir_src_for_ssa(index);
atomic->src[2] = nir_src_for_ssa(offset);
atomic->src[src++] = nir_src_for_ssa(vtn_ssa_value(b, w[4])->def);
if (ptr->mode == vtn_variable_mode_ssbo)
atomic->src[src++] = nir_src_for_ssa(index);
atomic->src[src++] = nir_src_for_ssa(offset);
break;
case SpvOpAtomicExchange:
@@ -2239,9 +2348,10 @@ vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, SpvOp opcode,
case SpvOpAtomicAnd:
case SpvOpAtomicOr:
case SpvOpAtomicXor:
atomic->src[0] = nir_src_for_ssa(index);
atomic->src[1] = nir_src_for_ssa(offset);
fill_common_atomic_sources(b, opcode, w, &atomic->src[2]);
if (ptr->mode == vtn_variable_mode_ssbo)
atomic->src[src++] = nir_src_for_ssa(index);
atomic->src[src++] = nir_src_for_ssa(offset);
fill_common_atomic_sources(b, opcode, w, &atomic->src[src]);
break;
default:
@@ -2673,7 +2783,7 @@ stage_for_execution_model(SpvExecutionModel model)
}
#define spv_check_supported(name, cap) do { \
if (!(b->ext && b->ext->name)) \
if (!(b->options && b->options->caps.name)) \
vtn_warn("Unsupported SPIR-V capability: %s", \
spirv_capability_to_string(cap)); \
} while(0)
@@ -3314,8 +3424,8 @@ nir_function *
spirv_to_nir(const uint32_t *words, size_t word_count,
struct nir_spirv_specialization *spec, unsigned num_spec,
gl_shader_stage stage, const char *entry_point_name,
const struct nir_spirv_supported_extensions *ext,
const nir_shader_compiler_options *options)
const struct spirv_to_nir_options *options,
const nir_shader_compiler_options *nir_options)
{
const uint32_t *word_end = words + word_count;
@@ -3337,7 +3447,7 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
exec_list_make_empty(&b->functions);
b->entry_point_stage = stage;
b->entry_point_name = entry_point_name;
b->ext = ext;
b->options = options;
/* Handle all the preamble instructions */
words = vtn_foreach_instruction(b, words, word_end,
@@ -3349,7 +3459,7 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
return NULL;
}
b->shader = nir_shader_create(NULL, stage, options, NULL);
b->shader = nir_shader_create(NULL, stage, nir_options, NULL);
/* Set shader info defaults */
b->shader->info.gs.invocations = 1;
@@ -3367,13 +3477,22 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
vtn_build_cfg(b, words, word_end);
foreach_list_typed(struct vtn_function, func, node, &b->functions) {
b->impl = func->impl;
b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer,
_mesa_key_pointer_equal);
assert(b->entry_point->value_type == vtn_value_type_function);
b->entry_point->func->referenced = true;
vtn_function_emit(b, func, vtn_handle_body_instruction);
}
bool progress;
do {
progress = false;
foreach_list_typed(struct vtn_function, func, node, &b->functions) {
if (func->referenced && !func->emitted) {
b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer,
_mesa_key_pointer_equal);
vtn_function_emit(b, func, vtn_handle_body_instruction);
progress = true;
}
}
} while (progress);
assert(b->entry_point->value_type == vtn_value_type_function);
nir_function *entry_point = b->entry_point->func->impl->function;
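
The struct case of `vtn_type_layout_std430` in this file's diff assigns member offsets by aligning a running offset, recording it, and then advancing by the member's size. A minimal C sketch of that loop for scalar and vector members follows. All names (`example_member`, `example_layout_struct`, `example_align_u32`) are hypothetical, and the sketch only mirrors the rules visible in the hunk: a three-component vector is aligned like a four-component one, and the struct size is not rounded up to its alignment.

```c
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t
example_align_u32(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical description of one scalar or vector member. */
struct example_member {
   uint32_t comp_size;   /* bytes per component */
   uint32_t num_comps;   /* 1 for a scalar, 2..4 for a vector */
   uint32_t offset;      /* filled in by example_layout_struct() */
};

static void
example_layout_struct(struct example_member *members, unsigned count,
                      uint32_t *size_out, uint32_t *align_out)
{
   uint32_t offset = 0;
   uint32_t align = 0;

   for (unsigned i = 0; i < count; i++) {
      /* vec3 is aligned like vec4 */
      uint32_t align_comps =
         members[i].num_comps == 3 ? 4 : members[i].num_comps;
      uint32_t mem_align = members[i].comp_size * align_comps;
      uint32_t mem_size = members[i].comp_size * members[i].num_comps;

      offset = example_align_u32(offset, mem_align);
      members[i].offset = offset;
      offset += mem_size;
      if (mem_align > align)
         align = mem_align;
   }

   *size_out = offset;
   *align_out = align;
}
```

For a struct of `float, vec3, float, vec2` with 4-byte components, this produces offsets 0, 16, 28 and 32, a size of 40 and an alignment of 16.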

View File

@@ -606,7 +606,7 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head *cf_list,
if ((*block->branch & SpvOpCodeMask) == SpvOpReturnValue) {
struct vtn_ssa_value *src = vtn_ssa_value(b, block->branch[1]);
vtn_local_store(b, src,
nir_deref_var_create(b, b->impl->return_var));
nir_deref_var_create(b, b->nb.impl->return_var));
}
if (block->branch_type != vtn_branch_type_none) {
@@ -783,4 +783,6 @@ vtn_function_emit(struct vtn_builder *b, struct vtn_function *func,
*/
if (b->has_loop_continue)
nir_repair_ssa_impl(func->impl);
func->emitted = true;
}
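
Together with the `referenced` flag set in `vtn_handle_function_call` and on the entry point, the `emitted` flag set here drives a fixed-point loop in `spirv_to_nir`: functions are emitted only once they are known to be referenced, and emitting one may mark others. A minimal C sketch of that pattern follows. The `example_func` struct, its single `callee` field, and `example_emit_reachable` are hypothetical simplifications, not the vtn data structures.

```c
#include <stdbool.h>

/* Hypothetical function record; real functions can call many callees. */
struct example_func {
   bool referenced;
   bool emitted;
   int callee;   /* index of the function this one calls, or -1 */
};

/* Emit every function reachable from the entry point and return how
 * many were emitted. Unreferenced functions are never touched. */
static unsigned
example_emit_reachable(struct example_func *funcs, unsigned count,
                       unsigned entry)
{
   unsigned num_emitted = 0;
   bool progress;

   funcs[entry].referenced = true;

   do {
      progress = false;
      for (unsigned i = 0; i < count; i++) {
         if (funcs[i].referenced && !funcs[i].emitted) {
            /* "Emitting" a function discovers the functions it calls. */
            if (funcs[i].callee >= 0)
               funcs[funcs[i].callee].referenced = true;
            funcs[i].emitted = true;
            num_emitted++;
            progress = true;
         }
      }
   } while (progress);

   return num_emitted;
}
```

The outer loop repeats because a function late in the list can reference one that appears earlier, which the current pass has already skipped.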

View File

@@ -159,6 +159,9 @@ struct vtn_block {
struct vtn_function {
struct exec_node node;
bool referenced;
bool emitted;
nir_function_impl *impl;
struct vtn_block *start_block;
@@ -217,7 +220,10 @@ struct vtn_type {
/* The value that declares this type. Used for finding decorations */
struct vtn_value *val;
/* Specifies the length of complex types. */
/* Specifies the length of complex types.
*
* For Workgroup pointers, this is the size of the referenced type.
*/
unsigned length;
/* for arrays, matrices and pointers, the array stride */
@@ -268,6 +274,9 @@ struct vtn_type {
/* Storage class for pointers */
SpvStorageClass storage_class;
/* Required alignment for pointers */
uint32_t align;
};
/* Members for image types */
@@ -369,13 +378,6 @@ struct vtn_pointer {
struct nir_ssa_def *offset;
};
static inline bool
vtn_pointer_uses_ssa_offset(struct vtn_pointer *ptr)
{
return ptr->mode == vtn_variable_mode_ubo ||
ptr->mode == vtn_variable_mode_ssbo;
}
struct vtn_variable {
enum vtn_variable_mode mode;
@@ -389,6 +391,8 @@ struct vtn_variable {
nir_variable *var;
nir_variable **members;
int shared_location;
/**
* In some early released versions of GLSLang, it implemented all function
* calls by making copies of all parameters into temporary variables and
@@ -464,8 +468,7 @@ struct vtn_builder {
nir_builder nb;
nir_shader *shader;
nir_function_impl *impl;
const struct nir_spirv_supported_extensions *ext;
const struct spirv_to_nir_options *options;
struct vtn_block *block;
/* Current file, line, and column. Useful for debugging. Set
@@ -631,6 +634,13 @@ void vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
bool vtn_handle_glsl450_instruction(struct vtn_builder *b, uint32_t ext_opcode,
const uint32_t *words, unsigned count);
static inline uint32_t
vtn_align_u32(uint32_t v, uint32_t a)
{
assert(a != 0 && a == (a & -a));
return (v + a - 1) & ~(a - 1);
}
static inline uint64_t
vtn_u64_literal(const uint32_t *w)
{
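
The `vtn_align_u32` helper added in this file rounds a value up to a power-of-two alignment, and the workgroup path uses it to assign a shared-memory location to a variable on first use. A minimal C sketch of both steps follows. The names `example_align_u32`, `example_shared_var` and `example_assign_shared` are hypothetical; the sketch assumes, as the diff does, that a negative location means "not yet assigned".

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t
example_align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical shared variable: size and alignment in bytes. */
struct example_shared_var {
   int location;     /* negative until first use */
   uint32_t size;
   uint32_t align;
};

/* Assign a location lazily so that variables which are never used do
 * not consume shared address space. */
static uint32_t
example_assign_shared(struct example_shared_var *var, uint32_t *num_shared)
{
   if (var->location < 0) {
      *num_shared = example_align_u32(*num_shared, var->align);
      var->location = (int)*num_shared;
      *num_shared += var->size;
   }
   return (uint32_t)var->location;
}
```

Starting from an empty allocation, a 4-byte variable lands at offset 0 and a following 16-byte-aligned variable lands at offset 16; asking for the first variable again returns 0 without growing the allocation.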

View File

@@ -57,6 +57,27 @@ vtn_access_chain_extend(struct vtn_builder *b, struct vtn_access_chain *old,
return chain;
}
static bool
vtn_pointer_uses_ssa_offset(struct vtn_builder *b,
struct vtn_pointer *ptr)
{
return ptr->mode == vtn_variable_mode_ubo ||
ptr->mode == vtn_variable_mode_ssbo ||
(ptr->mode == vtn_variable_mode_workgroup &&
b->options->lower_workgroup_access_to_offsets);
}
static bool
vtn_pointer_is_external_block(struct vtn_builder *b,
struct vtn_pointer *ptr)
{
return ptr->mode == vtn_variable_mode_ssbo ||
ptr->mode == vtn_variable_mode_ubo ||
ptr->mode == vtn_variable_mode_push_constant ||
(ptr->mode == vtn_variable_mode_workgroup &&
b->options->lower_workgroup_access_to_offsets);
}
/* Dereference the given base pointer by the access chain */
static struct vtn_pointer *
vtn_access_chain_pointer_dereference(struct vtn_builder *b,
@@ -150,7 +171,8 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
/* We need ptr_type for the stride */
assert(base->ptr_type);
/* This must be a pointer to an actual element somewhere */
assert(block_index && offset);
assert(offset);
assert(block_index || base->mode == vtn_variable_mode_workgroup);
/* We need at least one element in the chain */
assert(deref_chain->length >= 1);
@@ -161,24 +183,49 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
idx++;
}
if (!block_index) {
if (!offset) {
/* This is the first access chain so we don't have a block index */
assert(!block_index);
assert(base->var);
if (glsl_type_is_array(type->type)) {
/* We need at least one element in the chain */
assert(deref_chain->length >= 1);
assert(base->ptr_type);
switch (base->mode) {
case vtn_variable_mode_ubo:
case vtn_variable_mode_ssbo:
if (glsl_type_is_array(type->type)) {
/* We need at least one element in the chain */
assert(deref_chain->length >= 1);
nir_ssa_def *desc_arr_idx =
vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
block_index = vtn_variable_resource_index(b, base->var, desc_arr_idx);
type = type->array_element;
idx++;
} else {
block_index = vtn_variable_resource_index(b, base->var, NULL);
nir_ssa_def *desc_arr_idx =
vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
block_index = vtn_variable_resource_index(b, base->var, desc_arr_idx);
type = type->array_element;
idx++;
} else {
block_index = vtn_variable_resource_index(b, base->var, NULL);
}
offset = nir_imm_int(&b->nb, 0);
break;
case vtn_variable_mode_workgroup:
/* Assign location on first use so that we don't end up bloating SLM
* address space for variables which are never statically used.
*/
if (base->var->shared_location < 0) {
assert(base->ptr_type->length > 0 && base->ptr_type->align > 0);
b->shader->num_shared = vtn_align_u32(b->shader->num_shared,
base->ptr_type->align);
base->var->shared_location = b->shader->num_shared;
b->shader->num_shared += base->ptr_type->length;
}
block_index = NULL;
offset = nir_imm_int(&b->nb, base->var->shared_location);
break;
default:
unreachable("Invalid offset pointer mode");
}
/* This is the first access chain so we also need an offset */
assert(!offset);
offset = nir_imm_int(&b->nb, 0);
}
assert(offset);
@@ -228,7 +275,7 @@ vtn_pointer_dereference(struct vtn_builder *b,
struct vtn_pointer *base,
struct vtn_access_chain *deref_chain)
{
if (vtn_pointer_uses_ssa_offset(base)) {
if (vtn_pointer_uses_ssa_offset(b, base)) {
return vtn_ssa_offset_pointer_dereference(b, base, deref_chain);
} else {
return vtn_access_chain_pointer_dereference(b, base, deref_chain);
@@ -478,77 +525,57 @@ vtn_local_store(struct vtn_builder *b, struct vtn_ssa_value *src,
}
}
static nir_ssa_def *
get_vulkan_resource_index(struct vtn_builder *b, struct vtn_pointer *ptr,
struct vtn_type **type, unsigned *chain_idx)
{
/* Push constants have no explicit binding */
if (ptr->mode == vtn_variable_mode_push_constant) {
*chain_idx = 0;
*type = ptr->var->type;
return NULL;
}
if (glsl_type_is_array(ptr->var->type->type)) {
assert(ptr->chain->length > 0);
nir_ssa_def *desc_array_index =
vtn_access_link_as_ssa(b, ptr->chain->link[0], 1);
*chain_idx = 1;
*type = ptr->var->type->array_element;
return vtn_variable_resource_index(b, ptr->var, desc_array_index);
} else {
*chain_idx = 0;
*type = ptr->var->type;
return vtn_variable_resource_index(b, ptr->var, NULL);
}
}
nir_ssa_def *
vtn_pointer_to_offset(struct vtn_builder *b, struct vtn_pointer *ptr,
nir_ssa_def **index_out, unsigned *end_idx_out)
{
if (ptr->offset) {
assert(ptr->block_index);
if (vtn_pointer_uses_ssa_offset(b, ptr)) {
if (!ptr->offset) {
assert(ptr->mode == vtn_variable_mode_workgroup);
struct vtn_access_chain chain = {
.length = 0,
};
ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
}
*index_out = ptr->block_index;
return ptr->offset;
}
assert(ptr->mode == vtn_variable_mode_push_constant);
*index_out = NULL;
unsigned idx = 0;
struct vtn_type *type;
*index_out = get_vulkan_resource_index(b, ptr, &type, &idx);
struct vtn_type *type = ptr->var->type;
nir_ssa_def *offset = nir_imm_int(&b->nb, 0);
if (ptr->chain) {
for (; idx < ptr->chain->length; idx++) {
enum glsl_base_type base_type = glsl_get_base_type(type->type);
switch (base_type) {
case GLSL_TYPE_UINT:
case GLSL_TYPE_INT:
case GLSL_TYPE_UINT64:
case GLSL_TYPE_INT64:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_BOOL:
case GLSL_TYPE_ARRAY:
offset = nir_iadd(&b->nb, offset,
vtn_access_link_as_ssa(b, ptr->chain->link[idx],
type->stride));
for (; idx < ptr->chain->length; idx++) {
enum glsl_base_type base_type = glsl_get_base_type(type->type);
switch (base_type) {
case GLSL_TYPE_UINT:
case GLSL_TYPE_INT:
case GLSL_TYPE_UINT64:
case GLSL_TYPE_INT64:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_BOOL:
case GLSL_TYPE_ARRAY:
offset = nir_iadd(&b->nb, offset,
vtn_access_link_as_ssa(b, ptr->chain->link[idx],
type->stride));
type = type->array_element;
break;
type = type->array_element;
break;
case GLSL_TYPE_STRUCT: {
assert(ptr->chain->link[idx].mode == vtn_access_mode_literal);
unsigned member = ptr->chain->link[idx].id;
offset = nir_iadd(&b->nb, offset,
nir_imm_int(&b->nb, type->offsets[member]));
type = type->members[member];
break;
}
case GLSL_TYPE_STRUCT: {
assert(ptr->chain->link[idx].mode == vtn_access_mode_literal);
unsigned member = ptr->chain->link[idx].id;
offset = nir_iadd(&b->nb, offset,
nir_imm_int(&b->nb, type->offsets[member]));
type = type->members[member];
break;
}
default:
unreachable("Invalid type for deref");
}
default:
unreachable("Invalid type for deref");
}
}
@@ -831,6 +858,9 @@ vtn_block_load(struct vtn_builder *b, struct vtn_pointer *src)
vtn_access_chain_get_offset_size(src->chain, src->var->type,
&access_offset, &access_size);
break;
case vtn_variable_mode_workgroup:
op = nir_intrinsic_load_shared;
break;
default:
unreachable("Invalid block variable mode");
}
@@ -850,22 +880,26 @@ static void
vtn_block_store(struct vtn_builder *b, struct vtn_ssa_value *src,
struct vtn_pointer *dst)
{
nir_intrinsic_op op;
switch (dst->mode) {
case vtn_variable_mode_ssbo:
op = nir_intrinsic_store_ssbo;
break;
case vtn_variable_mode_workgroup:
op = nir_intrinsic_store_shared;
break;
default:
unreachable("Invalid block variable mode");
}
nir_ssa_def *offset, *index = NULL;
unsigned chain_idx;
offset = vtn_pointer_to_offset(b, dst, &index, &chain_idx);
_vtn_block_load_store(b, nir_intrinsic_store_ssbo, false, index, offset,
_vtn_block_load_store(b, op, false, index, offset,
0, 0, dst->chain, chain_idx, dst->type, &src);
}
static bool
vtn_pointer_is_external_block(struct vtn_pointer *ptr)
{
return ptr->mode == vtn_variable_mode_ssbo ||
ptr->mode == vtn_variable_mode_ubo ||
ptr->mode == vtn_variable_mode_push_constant;
}
static void
_vtn_variable_load_store(struct vtn_builder *b, bool load,
struct vtn_pointer *ptr,
@@ -925,7 +959,7 @@ _vtn_variable_load_store(struct vtn_builder *b, bool load,
struct vtn_ssa_value *
vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src)
{
if (vtn_pointer_is_external_block(src)) {
if (vtn_pointer_is_external_block(b, src)) {
return vtn_block_load(b, src);
} else {
struct vtn_ssa_value *val = NULL;
@@ -938,8 +972,9 @@ void
vtn_variable_store(struct vtn_builder *b, struct vtn_ssa_value *src,
struct vtn_pointer *dest)
{
if (vtn_pointer_is_external_block(dest)) {
assert(dest->mode == vtn_variable_mode_ssbo);
if (vtn_pointer_is_external_block(b, dest)) {
assert(dest->mode == vtn_variable_mode_ssbo ||
dest->mode == vtn_variable_mode_workgroup);
vtn_block_store(b, src, dest);
} else {
_vtn_variable_load_store(b, false, dest, &src);
@@ -1494,11 +1529,9 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr)
assert(ptr->ptr_type);
assert(ptr->ptr_type->type);
if (ptr->offset && ptr->block_index) {
return nir_vec2(&b->nb, ptr->block_index, ptr->offset);
} else {
/* If we don't have an offset or block index, then we must be a pointer
* to the variable itself.
if (!ptr->offset) {
/* If we don't have an offset then we must be a pointer to the variable
* itself.
*/
assert(!ptr->offset && !ptr->block_index);
@@ -1508,8 +1541,20 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr)
*/
assert(ptr->var && ptr->var->type->base_type == vtn_base_type_struct);
return nir_vec2(&b->nb, vtn_variable_resource_index(b, ptr->var, NULL),
nir_imm_int(&b->nb, 0));
struct vtn_access_chain chain = {
.length = 0,
};
ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
}
assert(ptr->offset);
if (ptr->block_index) {
assert(ptr->mode == vtn_variable_mode_ubo ||
ptr->mode == vtn_variable_mode_ssbo);
return nir_vec2(&b->nb, ptr->block_index, ptr->offset);
} else {
assert(ptr->mode == vtn_variable_mode_workgroup);
return ptr->offset;
}
}
@@ -1517,7 +1562,7 @@ struct vtn_pointer *
vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
struct vtn_type *ptr_type)
{
assert(ssa->num_components == 2 && ssa->bit_size == 32);
assert(ssa->num_components <= 2 && ssa->bit_size == 32);
assert(ptr_type->base_type == vtn_base_type_pointer);
assert(ptr_type->deref->base_type != vtn_base_type_pointer);
/* This pointer type needs to have actual storage */
@@ -1528,8 +1573,19 @@ vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
ptr_type, NULL);
ptr->type = ptr_type->deref;
ptr->ptr_type = ptr_type;
ptr->block_index = nir_channel(&b->nb, ssa, 0);
ptr->offset = nir_channel(&b->nb, ssa, 1);
if (ssa->num_components > 1) {
assert(ssa->num_components == 2);
assert(ptr->mode == vtn_variable_mode_ubo ||
ptr->mode == vtn_variable_mode_ssbo);
ptr->block_index = nir_channel(&b->nb, ssa, 0);
ptr->offset = nir_channel(&b->nb, ssa, 1);
} else {
assert(ssa->num_components == 1);
assert(ptr->mode == vtn_variable_mode_workgroup);
ptr->block_index = NULL;
ptr->offset = ssa;
}
return ptr;
}
@@ -1601,7 +1657,6 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
case vtn_variable_mode_global:
case vtn_variable_mode_image:
case vtn_variable_mode_sampler:
case vtn_variable_mode_workgroup:
/* For these, we create the variable normally */
var->var = rzalloc(b->shader, nir_variable);
var->var->name = ralloc_strdup(var->var, val->name);
@@ -1619,6 +1674,18 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
}
break;
case vtn_variable_mode_workgroup:
if (b->options->lower_workgroup_access_to_offsets) {
var->shared_location = -1;
} else {
/* Create the variable normally */
var->var = rzalloc(b->shader, nir_variable);
var->var->name = ralloc_strdup(var->var, val->name);
var->var->type = var->type->type;
var->var->data.mode = nir_var_shared;
}
break;
case vtn_variable_mode_input:
case vtn_variable_mode_output: {
/* In order to know whether or not we're a per-vertex inout, we need
@@ -1733,7 +1800,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
if (var->mode == vtn_variable_mode_local) {
assert(var->members == NULL && var->var != NULL);
nir_function_impl_add_variable(b->impl, var->var);
nir_function_impl_add_variable(b->nb.impl, var->var);
} else if (var->var) {
nir_shader_add_variable(b->shader, var->var);
} else if (var->members) {
@@ -1743,9 +1810,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
nir_shader_add_variable(b->shader, var->members[i]);
}
} else {
assert(var->mode == vtn_variable_mode_ubo ||
var->mode == vtn_variable_mode_ssbo ||
var->mode == vtn_variable_mode_push_constant);
assert(vtn_pointer_is_external_block(b, val->pointer));
}
}
@@ -1870,15 +1935,19 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
const uint32_t offset = ptr->var->type->offsets[w[4]];
const uint32_t stride = ptr->var->type->members[w[4]]->stride;
unsigned chain_idx;
struct vtn_type *type;
nir_ssa_def *index =
get_vulkan_resource_index(b, ptr, &type, &chain_idx);
if (!ptr->block_index) {
assert(ptr->mode == vtn_variable_mode_workgroup);
struct vtn_access_chain chain = {
.length = 0,
};
ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
assert(ptr->block_index);
}
nir_intrinsic_instr *instr =
nir_intrinsic_instr_create(b->nb.shader,
nir_intrinsic_get_buffer_size);
instr->src[0] = nir_src_for_ssa(index);
instr->src[0] = nir_src_for_ssa(ptr->block_index);
nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
nir_builder_instr_insert(&b->nb, &instr->instr);
nir_ssa_def *buf_size = &instr->dest.ssa;
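
Reviewer note on the workgroup path above: when lower_workgroup_access_to_offsets is set, a shared variable gets its byte offset only on first use, so variables that are never statically used take no space in the shared address range. The following is a minimal standalone sketch of that bookkeeping, not the Mesa code itself. The names shared_var, align_u32 and assign_shared_location are hypothetical stand-ins for vtn_variable, vtn_align_u32 and the inline logic in vtn_ssa_offset_pointer_dereference, and the sketch assumes power-of-two alignments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the patch touches. */
struct shared_var {
   int shared_location;   /* -1 until the variable is first used */
   uint32_t length;       /* size in bytes */
   uint32_t align;        /* alignment in bytes, assumed power of two */
};

/* Round v up to the next multiple of a (a must be a power of two). */
static uint32_t
align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Return the variable's offset, allocating it on first use.
 * *num_shared is the running size of the shared region. */
static uint32_t
assign_shared_location(uint32_t *num_shared, struct shared_var *var)
{
   if (var->shared_location < 0) {
      assert(var->length > 0 && var->align > 0);
      *num_shared = align_u32(*num_shared, var->align);
      var->shared_location = (int)*num_shared;
      *num_shared += var->length;
   }
   return (uint32_t)var->shared_location;
}
```

Calling assign_shared_location twice on the same variable returns the same offset and does not grow the region, which is the property the "assign location on first use" comment relies on.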


@@ -41,6 +41,7 @@ LOCAL_SRC_FILES := \
LOCAL_CFLAGS := \
-D_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_ANDROID \
-D_EGL_BUILT_IN_DRIVER_DRI2 \
-DHAS_GRALLOC_DRM_HEADERS \
-DHAVE_ANDROID_PLATFORM
LOCAL_C_INCLUDES := \


@@ -46,6 +46,7 @@ libEGL_common_la_SOURCES = \
$(LIBEGL_C_FILES)
libEGL_common_la_LIBADD = \
$(top_builddir)/src/mapi/shared-glapi/libglapi.la \
$(top_builddir)/src/util/libmesautil.la \
$(EGL_LIB_DEPS)
@@ -104,7 +105,9 @@ endif
if HAVE_PLATFORM_ANDROID
AM_CFLAGS += $(ANDROID_CFLAGS)
libEGL_common_la_LIBADD += $(ANDROID_LIBS)
dri2_backend_FILES += drivers/dri2/platform_android.c
dri2_backend_FILES += \
drivers/dri2/platform_android.c \
drivers/dri2/egl_dri2_drm_gralloc.h
endif
AM_CFLAGS += \
@@ -164,9 +167,7 @@ libEGL_mesa_la_SOURCES = \
main/egldispatchstubs.c \
g_egldispatchstubs.c \
g_egldispatchstubs.h
libEGL_mesa_la_LIBADD = \
libEGL_common.la \
$(top_builddir)/src/mapi/shared-glapi/libglapi.la
libEGL_mesa_la_LIBADD = libEGL_common.la
libEGL_mesa_la_LDFLAGS = \
-no-undefined \
-version-number 0 \
@@ -178,9 +179,7 @@ else # USE_LIBGLVND
lib_LTLIBRARIES = libEGL.la
libEGL_la_SOURCES =
libEGL_la_LIBADD = \
libEGL_common.la \
$(top_builddir)/src/mapi/shared-glapi/libglapi.la
libEGL_la_LIBADD = libEGL_common.la
libEGL_la_LDFLAGS = \
-no-undefined \
-version-number 1:0 \


@@ -299,7 +299,10 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
_eglSetConfigKey(&base, EGL_MAX_PBUFFER_HEIGHT,
_EGL_MAX_PBUFFER_HEIGHT);
break;
case __DRI_ATTRIB_MUTABLE_RENDER_BUFFER:
if (disp->Extensions.KHR_mutable_render_buffer)
surface_type |= EGL_MUTABLE_RENDER_BUFFER_BIT_KHR;
break;
default:
key = dri2_to_egl_attribute_map[attrib];
if (key != 0)
@@ -457,6 +460,7 @@ static const struct dri2_extension_match optional_core_extensions[] = {
{ __DRI2_RENDERER_QUERY, 1, offsetof(struct dri2_egl_display, rendererQuery) },
{ __DRI2_INTEROP, 1, offsetof(struct dri2_egl_display, interop) },
{ __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) },
{ __DRI_MUTABLE_RENDER_BUFFER_DRIVER, 1, offsetof(struct dri2_egl_display, mutable_render_buffer) },
{ NULL, 0, 0 }
};
@@ -973,7 +977,7 @@ dri2_display_destroy(_EGLDisplay *disp)
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
if (dri2_dpy->own_dri_screen) {
if (dri2_dpy->vtbl && dri2_dpy->vtbl->close_screen_notify)
if (dri2_dpy->vtbl->close_screen_notify)
dri2_dpy->vtbl->close_screen_notify(disp);
dri2_dpy->core->destroyScreen(dri2_dpy->dri_screen);
}
@@ -1325,12 +1329,6 @@ dri2_create_context(_EGLDriver *drv, _EGLDisplay *disp, _EGLConfig *conf,
dri_config = dri2_config->dri_config[1][0];
else
dri_config = dri2_config->dri_config[0][0];
/* EGL_WINDOW_BIT is set only when there is a double-buffered dri_config.
* This makes sure the back buffer will always be used.
*/
if (conf->SurfaceType & EGL_WINDOW_BIT)
dri2_ctx->base.WindowRenderBuffer = EGL_BACK_BUFFER;
}
else
dri_config = NULL;
@@ -1521,6 +1519,8 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx);
_EGLDisplay *old_disp = NULL;
struct dri2_egl_display *old_dri2_dpy = NULL;
_EGLContext *old_ctx;
_EGLSurface *old_dsurf, *old_rsurf;
_EGLSurface *tmp_dsurf, *tmp_rsurf;
@@ -1537,6 +1537,11 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
return EGL_FALSE;
}
if (old_ctx) {
old_disp = old_ctx->Resource.Display;
old_dri2_dpy = dri2_egl_display(old_disp);
}
/* flush before context switch */
if (old_ctx)
dri2_gl_flush();
@@ -1550,31 +1555,30 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
if (old_dsurf)
dri2_surf_update_fence_fd(old_ctx, disp, old_dsurf);
/* Disable shared buffer mode */
if (old_dsurf && _eglSurfaceInSharedBufferMode(old_dsurf) &&
old_dri2_dpy->vtbl->set_shared_buffer_mode) {
old_dri2_dpy->vtbl->set_shared_buffer_mode(old_disp, old_dsurf, false);
}
dri2_dpy->core->unbindContext(old_cctx);
}
unbind = (cctx == NULL && ddraw == NULL && rdraw == NULL);
if (unbind || dri2_dpy->core->bindContext(cctx, ddraw, rdraw)) {
dri2_destroy_surface(drv, disp, old_dsurf);
dri2_destroy_surface(drv, disp, old_rsurf);
if (!unbind)
dri2_dpy->ref_count++;
if (old_ctx) {
EGLDisplay old_disp = _eglGetDisplayHandle(old_ctx->Resource.Display);
dri2_destroy_context(drv, disp, old_ctx);
dri2_display_release(old_disp);
}
return EGL_TRUE;
} else {
if (!unbind && !dri2_dpy->core->bindContext(cctx, ddraw, rdraw)) {
/* undo the previous _eglBindContext */
_eglBindContext(old_ctx, old_dsurf, old_rsurf, &ctx, &tmp_dsurf, &tmp_rsurf);
assert(&dri2_ctx->base == ctx &&
tmp_dsurf == dsurf &&
tmp_rsurf == rsurf);
if (old_dsurf && _eglSurfaceInSharedBufferMode(old_dsurf) &&
old_dri2_dpy->vtbl->set_shared_buffer_mode) {
old_dri2_dpy->vtbl->set_shared_buffer_mode(old_disp, old_dsurf, true);
}
_eglPutSurface(dsurf);
_eglPutSurface(rsurf);
_eglPutContext(ctx);
@@ -1589,6 +1593,31 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
*/
return _eglError(EGL_BAD_MATCH, "eglMakeCurrent");
}
dri2_destroy_surface(drv, disp, old_dsurf);
dri2_destroy_surface(drv, disp, old_rsurf);
if (!unbind)
dri2_dpy->ref_count++;
if (old_ctx) {
dri2_destroy_context(drv, disp, old_ctx);
dri2_display_release(old_disp);
}
if (dsurf && _eglSurfaceHasMutableRenderBuffer(dsurf) &&
dri2_dpy->vtbl->set_shared_buffer_mode) {
/* Always update the shared buffer mode. This is obviously needed when
* the active EGL_RENDER_BUFFER is EGL_SINGLE_BUFFER. When
* EGL_RENDER_BUFFER is EGL_BACK_BUFFER, the update protects us in the
* case where external non-EGL API may have changed window's shared
* buffer mode since we last saw it.
*/
bool mode = (dsurf->ActiveRenderBuffer == EGL_SINGLE_BUFFER);
dri2_dpy->vtbl->set_shared_buffer_mode(disp, dsurf, mode);
}
return EGL_TRUE;
}
__DRIdrawable *
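
Reviewer note on the dri2_make_current() changes above: shared buffer mode is turned off on the outgoing draw surface before the old context is unbound, turned back on if binding the new context fails, and finally set on the incoming draw surface from its active render buffer. The sketch below models only that ordering. It is a simplified illustration under assumptions: the struct, the enum and the bind_ok flag are hypothetical, and in_shared_buffer_mode() assumes that _eglSurfaceInSharedBufferMode() means "has a mutable render buffer and the active buffer is EGL_SINGLE_BUFFER", which this diff does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum render_buffer { BACK_BUFFER, SINGLE_BUFFER };

/* Hypothetical, reduced view of an EGL window surface. */
struct surface {
   bool has_mutable_render_buffer;
   enum render_buffer active;
   bool window_shared_mode;   /* mode last pushed to the native window */
};

/* Assumed meaning of _eglSurfaceInSharedBufferMode(). */
static bool
in_shared_buffer_mode(const struct surface *s)
{
   return s != NULL && s->has_mutable_render_buffer &&
          s->active == SINGLE_BUFFER;
}

/* bind_ok simulates the result of core->bindContext(). */
static bool
make_current(struct surface *old_draw, struct surface *new_draw, bool bind_ok)
{
   /* Disable shared buffer mode on the surface being unbound. */
   if (in_shared_buffer_mode(old_draw))
      old_draw->window_shared_mode = false;

   if (!bind_ok) {
      /* Roll back: the old surface stays current, so restore its mode. */
      if (in_shared_buffer_mode(old_draw))
         old_draw->window_shared_mode = true;
      return false;
   }

   /* Always refresh the new surface's mode, even for BACK_BUFFER. */
   if (new_draw != NULL && new_draw->has_mutable_render_buffer)
      new_draw->window_shared_mode = (new_draw->active == SINGLE_BUFFER);
   return true;
}
```

The unconditional refresh in the last step mirrors the comment in the patch: a non-EGL API may have changed the window's shared buffer mode since EGL last saw it.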


@@ -44,7 +44,7 @@
#ifdef HAVE_WAYLAND_PLATFORM
#include <wayland-client.h>
#include "wayland/wayland-egl/wayland-egl-backend.h"
#include "wayland-egl-backend.h"
/* forward declarations of protocol elements */
struct zwp_linux_dmabuf_v1;
#endif
@@ -61,7 +61,7 @@ struct zwp_linux_dmabuf_v1;
#include <system/window.h>
#include <hardware/gralloc.h>
#include <gralloc_drm_handle.h>
#include "platform_android_gralloc_drm.h"
#endif /* HAVE_ANDROID_PLATFORM */
@@ -147,6 +147,12 @@ struct dri2_egl_display_vtbl {
__DRIdrawable *(*get_dri_drawable)(_EGLSurface *surf);
void (*close_screen_notify)(_EGLDisplay *dpy);
/* Used in EGL_KHR_mutable_render_buffer to update the native window's
* shared buffer mode.
*/
bool (*set_shared_buffer_mode)(_EGLDisplay *dpy, _EGLSurface *surf,
bool mode);
};
struct dri2_egl_display
@@ -172,6 +178,7 @@ struct dri2_egl_display
const __DRI2fenceExtension *fence;
const __DRI2rendererQueryExtension *rendererQuery;
const __DRI2interopExtension *interop;
const __DRImutableRenderBufferDriverExtension *mutable_render_buffer;
int fd;
/* dri2_initialize/dri2_terminate increment/decrement this count, so does


@@ -37,7 +37,7 @@
#include "loader.h"
#include "egl_dri2.h"
#include "egl_dri2_fallbacks.h"
#include "gralloc_drm.h"
#include "platform_android_gralloc_drm.h"
#define ALIGN(val, align) (((val) + (align) - 1) & ~((align) - 1))
@@ -59,6 +59,10 @@ static const struct droid_yuv_format droid_yuv_formats[] = {
{ HAL_PIXEL_FORMAT_YCbCr_420_888, 0, 1, __DRI_IMAGE_FOURCC_YUV420 },
{ HAL_PIXEL_FORMAT_YCbCr_420_888, 1, 1, __DRI_IMAGE_FOURCC_YVU420 },
{ HAL_PIXEL_FORMAT_YV12, 1, 1, __DRI_IMAGE_FOURCC_YVU420 },
/* HACK: See droid_create_image_from_prime_fd() and b/32077885. */
{ HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED, 0, 2, __DRI_IMAGE_FOURCC_NV12 },
{ HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED, 0, 1, __DRI_IMAGE_FOURCC_YUV420 },
{ HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED, 1, 1, __DRI_IMAGE_FOURCC_YVU420 },
};
static int
@@ -90,6 +94,11 @@ get_format_bpp(int native)
switch (native) {
case HAL_PIXEL_FORMAT_RGBA_8888:
case HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED:
/*
* HACK: Hardcode this to RGBX_8888 as per cros_gralloc hack.
* TODO: Remove this once b/32077885 is fixed.
*/
case HAL_PIXEL_FORMAT_RGBX_8888:
case HAL_PIXEL_FORMAT_BGRA_8888:
bpp = 4;
@@ -112,6 +121,11 @@ static int get_fourcc(int native)
case HAL_PIXEL_FORMAT_RGB_565: return __DRI_IMAGE_FOURCC_RGB565;
case HAL_PIXEL_FORMAT_BGRA_8888: return __DRI_IMAGE_FOURCC_ARGB8888;
case HAL_PIXEL_FORMAT_RGBA_8888: return __DRI_IMAGE_FOURCC_ABGR8888;
case HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED:
/*
* HACK: Hardcode this to RGBX_8888 as per cros_gralloc hack.
* TODO: Remove this once b/32077885 is fixed.
*/
case HAL_PIXEL_FORMAT_RGBX_8888: return __DRI_IMAGE_FOURCC_XBGR8888;
default:
_eglLog(_EGL_WARNING, "unsupported native buffer format 0x%x", native);
@@ -125,6 +139,11 @@ static int get_format(int format)
case HAL_PIXEL_FORMAT_BGRA_8888: return __DRI_IMAGE_FORMAT_ARGB8888;
case HAL_PIXEL_FORMAT_RGB_565: return __DRI_IMAGE_FORMAT_RGB565;
case HAL_PIXEL_FORMAT_RGBA_8888: return __DRI_IMAGE_FORMAT_ABGR8888;
case HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED:
/*
* HACK: Hardcode this to RGBX_8888 as per cros_gralloc hack.
* TODO: Revert this once b/32077885 is fixed.
*/
case HAL_PIXEL_FORMAT_RGBX_8888: return __DRI_IMAGE_FORMAT_XBGR8888;
default:
_eglLog(_EGL_WARNING, "unsupported native buffer format 0x%x", format);
@@ -273,6 +292,32 @@ droid_window_cancel_buffer(struct dri2_egl_surface *dri2_surf)
}
}
static bool
droid_set_shared_buffer_mode(_EGLDisplay *disp, _EGLSurface *surf, bool mode)
{
#if __ANDROID_API__ >= 24
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
struct ANativeWindow *window = dri2_surf->window;
assert(surf->Type == EGL_WINDOW_BIT);
assert(_eglSurfaceHasMutableRenderBuffer(&dri2_surf->base));
_eglLog(_EGL_DEBUG, "%s: mode=%d", __func__, mode);
if (native_window_set_shared_buffer_mode(window, mode)) {
_eglLog(_EGL_WARNING, "failed native_window_set_shared_buffer_mode"
"(window=%p, mode=%d)", window, mode);
return false;
}
return true;
#else
_eglLog(_EGL_FATAL, "%s:%d: internal error: unreachable", __FILE__, __LINE__);
return false;
#endif
}
static _EGLSurface *
droid_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
_EGLConfig *conf, void *native_window,
@@ -547,6 +592,21 @@ droid_image_get_buffers(__DRIdrawable *driDrawable,
if (update_buffers(dri2_surf) < 0)
return 0;
if (_eglSurfaceInSharedBufferMode(&dri2_surf->base)) {
if (get_back_bo(dri2_surf) < 0)
return 0;
/* We have dri_image_back because this is a window surface and
* get_back_bo() succeeded.
*/
assert(dri2_surf->dri_image_back);
images->back = dri2_surf->dri_image_back;
images->image_mask |= __DRI_IMAGE_BUFFER_SHARED;
/* There exists no accompanying back nor front buffer. */
return 1;
}
if (buffer_mask & __DRI_IMAGE_BUFFER_FRONT) {
if (get_front_bo(dri2_surf, format) < 0)
return 0;
@@ -593,6 +653,21 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
if (dri2_surf->base.Type != EGL_WINDOW_BIT)
return EGL_TRUE;
const bool has_mutable_rb = _eglSurfaceHasMutableRenderBuffer(draw);
/* From the EGL_KHR_mutable_render_buffer spec (v12):
*
* If surface is a single-buffered window, pixmap, or pbuffer surface
* for which there is no pending change to the EGL_RENDER_BUFFER
* attribute, eglSwapBuffers has no effect.
*/
if (has_mutable_rb &&
draw->RequestedRenderBuffer == EGL_SINGLE_BUFFER &&
draw->ActiveRenderBuffer == EGL_SINGLE_BUFFER) {
_eglLog(_EGL_DEBUG, "%s: remain in shared buffer mode", __func__);
return EGL_TRUE;
}
for (int i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
if (dri2_surf->color_buffers[i].age > 0)
dri2_surf->color_buffers[i].age++;
@@ -617,6 +692,18 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
dri2_dpy->flush->invalidate(dri2_surf->dri_drawable);
/* Update the shared buffer mode */
if (has_mutable_rb &&
draw->ActiveRenderBuffer != draw->RequestedRenderBuffer) {
bool mode = (draw->RequestedRenderBuffer == EGL_SINGLE_BUFFER);
_eglLog(_EGL_DEBUG, "%s: change to shared buffer mode %d",
__func__, mode);
if (!droid_set_shared_buffer_mode(disp, draw, mode))
return EGL_FALSE;
draw->ActiveRenderBuffer = draw->RequestedRenderBuffer;
}
return EGL_TRUE;
}
@@ -678,6 +765,10 @@ droid_create_image_from_prime_fd_yuv(_EGLDisplay *disp, _EGLContext *ctx,
ret = dri2_dpy->gralloc->lock_ycbcr(dri2_dpy->gralloc, buf->handle,
0, 0, 0, 0, 0, &ycbcr);
if (ret) {
/* HACK: See droid_create_image_from_prime_fd() and b/32077885. */
if (buf->format == HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED)
return NULL;
_eglLog(_EGL_WARNING, "gralloc->lock_ycbcr failed: %d", ret);
return NULL;
}
@@ -757,8 +848,20 @@ droid_create_image_from_prime_fd(_EGLDisplay *disp, _EGLContext *ctx,
{
unsigned int pitch;
if (is_yuv(buf->format))
return droid_create_image_from_prime_fd_yuv(disp, ctx, buf, fd);
if (is_yuv(buf->format)) {
_EGLImage *image;
image = droid_create_image_from_prime_fd_yuv(disp, ctx, buf, fd);
/*
* HACK: b/32077885
* There is no API available to properly query the IMPLEMENTATION_DEFINED
* format. As a workaround we rely here on gralloc allocating either
* an arbitrary YCbCr 4:2:0 or RGBX_8888, with the latter being recognized
* by lock_ycbcr failing.
*/
if (image || buf->format != HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED)
return image;
}
const int fourcc = get_fourcc(buf->format);
if (fourcc == -1) {
@@ -1005,7 +1108,6 @@ droid_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *dpy)
{ HAL_PIXEL_FORMAT_RGBA_8888, { 0x000000ff, 0x0000ff00, 0x00ff0000, 0xff000000 } },
{ HAL_PIXEL_FORMAT_RGBX_8888, { 0x000000ff, 0x0000ff00, 0x00ff0000, 0x00000000 } },
{ HAL_PIXEL_FORMAT_RGB_565, { 0x0000f800, 0x000007e0, 0x0000001f, 0x00000000 } },
{ HAL_PIXEL_FORMAT_BGRA_8888, { 0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000 } },
};
unsigned int format_count[ARRAY_SIZE(visuals)] = { 0 };
@@ -1073,7 +1175,7 @@ droid_open_device(struct dri2_egl_display *dri2_dpy)
GRALLOC_MODULE_PERFORM_GET_DRM_FD,
&fd);
if (err || fd < 0) {
_eglLog(_EGL_WARNING, "fail to get drm fd");
_eglLog(_EGL_DEBUG, "fail to get drm fd");
fd = -1;
}
@@ -1102,6 +1204,7 @@ static const struct dri2_egl_display_vtbl droid_display_vtbl = {
.create_wayland_buffer_from_image = dri2_fallback_create_wayland_buffer_from_image,
.get_sync_values = dri2_fallback_get_sync_values,
.get_dri_drawable = dri2_surface_get_dri_drawable,
.set_shared_buffer_mode = droid_set_shared_buffer_mode,
};
static const __DRIdri2LoaderExtension droid_dri2_loader_extension = {
@@ -1121,10 +1224,89 @@ static const __DRIimageLoaderExtension droid_image_loader_extension = {
.getCapability = droid_get_capability,
};
static void
droid_display_shared_buffer(__DRIdrawable *driDrawable, int fence_fd,
void *loaderPrivate)
{
struct dri2_egl_surface *dri2_surf = loaderPrivate;
struct ANativeWindowBuffer *old_buffer UNUSED = dri2_surf->buffer;
if (!_eglSurfaceInSharedBufferMode(&dri2_surf->base)) {
_eglLog(_EGL_WARNING, "%s: internal error: buffer is not shared",
__func__);
return;
}
if (fence_fd >= 0) {
/* The driver's fence is more recent than the surface's out fence, if it
* exists at all. So use the driver's fence.
*/
if (dri2_surf->out_fence_fd >= 0) {
close(dri2_surf->out_fence_fd);
dri2_surf->out_fence_fd = -1;
}
} else if (dri2_surf->out_fence_fd >= 0) {
fence_fd = dri2_surf->out_fence_fd;
dri2_surf->out_fence_fd = -1;
}
if (dri2_surf->window->queueBuffer(dri2_surf->window, dri2_surf->buffer,
fence_fd)) {
_eglLog(_EGL_WARNING, "%s: ANativeWindow::queueBuffer failed", __func__);
close(fence_fd);
return;
}
fence_fd = -1;
if (dri2_surf->window->dequeueBuffer(dri2_surf->window, &dri2_surf->buffer,
&fence_fd)) {
/* Tear down the surface because it no longer has a back buffer. */
struct dri2_egl_display *dri2_dpy =
dri2_egl_display(dri2_surf->base.Resource.Display);
_eglLog(_EGL_WARNING, "%s: ANativeWindow::dequeueBuffer failed", __func__);
dri2_surf->base.Lost = true;
dri2_surf->buffer = NULL;
dri2_surf->back = NULL;
if (dri2_surf->dri_image_back) {
dri2_dpy->image->destroyImage(dri2_surf->dri_image_back);
dri2_surf->dri_image_back = NULL;
}
dri2_dpy->flush->invalidate(dri2_surf->dri_drawable);
return;
}
if (fence_fd < 0)
return;
/* Access to the buffer is controlled by a sync fence. Block on it.
*
* Ideally, we would submit the fence to the driver, and the driver would
* postpone command execution until it signalled. But DRI lacks API for
* that (as of 2018-04-11).
*
* SYNC_IOC_WAIT waits forever if timeout < 0
*/
sync_wait(fence_fd, -1);
close(fence_fd);
}
static const __DRImutableRenderBufferLoaderExtension droid_mutable_render_buffer_extension = {
.base = { __DRI_MUTABLE_RENDER_BUFFER_LOADER, 1 },
.displaySharedBuffer = droid_display_shared_buffer,
};
static const __DRIextension *droid_dri2_loader_extensions[] = {
&droid_dri2_loader_extension.base,
&image_lookup_extension.base,
&use_invalidate.base,
/* No __DRI_MUTABLE_RENDER_BUFFER_LOADER because it requires
* __DRI_IMAGE_LOADER.
*/
NULL,
};
@@ -1132,20 +1314,89 @@ static const __DRIextension *droid_image_loader_extensions[] = {
&droid_image_loader_extension.base,
&image_lookup_extension.base,
&use_invalidate.base,
&droid_mutable_render_buffer_extension.base,
NULL,
};
static bool
droid_probe_device(_EGLDisplay *dpy, bool swrast)
{
struct dri2_egl_display *dri2_dpy = dpy->DriverData;
bool loaded;
dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == DRM_NODE_RENDER;
if (!dri2_dpy->is_render_node && !gralloc_supports_gem_names()) {
_eglLog(_EGL_WARNING, "DRI2: control nodes not supported without GEM name support in gralloc\n");
return false;
}
if (swrast)
dri2_dpy->driver_name = strdup("kms_swrast");
else
dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd);
if (dri2_dpy->driver_name == NULL) {
_eglLog(_EGL_WARNING, "DRI2: failed to get driver name");
return false;
}
/* render nodes cannot use Gem names, and thus do not support
* the __DRI_DRI2_LOADER extension */
if (!dri2_dpy->is_render_node) {
dri2_dpy->loader_extensions = droid_dri2_loader_extensions;
loaded = dri2_load_driver(dpy);
} else {
dri2_dpy->loader_extensions = droid_image_loader_extensions;
loaded = dri2_load_driver_dri3(dpy);
}
if (!loaded) {
_eglLog(_EGL_WARNING, "DRI2: failed to load driver");
free(dri2_dpy->driver_name);
dri2_dpy->driver_name = NULL;
return false;
}
return true;
}
static bool
droid_probe_devices(_EGLDisplay *dpy, bool swrast)
{
struct dri2_egl_display *dri2_dpy = dpy->DriverData;
const char *name_template = "%s/renderD%d";
const int base = 128;
const int limit = 64;
int minor;
for (minor = base; minor < base + limit; ++minor) {
char *card_path;
if (asprintf(&card_path, name_template, DRM_DIR_NAME, minor) < 0)
continue;
dri2_dpy->fd = loader_open_device(card_path);
free(card_path);
if (dri2_dpy->fd < 0)
continue;
if (droid_probe_device(dpy, swrast))
return true;
close(dri2_dpy->fd);
dri2_dpy->fd = -1;
}
return false;
}
EGLBoolean
dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp)
dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *dpy)
{
struct dri2_egl_display *dri2_dpy;
const char *err;
int ret;
/* Not supported yet */
if (disp->Options.UseFallback)
return EGL_FALSE;
loader_set_logger(_eglLog);
dri2_dpy = calloc(1, sizeof(*dri2_dpy));
@@ -1160,63 +1411,55 @@ dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp)
goto cleanup;
}
disp->DriverData = (void *) dri2_dpy;
dpy->DriverData = (void *) dri2_dpy;
dri2_dpy->fd = droid_open_device(dri2_dpy);
if (dri2_dpy->fd < 0) {
err = "DRI2: failed to open device";
if (dri2_dpy->fd >= 0 &&
!droid_probe_device(dpy, dpy->Options.UseFallback)) {
_eglLog(_EGL_WARNING, "DRI2: Failed to load %s driver",
dpy->Options.UseFallback ? "software" : "hardware");
goto cleanup;
} else if (!droid_probe_devices(dpy, dpy->Options.UseFallback)) {
_eglLog(_EGL_WARNING, "DRI2: Failed to load %s driver",
dpy->Options.UseFallback ? "software" : "hardware");
goto cleanup;
}
dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd);
if (dri2_dpy->driver_name == NULL) {
err = "DRI2: failed to get driver name";
goto cleanup;
}
dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == DRM_NODE_RENDER;
/* render nodes cannot use Gem names, and thus do not support
* the __DRI_DRI2_LOADER extension */
if (!dri2_dpy->is_render_node) {
dri2_dpy->loader_extensions = droid_dri2_loader_extensions;
if (!dri2_load_driver(disp)) {
err = "DRI2: failed to load driver";
goto cleanup;
}
} else {
dri2_dpy->loader_extensions = droid_image_loader_extensions;
if (!dri2_load_driver_dri3(disp)) {
err = "DRI3: failed to load driver";
goto cleanup;
}
}
if (!dri2_create_screen(disp)) {
if (!dri2_create_screen(dpy)) {
err = "DRI2: failed to create screen";
goto cleanup;
}
if (!dri2_setup_extensions(disp)) {
if (!dri2_setup_extensions(dpy)) {
err = "DRI2: failed to setup extensions";
goto cleanup;
}
dri2_setup_screen(disp);
dri2_setup_screen(dpy);
if (!droid_add_configs_for_visuals(drv, disp)) {
dpy->Extensions.ANDROID_framebuffer_target = EGL_TRUE;
dpy->Extensions.ANDROID_image_native_buffer = EGL_TRUE;
dpy->Extensions.ANDROID_recordable = EGL_TRUE;
dpy->Extensions.EXT_buffer_age = EGL_TRUE;
#if ANDROID_API_LEVEL >= 23
dpy->Extensions.KHR_partial_update = EGL_TRUE;
#endif
dpy->Extensions.KHR_image = EGL_TRUE;
#if __ANDROID_API__ >= 24
if (dri2_dpy->mutable_render_buffer &&
dri2_dpy->loader_extensions == droid_image_loader_extensions) {
dpy->Extensions.KHR_mutable_render_buffer = EGL_TRUE;
}
#endif
/* Create configs *after* enabling extensions because presence of DRI
* driver extensions can affect the capabilities of EGLConfigs.
*/
if (!droid_add_configs_for_visuals(drv, dpy)) {
err = "DRI2: failed to add configs";
goto cleanup;
}
disp->Extensions.ANDROID_framebuffer_target = EGL_TRUE;
disp->Extensions.ANDROID_image_native_buffer = EGL_TRUE;
disp->Extensions.ANDROID_recordable = EGL_TRUE;
disp->Extensions.EXT_buffer_age = EGL_TRUE;
#if ANDROID_API_LEVEL >= 23
disp->Extensions.KHR_partial_update = EGL_TRUE;
#endif
/* Fill vtbl last to prevent accidentally calling virtual function during
* initialization.
*/
@@ -1225,6 +1468,6 @@ dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp)
return EGL_TRUE;
cleanup:
dri2_display_destroy(disp);
dri2_display_destroy(dpy);
return _eglError(EGL_NOT_INITIALIZED, err);
}


@@ -0,0 +1,45 @@
/*
* Copyright 2016 Google Inc. All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#pragma once
#ifdef HAS_GRALLOC_DRM_HEADERS
#include <gralloc_drm.h>
#include <gralloc_drm_handle.h>
static inline bool gralloc_supports_gem_names(void) { return true; }
#else
#define GRALLOC_MODULE_PERFORM_GET_DRM_FD 0x0FD4DEAD
static inline int gralloc_drm_get_gem_handle(buffer_handle_t handle)
{
return 0; /* Not supported, return invalid handle. */
}
static inline bool gralloc_supports_gem_names(void) { return false; }
#endif


@@ -1227,8 +1227,8 @@ dri2_wl_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *disp)
int has_format;
unsigned int rgba_masks[4];
} visuals[] = {
{ "XRGB8888", HAS_XRGB8888, { 0xff0000, 0xff00, 0x00ff, 0 } },
{ "ARGB8888", HAS_ARGB8888, { 0xff0000, 0xff00, 0x00ff, 0xff000000 } },
{ "XRGB8888", HAS_XRGB8888, { 0xff0000, 0xff00, 0x00ff, 0xff000000 } },
{ "ARGB8888", HAS_ARGB8888, { 0xff0000, 0xff00, 0x00ff, 0 } },
{ "RGB565", HAS_RGB565, { 0x00f800, 0x07e0, 0x001f, 0 } },
};
unsigned int format_count[ARRAY_SIZE(visuals)] = { 0 };
@@ -1720,7 +1720,7 @@ dri2_wl_swrast_commit_backbuffer(struct dri2_egl_surface *dri2_surf)
* handle the commit and send a release event before checking for a free
* buffer */
if (dri2_surf->throttle_callback == NULL) {
dri2_surf->throttle_callback = wl_display_sync(dri2_surf->wl_dpy_wrapper);
dri2_surf->throttle_callback = wl_display_sync(dri2_dpy->wl_dpy_wrapper);
wl_callback_add_listener(dri2_surf->throttle_callback,
&throttle_listener, dri2_surf);
}


@@ -75,17 +75,6 @@ egl_dri3_get_dri_context(struct loader_dri3_drawable *draw)
return dri2_ctx->dri_context;
}
static __DRIscreen *
egl_dri3_get_dri_screen(void)
{
_EGLContext *ctx = _eglGetCurrentContext();
struct dri2_egl_context *dri2_ctx;
if (!ctx)
return NULL;
dri2_ctx = dri2_egl_context(ctx);
return dri2_egl_display(dri2_ctx->base.Resource.Display)->dri_screen;
}
static void
egl_dri3_flush_drawable(struct loader_dri3_drawable *draw, unsigned flags)
{
@@ -99,7 +88,6 @@ static const struct loader_dri3_vtable egl_dri3_vtable = {
.set_drawable_size = egl_dri3_set_drawable_size,
.in_current_context = egl_dri3_in_current_context,
.get_dri_context = egl_dri3_get_dri_context,
.get_dri_screen = egl_dri3_get_dri_screen,
.flush_drawable = egl_dri3_flush_drawable,
.show_fps = NULL,
};


@@ -504,9 +504,11 @@ _eglCreateExtensionsString(_EGLDisplay *dpy)
_EGL_CHECK_EXTENSION(KHR_gl_texture_3D_image);
_EGL_CHECK_EXTENSION(KHR_gl_texture_cubemap_image);
if (dpy->Extensions.KHR_image_base && dpy->Extensions.KHR_image_pixmap)
_eglAppendExtension(&exts, "EGL_KHR_image");
dpy->Extensions.KHR_image = EGL_TRUE;
_EGL_CHECK_EXTENSION(KHR_image);
_EGL_CHECK_EXTENSION(KHR_image_base);
_EGL_CHECK_EXTENSION(KHR_image_pixmap);
_EGL_CHECK_EXTENSION(KHR_mutable_render_buffer);
_EGL_CHECK_EXTENSION(KHR_no_config_context);
_EGL_CHECK_EXTENSION(KHR_partial_update);
_EGL_CHECK_EXTENSION(KHR_reusable_sync);


@@ -268,6 +268,7 @@ static const struct {
EGLBoolean
_eglValidateConfig(const _EGLConfig *conf, EGLBoolean for_matching)
{
_EGLDisplay *disp = conf->Display;
EGLint i, attr, val;
EGLBoolean valid = EGL_TRUE;
@@ -331,6 +332,8 @@ _eglValidateConfig(const _EGLConfig *conf, EGLBoolean for_matching)
EGL_VG_ALPHA_FORMAT_PRE_BIT |
EGL_MULTISAMPLE_RESOLVE_BOX_BIT |
EGL_SWAP_BEHAVIOR_PRESERVED_BIT;
if (disp->Extensions.KHR_mutable_render_buffer)
mask |= EGL_MUTABLE_RENDER_BUFFER_BIT_KHR;
break;
case EGL_RENDERABLE_TYPE:
case EGL_CONFORMANT:
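
Note on the hunk above: the extension bit is accepted in the surface-type mask only when the display advertises the extension. A minimal standalone sketch of that gating follows. It is not Mesa code: the `ST_*` constants, their values, and both function names are hypothetical, and the final `val & ~mask` check is an assumption about how the mask is consumed, since the hunk does not show it.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the EGL_*_BIT surface-type flags.
 * The numeric values are illustrative only. */
enum {
    ST_PBUFFER               = 1u << 0,
    ST_PIXMAP                = 1u << 1,
    ST_WINDOW                = 1u << 2,
    ST_MUTABLE_RENDER_BUFFER = 1u << 3
};

/* Build the set of bits a config may legally carry. The extension bit
 * is added only when the display has the extension enabled. */
static unsigned
surface_type_mask(bool has_mutable_render_buffer)
{
    unsigned mask = ST_PBUFFER | ST_PIXMAP | ST_WINDOW;

    if (has_mutable_render_buffer)
        mask |= ST_MUTABLE_RENDER_BUFFER;
    return mask;
}

/* Assumed consumer of the mask: a value is valid when it sets no bit
 * outside the mask. */
static bool
surface_type_is_valid(unsigned val, bool has_mutable_render_buffer)
{
    return (val & ~surface_type_mask(has_mutable_render_buffer)) == 0;
}
```

Under these assumptions, a config carrying `ST_WINDOW | ST_MUTABLE_RENDER_BUFFER` validates only when the flag argument is true, which mirrors why the Android hunk sets the extension flags before creating configs.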


@@ -579,7 +579,6 @@ _eglInitContext(_EGLContext *ctx, _EGLDisplay *dpy, _EGLConfig *conf,
_eglInitResource(&ctx->Resource, sizeof(*ctx), dpy);
ctx->ClientAPI = api;
ctx->Config = conf;
ctx->WindowRenderBuffer = EGL_NONE;
ctx->Profile = EGL_CONTEXT_OPENGL_CORE_PROFILE_BIT_KHR;
ctx->ClientMajorVersion = 1; /* the default, per EGL spec */
@@ -611,15 +610,42 @@ static EGLint
_eglQueryContextRenderBuffer(_EGLContext *ctx)
{
_EGLSurface *surf = ctx->DrawSurface;
EGLint rb;
/* From the EGL 1.5 spec:
*
* - If the context is not bound to a surface, then EGL_NONE will be
* returned.
*/
if (!surf)
return EGL_NONE;
if (surf->Type == EGL_WINDOW_BIT && ctx->WindowRenderBuffer != EGL_NONE)
rb = ctx->WindowRenderBuffer;
else
rb = surf->RenderBuffer;
return rb;
switch (surf->Type) {
default:
unreachable("bad EGLSurface type");
case EGL_PIXMAP_BIT:
/* - If the context is bound to a pixmap surface, then EGL_SINGLE_BUFFER
* will be returned.
*/
return EGL_SINGLE_BUFFER;
case EGL_PBUFFER_BIT:
/* - If the context is bound to a pbuffer surface, then EGL_BACK_BUFFER
* will be returned.
*/
return EGL_BACK_BUFFER;
case EGL_WINDOW_BIT:
/* - If the context is bound to a window surface, then either
* EGL_BACK_BUFFER or EGL_SINGLE_BUFFER may be returned. The value
* returned depends on both the buffer requested by the setting of the
* EGL_RENDER_BUFFER property of the surface [...], and on the client
* API (not all client APIs support single-buffer Rendering to window
* surfaces). Some client APIs allow control of whether rendering goes
* to the front or back buffer. This client API-specific choice is not
* reflected in the returned value, which only describes the buffer
* that will be rendered to by default if not overridden by the client
* API.
*/
return surf->ActiveRenderBuffer;
}
}
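
Note on the hunk above: the new query depends only on the bound draw surface, not on per-context state. A minimal standalone sketch of the same decision table follows. It is not Mesa code: `enum surf_type`, `enum render_buffer`, `struct surface`, and `query_render_buffer` are hypothetical stand-ins for the EGL enums, `_EGLSurface`, and `_eglQueryContextRenderBuffer`.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the EGL surface types and render-buffer
 * values; the numeric values are arbitrary. */
enum surf_type { SURF_PBUFFER, SURF_PIXMAP, SURF_WINDOW };
enum render_buffer { RB_NONE, RB_SINGLE_BUFFER, RB_BACK_BUFFER };

/* Hypothetical stand-in for _EGLSurface, reduced to the two fields
 * the query reads. */
struct surface {
    enum surf_type type;
    enum render_buffer active_render_buffer;
};

/* Mirrors the rules quoted from the EGL 1.5 spec in the hunk:
 *   no bound surface -> none
 *   pixmap           -> single buffer
 *   pbuffer          -> back buffer
 *   window           -> whatever buffer is currently active */
static enum render_buffer
query_render_buffer(const struct surface *draw)
{
    if (draw == NULL)
        return RB_NONE;

    switch (draw->type) {
    case SURF_PIXMAP:
        return RB_SINGLE_BUFFER;
    case SURF_PBUFFER:
        return RB_BACK_BUFFER;
    case SURF_WINDOW:
        return draw->active_render_buffer;
    }
    return RB_NONE; /* not reached for valid surface types */
}
```

In this sketch a window surface whose active buffer is `RB_SINGLE_BUFFER` reports `RB_SINGLE_BUFFER`, while a pbuffer always reports `RB_BACK_BUFFER` regardless of the field's value.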


@@ -64,9 +64,6 @@ struct _egl_context
EGLint ResetNotificationStrategy;
EGLint ContextPriority;
EGLBoolean NoError;
/* The real render buffer when a window surface is bound */
EGLint WindowRenderBuffer;
};

Some files were not shown because too many files have changed in this diff.