Compare commits

...

42 Commits

Author SHA1 Message Date
Emil Velikov
af223b57a4 Update version to 18.3.0-rc6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-05 21:36:54 +00:00
Gurchetan Singh
c694d84f10 virgl: don't mark buffers as unclean after a write
We can mark the buffer unclean if it's ever bound as a TBO,
SSBO, ABO, or image.

This improves

dEQP-GLES3.performance.buffer.data_upload.function_call.map_buffer_range.new_specified_buffer.flag_write_full.stream_draw

from 9.58 MB/s to 451.17 MB/s.

v2: Track buffer cleanliness as a function of bindings (Ilia).
v3: virgl_modify_clean --> virgl_dirty_res (Erik)

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 89b4798c06)

With this and previous two patches, the performance of virgl on top of
a r600 (AMD 6870 HD) host improves as follows:

         | FPS avg |  Score
--------------------------------
 before  |   8.2   |   343
 after   |  21.9   |   916

         | FPS avg |  Score
--------------------------------
 before  |  13.2   |   333
 after   |  32.3   |   790
2018-12-05 15:43:48 +00:00
Gurchetan Singh
a69ef11424 virgl: avoid large inline transfers
We flush everytime the command buffer (16 kB) is full, which is
quite costly.

This improves

dEQP-GLES3.performance.buffer.data_upload.function_call.buffer_data.new_buffer.usage_stream_draw

from 111.16 MB/s to 1930.36 MB/s.

In addition, I made the benchmark produce buffers from 0 --> VIRGL_MAX_CMDBUF_DWORDS * 4,
and tried ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 2), ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 4), etc.

I didn't notice any clear differences, so let's just go with the most obvious
heuristic.

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit d18492c64f)
2018-12-05 15:43:32 +00:00
Gurchetan Singh
4b715e3e59 virgl: quadruple command buffer size
Tested running WebGL aquarium on Nvidia host (10,000 fishes)

This moves us from 7 fps to 9 fps.  After quadrupling, performance
gains diminish.

v2: Remove change ID (Erik)

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit c0773315af)
2018-12-05 15:37:50 +00:00
Michal Srb
bd7edf473e drisw: Use separate drisw_loader_funcs for shm
The original code was modifying the global drisw_lf variable, which is bad
when there are multiple contexts in single process, each initialized with
different loader. One may support put_image_shm and the other not.

Since there are currently only two possible combinations, lets create two
global tables, one for each. Lets make them const, since we won't change them
and they can be shared.

This fixes crash in VLC. It used two GL contexts (each in different thread), one
was initialized by its Qt GUI, the other by its video output plugin. The first
one set the put_image_shm=drisw_put_image_shm, the second did not, but
since the same structure was used, the drisw_put_image_shm was used too. Then
it crashed because the second loader did not have putImageShm set.

Downstream bug:
https://bugzilla.opensuse.org/show_bug.cgi?id=1113533

v2: Added Fixes and described the VLC bug.

Fixes: 63c427fa71 ("drisw: use putImageShm if available")
Signed-off-by: Michal Srb <msrb@suse.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 63c0916ada)
2018-12-05 13:29:53 +00:00
Michal Srb
a34228e1b0 gallium: Constify drisw_loader_funcs struct
The content is not expected to change.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Michal Srb <msrb@suse.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c0ac038c97)
2018-12-05 13:29:51 +00:00
Samuel Pitoiset
d92bbe54ea radv: wait on the high 32 bits of timestamp queries
In case we are unlucky if the low part is 0xffffffff.

Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp queries")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c7ada4901a)
[Emil: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_query.c
2018-12-05 13:29:16 +00:00
Lionel Landwerlin
54acae83e0 anv/query: flush render target before copying results
This change tracks render target writes in the pipeline and applies a
render target flush before copying the query results to make sure the
preceding operations have landed in memory before the command streamer
initiates the copy.

v2: Simplify logic in CopyQueryResults (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
Fixes: 37f9788e9a ("anv: flush pipeline before query result copies")
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9a7b319903)
2018-12-05 13:20:00 +00:00
Alex Smith
462bc0d5d4 radv: Flush before vkCmdWriteTimestamp() if needed
As done for vkCmdBeginQuery() already. Prevents timestamps from being
overwritten by previous vkCmdResetQueryPool() calls if the shader path
was used to do the reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c1b6cb068c)
2018-12-05 13:19:58 +00:00
Samuel Pitoiset
055e0d7126 radv: rework the TC-compat HTILE hardware bug with COND_EXEC
After investigating on this, it appears that COND_WRITE doesn't
work correctly in some situations. I don't know exactly why does
it fail to update DB_Z_INFO.ZRANGE_PRECISION, but as AMDVLK
also uses COND_EXEC I think there is a reason.

Now the driver stores a new metadata value in order to reflect
the last fast depth clear state. If a TC-compat HTILE is fast cleared
with 0.0f, we have to update ZRANGE_PRECISION to 0 in order to
work around that hardware bug.

This fixes rendering issues with The Forest and DXVK and doesn't
seem to introduce any regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108914
Fixes: 68dead112e ("radv: update the ZRANGE_PRECISION value for the TC-compat bug")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 824cfc1ee5)
2018-12-05 13:19:55 +00:00
Dave Airlie
5b50e6a7ec radv: use 3d shader for gfx9 copies if dst is 3d
This fixes some crucible 3d miptree tests I've been working on
when executed using the compute shader path.

Fixes: d08f267814 (radv/gfx9: fix 3d image to image transfers on compute queues.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1363a47c9c)
2018-12-05 13:19:53 +00:00
Bas Nieuwenhuizen
5594bb584d radv/android: Use buffer metadata to determine scanout compat.
These days we don't always allocate scanout compatible textures anymore.
That does mean we have to fix the radv android WSI though.

Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3bf48741e1)
2018-12-05 13:19:38 +00:00
Bas Nieuwenhuizen
d369bd91c3 radv/android: Mark android WSI image as shareable.
Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 51091b3e1f)
2018-12-05 13:19:36 +00:00
Matt Turner
cc45108382 Revert "st/mesa: silenced unhanded enum warning in st_glsl_to_tgsi.cpp"
This reverts commit 198c50f487.

This needs to be reverted after commit 017199d2d2 ("mesa: Revert
INTEL_fragment_shader_ordering support")

(cherry picked from commit dd53bb7e1f)
2018-12-05 13:19:28 +00:00
Matt Turner
babf9ab7da mesa: Revert INTEL_fragment_shader_ordering support
This extension is not properly tested (testing for
GL_ARB_fragment_shader_interlock is not sufficient), and since this was
noted in review on August 28th no tests have been sent.

Revert "i965: Add INTEL_fragment_shader_ordering support."
Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering"

This reverts commit 03ecec9ed2.
This reverts commit 119435c877.

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 017199d2d2)
2018-12-05 13:19:21 +00:00
Tobias Klausmann
3985a62afc amd/vulkan: meson build - use radv_deps for libvulkan_radeon
Without this the build breaks with:

FAILED: src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o
cc -Isrc/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha -Isrc/amd/vulkan
-I../src/amd/vulkan -Isrc/../include -I../src/../include -Isrc -I../src
-Isrc/mapi -I../src/mapi -Isrc/mesa -I../src/mesa -I../src/gallium/include
-Isrc/gallium/auxiliary -I../src/gallium/auxiliary -Isrc/amd -I../src/amd
-Isrc/amd/common -I../src/amd/common -Isrc/compiler -I../src/compiler
-Isrc/vulkan/util -I../src/vulkan/util -Isrc/vulkan/wsi -I../src/vulkan/wsi
-Isrc/compiler/nir -I../src/compiler/nir -I/usr/include -I/usr/include/libdrm
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch
-std=c99 -O2 -g '-DVERSION="18.3.0-rc5"' -DPACKAGE_VERSION=VERSION
'-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"'
-DGLX_USE_TLS -DHAVE_ST_VDPAU -DENABLE_ST_OMX_BELLAGIO=0
-DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING
-DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DENABLE_SHADER_CACHE
-DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ
-DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT
-DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT
-DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE
-DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN
-DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE
-DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT
-DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT
-DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
-DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS
-DHAVE_FUNC_ATTRIBUTE_NORETURN -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS
-DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H
-DHAVE_ENDIAN_H -DHAVE_DLFCN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP
-DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L
-DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_PTHREAD
-DHAVE_PTHREAD_SETAFFINITY -DHAVE_LIBDRM -DHAVE_LLVM=0x0600
-DMESA_LLVM_VERSION_PATCH=1 -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED
-DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Werror=implicit-function-declaration
-Werror=missing-prototypes -Werror=return-type -fno-math-errno
-fno-trapping-math -Wno-missing-field-initializers -Wno-format-truncation -O2
-Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables
-fasynchronous-unwind-tables -fstack-clash-protection -DNDEBUG -fPIC -pthread
-D__STDC_FORMAT_MACROS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS
-D__STDC_LIMIT_MACROS -fvisibility=hidden -Wno-override-init
-DVK_USE_PLATFORM_XCB_KHR -DVK_USE_PLATFORM_XLIB_KHR
-DVK_USE_PLATFORM_WAYLAND_KHR -DVK_USE_PLATFORM_DISPLAY_KHR
-DVK_USE_PLATFORM_XLIB_XRANDR_EXT  -MD -MQ
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -MF
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o.d' -o
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -c
../src/amd/vulkan/radv_pipeline.c
In file included from ../src/vulkan/util/vk_alloc.h:29,
                 from ../src/amd/vulkan/radv_private.h:52,
                 from ../src/amd/vulkan/radv_debug.h:27,
                 from ../src/amd/vulkan/radv_pipeline.c:30:
../src/../include/vulkan/vulkan.h:54:10: fatal error: wayland-client.h: Datei
oder Verzeichnis nicht gefunden
 #include <wayland-client.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.

The above command misses the include directory for wayland:
    -I/usr/include/wayland

The missing include is contained in the (until now) unused radv_deps:

if with_platform_wayland
  radv_deps += dep_wayland_client
  radv_flags += '-DVK_USE_PLATFORM_WAYLAND_KHR'
  libradv_files += files('radv_wsi_wayland.c')
endif

Fixes: 673dda8330 "meson: build "radv" vulkan driver for radeon hardware"
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 9401a2f2e6)
2018-12-03 18:32:04 +00:00
Karol Herbst
a7c4368a66 nv50,nvc0: Fix gallium nine regression regarding sampler bindings
The new approach is that samplers don't get unbound even if they won't be used
in a draw and we should just leave them be as well.

Fixes a regression in multiple windows games using gallium nine and nouveau.

v2: adjust num_samplers to keep track of the highest sampler bound
v3: rework how to set the new value of num_samplers

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106577
Fixes: 4d6fab245e
       "cso: don't track the number of sampler states bound"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit fc0139d283)
2018-12-03 16:41:42 +00:00
Vinson Lee
ab83cfd2bf st/xvmc: Add X11 include path.
This patch fixes this build error.

  CC       tests/xvmc_bench.o
In file included from tests/xvmc_bench.c:35:
tests/testlib.h:38:10: fatal error: 'X11/Xlib.h' file not found
         ^~~~~~~~~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4f74580d30)
2018-12-03 16:07:13 +00:00
Lionel Landwerlin
98d571d212 anv: flush pipeline before query result copies
Pipeline state pending bits should be taken into account when copying
results.

In the particular bug below, the results of the
vkCmdCopyQueryPoolResults() command was being overwritten by the
preceding vkCmdCopyBuffer() with a same destination buffer. This is
because we copy the buffers using the 3D pipeline whereas we copy the
query results using the command streamer. Those pieces of HW work in
parallel and the results are somewhat undefined.

v2: Unconditionally flush the pipeline before copying the results
    (Jason)

v3: Wrap & expressions (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108894
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 37f9788e9a)
2018-11-30 16:51:49 +00:00
Thomas Hellstrom
56f90f6213 winsys/svga: Fix a memory leak
The ioctl.cap_3d member was never freed.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 058f85d41c)
2018-11-30 16:51:46 +00:00
Thomas Hellstrom
c2a22a44a1 st/xa: Fix a memory leak
Free the context after destruction.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7fce3ca375)
2018-11-30 16:51:44 +00:00
Dave Airlie
fe460ee8cd r600: make suballocator 256-bytes align
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108311
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2ddd44d941)
2018-11-30 16:51:41 +00:00
Emil Velikov
b28aa1178a Update version to 18.3.0-rc5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-29 11:56:27 +00:00
Emil Velikov
bb4bbb5c2d cherry-ignore: egl/wayland: rather obvious build fix
Commit was squashed into the respective offenders

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-11-28 18:05:05 +00:00
Eric Engestrom
ce6a9169f0 vulkan/wsi: fix s/,/;/ typo
Fixes: 59e58c348e "vulkan/wsi: Only wait on semaphores on the first swapchain"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit e0f1f74eda)
2018-11-28 18:05:05 +00:00
Emil Velikov
bcc8332606 egl/wayland: plug memory leak in drm_handle_device()
As we fail to open the node, we leak the node/device name.

v2: Log and then free() (Eric)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit ce74a7bb8d)
2018-11-28 18:05:05 +00:00
Emil Velikov
ace4860a4f egl/wayland: bail out when drmGetMagic fails
Currently as the function fails, we pass uninitialized data to the
authentication function. Stop doing that and print an warning when
the function fails.

v2: Plug memory leak in error path (Eric)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit c59d3aa4b9)
2018-11-28 17:41:53 +00:00
Eric Engestrom
41671f5dc0 wsi/display: fix mem leak when freeing swapchains
Fixes: da997ebec9 "vulkan: Add KHR_display extension using DRM [v10]"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
(cherry picked from commit 9575cd2893)
2018-11-28 17:03:40 +00:00
Bas Nieuwenhuizen
ec659efcba radv: Align large buffers to the fragment size.
Improves performance in Talos by about 15% (and significant improvements
in RotR and possibly other but did not bench with final patch) on
kernel 4.19 and earlier.

On 4.20+ a similar effect comes from

433ca054949a "drm/amdgpu: try allocating VRAM as power of two"

v2: Do not impact the alignment of the physical memory.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6569644bb6)
2018-11-28 17:03:38 +00:00
Bas Nieuwenhuizen
a1f6ae4e27 radv: Clamp gfx9 image view extents to the allocated image extents.
Mirrors AMDVLK. Looks like if we go over the alignment of height
we actually start to change the addressing. Seems like the extra
miplevels actually work with this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245
Fixes: f6cc15dccd "radv/gfx9: fix block compression texture views. (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 08ea6b9d9b)
2018-11-27 12:04:31 +00:00
Eric Engestrom
a32c568d39 anv: correctly use vulkan 1.0 by default
Per chapter 3.2 "Instances":
> Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
> an apiVersion of 0 is equivalent to providing an apiVersion of
> VK_MAKE_VERSION(1,0,0).

Reported-by: Niklas Haas <git@haasn.xyz>
Fixes: 8c048af589 "anv: Copy the appliation info into the instance"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 56d126f8fd)
2018-11-27 12:04:25 +00:00
Erik Faye-Lund
d575455be6 mesa/main: fix incorrect depth-error
If glGetTexImage or glGetnTexImage is called with a level that doesn't
exist, we get an error message on this form:

Mesa: User error: GL_INVALID_VALUE in glGetTexImage(depth = 0)

This is clearly nonsensical, because these APIs don't even have a
depth-parameter. The reason is that get_texture_image_dims() return
all-zero dimensions for non-existent texture-images, and we go on to
validate these dimensions as if they were user-input, because
glGetTextureSubImage requires checking.

So let's split this logic in two, so glGetTextureSubImage can have
stricter input-validation. All arguments that are no longer validated
are generated internally by mesa, so there's no use in validating them.

Fixes: 42891dbaa1 "gettextsubimage: verify zoffset and depth are correct"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit c120dbfe4d)
2018-11-26 15:50:30 +00:00
Erik Faye-Lund
7d8a9087ae mesa/main: check cube-completeness in common code
This check is the only part of dimensions_error_check that isn't about
error-checking the offset and size arguments of
glGet[Compressed]TextureSubImage(), so it doesn't really belong in here.

This doesn't make a difference right now, apart for changing the
presedence of this error. But it will make a difference  for the next
patch, where we no longer call this method from the non-sub tex-image
getters.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 38af69adfa)
2018-11-26 15:50:29 +00:00
Erik Faye-Lund
5598426132 mesa/main: factor out common error-checking
This error checking is the same for teximage and texsubimage getters, so
let's factor it out to its own function.

This will be useful when getteximage and gettexsubimage gets their own
error checking routines a bit later.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 42820c5727)
2018-11-26 15:50:28 +00:00
Erik Faye-Lund
35e9cd3428 mesa/main: factor out tex-image error-checking
This will be useful when we split error-checking for getteximage and
gettexsubimage later.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 5e0a84f31c)
2018-11-26 15:50:26 +00:00
Erik Faye-Lund
1a905e4c5b mesa/main: remove bogus error for zero-sized images
The explanation quotes the spec on the following wording to justify the
error:

"An INVALID_VALUE error is generated if xoffset + width is greater than
 the texture’s width, yoffset + height is greater than the  texture’s
 height, or zoffset + depth is greater than the texture’s depth."

However, this shouldn't generate an error in the case where *all three*
of width, xoffset and the texture's width are zero. In this case, we end
up generating an unspecified error.

So let's remove this check, and instead make sure that we consider this
as an empty texture.

So let's not generate an error, there's non mandated in the spec in
xoffset/yoffset/zoffset = 0 case. We already avoid doing any work in
this case, because of the final, non-error generating check in this
function.

Fixes: b37b35a5d2 "getteximage: assume texture image is empty for non defined levels"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 38bbb61252)
2018-11-26 15:37:07 +00:00
Gert Wollny
6b9b7ce38c glsl: free or reuse memory allocated for TF varying
When a shader program is de-serialized the gl_shader_program passed in
may actually still hold memory allocations for the transform feedback
varyings. If that is the case, free the varying names and reallocate
the new storage for the names array.

This fixes a memory leak:
Direct leak of 48 byte(s) in 6 object(s) allocated from:
 in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
 in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:875
 in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985
 ...
Indirect leak of 42 byte(s) in 6 object(s) allocated from:
  in __interceptor_strdup (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0x761c8)
  in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:887
  in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985

Fixes: ab2643e4b0
   glsl: serialize data from glTransformFeedbackVaryings

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit f5d053702f)
2018-11-26 15:37:05 +00:00
Bas Nieuwenhuizen
02566b9725 radv: Fix opaque metadata descriptor last layer.
We used the layer count which results in an off by one error.

Not sure this really affects anything.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3c96a1e3a9)
2018-11-26 15:37:02 +00:00
Marek Olšák
825cb76860 winsys/amdgpu: fix a device handle leak in amdgpu_winsys_create
Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d4e7d8b7f0)
2018-11-26 15:37:00 +00:00
Marek Olšák
a941399117 winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handle
Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 82aa07f81f)
2018-11-26 15:36:33 +00:00
Eric Engestrom
f7040d9107 glapi: add missing visibility args
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108829
Fixes: 3218056e0e "meson: Build i965 and dri stack"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 896c59d690)
2018-11-23 11:50:36 +00:00
Jason Ekstrand
b8502f1517 anv: Put robust buffer access in the pipeline hash
It affects apply_pipeline_layout.  Shaders compiled with the wrong value
will work but they may not be robust as requested by the app.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 617e402b3d)
2018-11-22 16:04:58 +00:00
53 changed files with 502 additions and 251 deletions

View File

@@ -1 +1 @@
18.3.0-rc4
18.3.0-rc6

2
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,2 @@
# fixes: Commit was squashed into the respective offenders
c02390f8fcd367c7350db568feabb2f062efca14 egl/wayland: rather obvious build fix

View File

@@ -61,7 +61,6 @@ Note: some of the new features are only available with certain drivers.
<li>GL_EXT_vertex_attrib_64bit on i965, nvc0, radeonsi.</li>
<li>GL_EXT_window_rectangles on radeonsi.</li>
<li>GL_KHR_texture_compression_astc_sliced_3d on radeonsi.</li>
<li>GL_INTEL_fragment_shader_ordering on i965.</li>
<li>GL_NV_fragment_shader_interlock on i965.</li>
<li>EGL_EXT_device_base for all drivers.</li>
<li>EGL_EXT_device_drm for all drivers.</li>

View File

@@ -140,7 +140,7 @@ libvulkan_radeon = shared_library(
],
dependencies : [
dep_llvm, dep_libdrm_amdgpu, dep_thread, dep_elf, dep_dl, dep_m,
dep_valgrind,
dep_valgrind, radv_deps,
idep_nir,
],
c_args : [c_vis_args, no_override_init_args, radv_flags],

View File

@@ -110,17 +110,6 @@ radv_image_from_gralloc(VkDevice device_h,
struct radv_bo *bo = NULL;
VkResult result;
result = radv_image_create(device_h,
&(struct radv_image_create_info) {
.vk_info = base_info,
.scanout = true,
.no_metadata_planes = true},
alloc,
&image_h);
if (result != VK_SUCCESS)
return result;
if (gralloc_info->handle->numFds != 1) {
return vk_errorf(device->instance, VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR,
"VkNativeBufferANDROID::handle::numFds is %d, "
@@ -133,23 +122,14 @@ radv_image_from_gralloc(VkDevice device_h,
*/
int dma_buf = gralloc_info->handle->data[0];
image = radv_image_from_handle(image_h);
VkDeviceMemory memory_h;
const VkMemoryDedicatedAllocateInfoKHR ded_alloc = {
.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO_KHR,
.pNext = NULL,
.buffer = VK_NULL_HANDLE,
.image = image_h
};
const VkImportMemoryFdInfoKHR import_info = {
.sType = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
.pNext = &ded_alloc,
.handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
.fd = dup(dma_buf),
};
/* Find the first VRAM memory type, or GART for PRIME images. */
int memory_type_index = -1;
for (int i = 0; i < device->physical_device->memory_properties.memoryTypeCount; ++i) {
@@ -168,14 +148,49 @@ radv_image_from_gralloc(VkDevice device_h,
&(VkMemoryAllocateInfo) {
.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
.pNext = &import_info,
.allocationSize = image->size,
/* Max buffer size, unused for imports */
.allocationSize = 0x7FFFFFFF,
.memoryTypeIndex = memory_type_index,
},
alloc,
&memory_h);
if (result != VK_SUCCESS)
return result;
struct radeon_bo_metadata md;
device->ws->buffer_get_metadata(radv_device_memory_from_handle(memory_h)->bo, &md);
bool is_scanout;
if (device->physical_device->rad_info.chip_class >= GFX9) {
/* Copied from radeonsi, but is hacky so should be cleaned up. */
is_scanout = md.u.gfx9.swizzle_mode == 0 || md.u.gfx9.swizzle_mode % 4 == 2;
} else {
is_scanout = md.u.legacy.scanout;
}
VkImageCreateInfo updated_base_info = *base_info;
VkExternalMemoryImageCreateInfo external_memory_info = {
.sType = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO,
.pNext = updated_base_info.pNext,
.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
};
updated_base_info.pNext = &external_memory_info;
result = radv_image_create(device_h,
&(struct radv_image_create_info) {
.vk_info = &updated_base_info,
.scanout = is_scanout,
.no_metadata_planes = true},
alloc,
&image_h);
if (result != VK_SUCCESS)
goto fail_create_image;
image = radv_image_from_handle(image_h);
radv_BindImageMemory(device_h, image_h, memory_h, 0);
image->owned_memory = memory_h;
@@ -185,9 +200,7 @@ radv_image_from_gralloc(VkDevice device_h,
return VK_SUCCESS;
fail_create_image:
fail_size:
radv_DestroyImage(device_h, image_h, alloc);
radv_FreeMemory(device_h, memory_h, alloc);
return result;
}

View File

@@ -1068,7 +1068,7 @@ static void
radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
struct radv_ds_buffer_info *ds,
struct radv_image *image, VkImageLayout layout,
bool requires_cond_write)
bool requires_cond_exec)
{
uint32_t db_z_info = ds->db_z_info;
uint32_t db_z_info_reg;
@@ -1092,38 +1092,21 @@ radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
}
/* When we don't know the last fast clear value we need to emit a
* conditional packet, otherwise we can update DB_Z_INFO directly.
* conditional packet that will eventually skip the following
* SET_CONTEXT_REG packet.
*/
if (requires_cond_write) {
radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0));
const uint32_t write_space = 0 << 8; /* register */
const uint32_t poll_space = 1 << 4; /* memory */
const uint32_t function = 3 << 0; /* equal to the reference */
const uint32_t options = write_space | poll_space | function;
radeon_emit(cmd_buffer->cs, options);
/* poll address - location of the depth clear value */
if (requires_cond_exec) {
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->clear_value_offset;
/* In presence of stencil format, we have to adjust the base
* address because the first value is the stencil clear value.
*/
if (vk_format_is_stencil(image->vk_format))
va += 4;
va += image->offset + image->tc_compat_zrange_offset;
radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_EXEC, 3, 0));
radeon_emit(cmd_buffer->cs, va);
radeon_emit(cmd_buffer->cs, va >> 32);
radeon_emit(cmd_buffer->cs, fui(0.0f)); /* reference value */
radeon_emit(cmd_buffer->cs, (uint32_t)-1); /* comparison mask */
radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write address low */
radeon_emit(cmd_buffer->cs, 0u); /* write address high */
radeon_emit(cmd_buffer->cs, db_z_info);
} else {
radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, db_z_info);
radeon_emit(cmd_buffer->cs, 0);
radeon_emit(cmd_buffer->cs, 3); /* SET_CONTEXT_REG size */
}
radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, db_z_info);
}
static void
@@ -1270,6 +1253,45 @@ radv_set_ds_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, fui(ds_clear_value.depth));
}
/**
* Update the TC-compat metadata value for this image.
*/
static void
radv_set_tc_compat_zrange_metadata(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
uint32_t value)
{
struct radeon_cmdbuf *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->tc_compat_zrange_offset;
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_PFP));
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
radeon_emit(cs, value);
}
static void
radv_update_tc_compat_zrange_metadata(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkClearDepthStencilValue ds_clear_value)
{
struct radeon_cmdbuf *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->tc_compat_zrange_offset;
uint32_t cond_val;
/* Conditionally set DB_Z_INFO.ZRANGE_PRECISION to 0 when the last
* depth clear value is 0.0f.
*/
cond_val = ds_clear_value.depth == 0.0f ? UINT_MAX : 0;
radv_set_tc_compat_zrange_metadata(cmd_buffer, image, cond_val);
}
/**
* Update the clear depth/stencil values for this image.
*/
@@ -1283,6 +1305,12 @@ radv_update_ds_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
radv_set_ds_clear_metadata(cmd_buffer, image, ds_clear_value, aspects);
if (radv_image_is_tc_compat_htile(image) &&
(aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {
radv_update_tc_compat_zrange_metadata(cmd_buffer, image,
ds_clear_value);
}
radv_update_bound_fast_clear_ds(cmd_buffer, image, ds_clear_value,
aspects);
}
@@ -4192,6 +4220,15 @@ static void radv_initialize_htile(struct radv_cmd_buffer *cmd_buffer,
aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
radv_set_ds_clear_metadata(cmd_buffer, image, value, aspects);
if (radv_image_is_tc_compat_htile(image)) {
/* Initialize the TC-compat metada value to 0 because by
* default DB_Z_INFO.RANGE_PRECISION is set to 1, and we only
* need have to conditionally update its value when performing
* a fast depth clear.
*/
radv_set_tc_compat_zrange_metadata(cmd_buffer, image, 0);
}
}
static void radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffer,

View File

@@ -691,7 +691,7 @@ radv_query_opaque_metadata(struct radv_device *device,
si_make_texture_descriptor(device, image, false,
(VkImageViewType)image->type, image->vk_format,
&fixedmapping, 0, image->info.levels - 1, 0,
image->info.array_size,
image->info.array_size - 1,
image->info.width, image->info.height,
image->info.depth,
desc, NULL);
@@ -870,6 +870,14 @@ radv_image_alloc_htile(struct radv_image *image)
/* + 8 for storing the clear values */
image->clear_value_offset = image->htile_offset + image->surface.htile_size;
image->size = image->clear_value_offset + 8;
if (radv_image_is_tc_compat_htile(image)) {
/* Metadata for the TC-compatible HTILE hardware bug which
* have to be fixed by updating ZRANGE_PRECISION when doing
* fast depth clears to 0.0f.
*/
image->tc_compat_zrange_offset = image->clear_value_offset + 8;
image->size = image->clear_value_offset + 16;
}
image->alignment = align64(image->alignment, image->surface.htile_alignment);
}
@@ -1014,8 +1022,8 @@ radv_image_create(VkDevice _device,
/* Otherwise, try to enable HTILE for depth surfaces. */
if (radv_image_can_enable_htile(image) &&
!(device->instance->debug_flags & RADV_DEBUG_NO_HIZ)) {
radv_image_alloc_htile(image);
image->tc_compatible_htile = image->surface.flags & RADEON_SURF_TC_COMPATIBLE_HTILE;
radv_image_alloc_htile(image);
} else {
image->surface.htile_size = 0;
}
@@ -1175,8 +1183,6 @@ radv_image_view_init(struct radv_image_view *iview,
if (device->physical_device->rad_info.chip_class >= GFX9 &&
vk_format_is_compressed(image->vk_format) &&
!vk_format_is_compressed(iview->vk_format)) {
unsigned rounded_img_w = util_next_power_of_two(iview->extent.width);
unsigned rounded_img_h = util_next_power_of_two(iview->extent.height);
unsigned lvl_width = radv_minify(image->info.width , range->baseMipLevel);
unsigned lvl_height = radv_minify(image->info.height, range->baseMipLevel);
@@ -1186,8 +1192,8 @@ radv_image_view_init(struct radv_image_view *iview,
lvl_width <<= range->baseMipLevel;
lvl_height <<= range->baseMipLevel;
iview->extent.width = CLAMP(lvl_width, iview->extent.width, rounded_img_w);
iview->extent.height = CLAMP(lvl_height, iview->extent.height, rounded_img_h);
iview->extent.width = CLAMP(lvl_width, iview->extent.width, iview->image->surface.u.gfx9.surf_pitch);
iview->extent.height = CLAMP(lvl_height, iview->extent.height, iview->image->surface.u.gfx9.surf_height);
}
}

View File

@@ -2061,7 +2061,7 @@ radv_meta_image_to_image_cs(struct radv_cmd_buffer *cmd_buffer,
itoi_bind_descriptors(cmd_buffer, &src_view, &dst_view);
if (device->physical_device->rad_info.chip_class >= GFX9 &&
src->image->type == VK_IMAGE_TYPE_3D)
(src->image->type == VK_IMAGE_TYPE_3D || dst->image->type == VK_IMAGE_TYPE_3D))
pipeline = cmd_buffer->device->meta_state.itoi.pipeline_3d;
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);

View File

@@ -1498,6 +1498,14 @@ struct radv_image {
uint64_t clear_value_offset;
uint64_t dcc_pred_offset;
/*
* Metadata for the TC-compat zrange workaround. If the 32-bit value
* stored at this offset is UINT_MAX, the driver will emit
* DB_Z_INFO.ZRANGE_PRECISION=0, otherwise it will skip the
* SET_CONTEXT_REG packet.
*/
uint64_t tc_compat_zrange_offset;
/* For VK_ANDROID_native_buffer, the WSI image owns the memory, */
VkDeviceMemory owned_memory;
};

View File

@@ -1341,10 +1341,13 @@ void radv_CmdCopyQueryPoolResults(
if (flags & VK_QUERY_RESULT_WAIT_BIT) {
/* Wait on the high 32 bits of the timestamp in
* case the low part is 0xffffffff.
*/
radeon_emit(cs, PKT3(PKT3_WAIT_REG_MEM, 5, false));
radeon_emit(cs, WAIT_REG_MEM_NOT_EQUAL | WAIT_REG_MEM_MEM_SPACE(1));
radeon_emit(cs, local_src_va);
radeon_emit(cs, local_src_va >> 32);
radeon_emit(cs, local_src_va + 4);
radeon_emit(cs, (local_src_va + 4) >> 32);
radeon_emit(cs, TIMESTAMP_NOT_READY >> 32);
radeon_emit(cs, 0xffffffff);
radeon_emit(cs, 4);
@@ -1447,6 +1450,22 @@ static unsigned event_type_for_stream(unsigned stream)
}
}
static void emit_query_flush(struct radv_cmd_buffer *cmd_buffer,
struct radv_query_pool *pool)
{
if (cmd_buffer->pending_reset_query) {
if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
/* Only need to flush caches if the query pool size is
* large enough to be resetted using the compute shader
* path. Small pools don't need any cache flushes
* because we use a CP dma clear.
*/
si_emit_cache_flush(cmd_buffer);
cmd_buffer->pending_reset_query = false;
}
}
}
static void emit_begin_query(struct radv_cmd_buffer *cmd_buffer,
uint64_t va,
VkQueryType query_type,
@@ -1593,17 +1612,7 @@ void radv_CmdBeginQueryIndexedEXT(
radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
if (cmd_buffer->pending_reset_query) {
if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
/* Only need to flush caches if the query pool size is
* large enough to be resetted using the compute shader
* path. Small pools don't need any cache flushes
* because we use a CP dma clear.
*/
si_emit_cache_flush(cmd_buffer);
cmd_buffer->pending_reset_query = false;
}
}
emit_query_flush(cmd_buffer, pool);
va += pool->stride * query;
@@ -1680,6 +1689,8 @@ void radv_CmdWriteTimestamp(
radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
emit_query_flush(cmd_buffer, pool);
int num_queries = 1;
if (cmd_buffer->state.subpass && cmd_buffer->state.subpass->view_mask)
num_queries = util_bitcount(cmd_buffer->state.subpass->view_mask);

View File

@@ -223,6 +223,8 @@ struct radeon_winsys {
void (*buffer_set_metadata)(struct radeon_winsys_bo *bo,
struct radeon_bo_metadata *md);
void (*buffer_get_metadata)(struct radeon_winsys_bo *bo,
struct radeon_bo_metadata *md);
void (*buffer_virtual_bind)(struct radeon_winsys_bo *parent,
uint64_t offset, uint64_t size,

View File

@@ -304,8 +304,12 @@ radv_amdgpu_winsys_bo_create(struct radeon_winsys *_ws,
return NULL;
}
unsigned virt_alignment = alignment;
if (size >= ws->info.pte_fragment_size)
virt_alignment = MAX2(virt_alignment, ws->info.pte_fragment_size);
r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
size, alignment, 0, &va, &va_handle,
size, virt_alignment, 0, &va, &va_handle,
(flags & RADEON_FLAG_32BIT ? AMDGPU_VA_RANGE_32_BIT : 0) |
AMDGPU_VA_RANGE_HIGH);
if (r)
@@ -536,6 +540,21 @@ radv_amdgpu_winsys_get_fd(struct radeon_winsys *_ws,
return true;
}
static unsigned eg_tile_split(unsigned tile_split)
{
switch (tile_split) {
case 0: tile_split = 64; break;
case 1: tile_split = 128; break;
case 2: tile_split = 256; break;
case 3: tile_split = 512; break;
default:
case 4: tile_split = 1024; break;
case 5: tile_split = 2048; break;
case 6: tile_split = 4096; break;
}
return tile_split;
}
static unsigned radv_eg_tile_split_rev(unsigned eg_tile_split)
{
switch (eg_tile_split) {
@@ -589,6 +608,43 @@ radv_amdgpu_winsys_bo_set_metadata(struct radeon_winsys_bo *_bo,
amdgpu_bo_set_metadata(bo->bo, &metadata);
}
static void
radv_amdgpu_winsys_bo_get_metadata(struct radeon_winsys_bo *_bo,
struct radeon_bo_metadata *md)
{
struct radv_amdgpu_winsys_bo *bo = radv_amdgpu_winsys_bo(_bo);
struct amdgpu_bo_info info = {0};
int r = amdgpu_bo_query_info(bo->bo, &info);
if (r)
return;
uint64_t tiling_flags = info.metadata.tiling_info;
if (bo->ws->info.chip_class >= GFX9) {
md->u.gfx9.swizzle_mode = AMDGPU_TILING_GET(tiling_flags, SWIZZLE_MODE);
} else {
md->u.legacy.microtile = RADEON_LAYOUT_LINEAR;
md->u.legacy.macrotile = RADEON_LAYOUT_LINEAR;
if (AMDGPU_TILING_GET(tiling_flags, ARRAY_MODE) == 4) /* 2D_TILED_THIN1 */
md->u.legacy.macrotile = RADEON_LAYOUT_TILED;
else if (AMDGPU_TILING_GET(tiling_flags, ARRAY_MODE) == 2) /* 1D_TILED_THIN1 */
md->u.legacy.microtile = RADEON_LAYOUT_TILED;
md->u.legacy.pipe_config = AMDGPU_TILING_GET(tiling_flags, PIPE_CONFIG);
md->u.legacy.bankw = 1 << AMDGPU_TILING_GET(tiling_flags, BANK_WIDTH);
md->u.legacy.bankh = 1 << AMDGPU_TILING_GET(tiling_flags, BANK_HEIGHT);
md->u.legacy.tile_split = eg_tile_split(AMDGPU_TILING_GET(tiling_flags, TILE_SPLIT));
md->u.legacy.mtilea = 1 << AMDGPU_TILING_GET(tiling_flags, MACRO_TILE_ASPECT);
md->u.legacy.num_banks = 2 << AMDGPU_TILING_GET(tiling_flags, NUM_BANKS);
md->u.legacy.scanout = AMDGPU_TILING_GET(tiling_flags, MICRO_TILE_MODE) == 0; /* DISPLAY */
}
md->size_metadata = info.metadata.size_metadata;
memcpy(md->metadata, info.metadata.umd_metadata, sizeof(md->metadata));
}
void radv_amdgpu_bo_init_functions(struct radv_amdgpu_winsys *ws)
{
ws->base.buffer_create = radv_amdgpu_winsys_bo_create;
@@ -599,5 +655,6 @@ void radv_amdgpu_bo_init_functions(struct radv_amdgpu_winsys *ws)
ws->base.buffer_from_fd = radv_amdgpu_winsys_bo_from_fd;
ws->base.buffer_get_fd = radv_amdgpu_winsys_get_fd;
ws->base.buffer_set_metadata = radv_amdgpu_winsys_bo_set_metadata;
ws->base.buffer_get_metadata = radv_amdgpu_winsys_bo_get_metadata;
ws->base.buffer_virtual_bind = radv_amdgpu_winsys_bo_virtual_bind;
}

View File

@@ -525,12 +525,6 @@ supports_nv_fragment_shader_interlock(const _mesa_glsl_parse_state *state)
return state->NV_fragment_shader_interlock_enable;
}
static bool
supports_intel_fragment_shader_ordering(const _mesa_glsl_parse_state *state)
{
return state->INTEL_fragment_shader_ordering_enable;
}
static bool
shader_clock(const _mesa_glsl_parse_state *state)
{
@@ -1311,11 +1305,6 @@ builtin_builder::create_intrinsics()
supports_arb_fragment_shader_interlock,
ir_intrinsic_end_invocation_interlock), NULL);
add_function("__intrinsic_begin_fragment_shader_ordering",
_invocation_interlock_intrinsic(
supports_intel_fragment_shader_ordering,
ir_intrinsic_begin_fragment_shader_ordering), NULL);
add_function("__intrinsic_shader_clock",
_shader_clock_intrinsic(shader_clock,
glsl_type::uvec2_type),
@@ -3430,12 +3419,6 @@ builtin_builder::create_builtins()
supports_nv_fragment_shader_interlock),
NULL);
add_function("beginFragmentShaderOrderingINTEL",
_invocation_interlock(
"__intrinsic_begin_fragment_shader_ordering",
supports_intel_fragment_shader_ordering),
NULL);
add_function("anyInvocationARB",
_vote("__intrinsic_vote_any", vote),
NULL);

View File

@@ -727,7 +727,6 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
EXT_AEP(EXT_texture_buffer),
EXT_AEP(EXT_texture_cube_map_array),
EXT(INTEL_conservative_rasterization),
EXT(INTEL_fragment_shader_ordering),
EXT(INTEL_shader_atomic_float_minmax),
EXT(MESA_shader_integer_functions),
EXT(NV_fragment_shader_interlock),

View File

@@ -812,8 +812,6 @@ struct _mesa_glsl_parse_state {
bool EXT_texture_cube_map_array_warn;
bool INTEL_conservative_rasterization_enable;
bool INTEL_conservative_rasterization_warn;
bool INTEL_fragment_shader_ordering_enable;
bool INTEL_fragment_shader_ordering_warn;
bool INTEL_shader_atomic_float_minmax_enable;
bool INTEL_shader_atomic_float_minmax_warn;
bool MESA_shader_integer_functions_enable;

View File

@@ -742,9 +742,6 @@ nir_visitor::visit(ir_call *ir)
case ir_intrinsic_end_invocation_interlock:
op = nir_intrinsic_end_invocation_interlock;
break;
case ir_intrinsic_begin_fragment_shader_ordering:
op = nir_intrinsic_begin_fragment_shader_ordering;
break;
case ir_intrinsic_group_memory_barrier:
op = nir_intrinsic_group_memory_barrier;
break;
@@ -983,9 +980,6 @@ nir_visitor::visit(ir_call *ir)
case nir_intrinsic_end_invocation_interlock:
nir_builder_instr_insert(&b, &instr->instr);
break;
case nir_intrinsic_begin_fragment_shader_ordering:
nir_builder_instr_insert(&b, &instr->instr);
break;
case nir_intrinsic_store_ssbo: {
exec_node *param = ir->actual_parameters.get_head();
ir_rvalue *block = ((ir_instruction *)param)->as_rvalue();

View File

@@ -1122,7 +1122,6 @@ enum ir_intrinsic_id {
ir_intrinsic_memory_barrier_shared,
ir_intrinsic_begin_invocation_interlock,
ir_intrinsic_end_invocation_interlock,
ir_intrinsic_begin_fragment_shader_ordering,
ir_intrinsic_vote_all,
ir_intrinsic_vote_any,

View File

@@ -360,13 +360,20 @@ read_xfb(struct blob_reader *metadata, struct gl_shader_program *shProg)
if (xfb_stage == ~0u)
return;
if (shProg->TransformFeedback.VaryingNames) {
for (unsigned i = 0; i < shProg->TransformFeedback.NumVarying; ++i)
free(shProg->TransformFeedback.VaryingNames[i]);
}
/* Data set by glTransformFeedbackVaryings. */
shProg->TransformFeedback.BufferMode = blob_read_uint32(metadata);
blob_copy_bytes(metadata, &shProg->TransformFeedback.BufferStride,
sizeof(shProg->TransformFeedback.BufferStride));
shProg->TransformFeedback.NumVarying = blob_read_uint32(metadata);
shProg->TransformFeedback.VaryingNames = (char **)
malloc(shProg->TransformFeedback.NumVarying * sizeof(GLchar *));
realloc(shProg->TransformFeedback.VaryingNames,
shProg->TransformFeedback.NumVarying * sizeof(GLchar *));
/* Note, malloc used with VaryingNames. */
for (unsigned i = 0; i < shProg->TransformFeedback.NumVarying; i++)
shProg->TransformFeedback.VaryingNames[i] =

View File

@@ -199,7 +199,6 @@ barrier("memory_barrier_image")
barrier("memory_barrier_shared")
barrier("begin_invocation_interlock")
barrier("end_invocation_interlock")
barrier("begin_fragment_shader_ordering")
# A conditional discard, with a single boolean source.
intrinsic("discard_if", src_comp=[1])

View File

@@ -1127,13 +1127,22 @@ drm_handle_device(void *data, struct wl_drm *drm, const char *device)
if (dri2_dpy->fd == -1) {
_eglLog(_EGL_WARNING, "wayland-egl: could not open %s (%s)",
dri2_dpy->device_name, strerror(errno));
free(dri2_dpy->device_name);
dri2_dpy->device_name = NULL;
return;
}
if (drmGetNodeTypeFromFd(dri2_dpy->fd) == DRM_NODE_RENDER) {
dri2_dpy->authenticated = true;
} else {
drmGetMagic(dri2_dpy->fd, &magic);
if (drmGetMagic(dri2_dpy->fd, &magic)) {
close(dri2_dpy->fd);
dri2_dpy->fd = -1;
free(dri2_dpy->device_name);
dri2_dpy->device_name = NULL;
_eglLog(_EGL_WARNING, "wayland-egl: drmGetMagic failed");
return;
}
wl_drm_authenticate(dri2_dpy->wl_drm, magic);
}
}

View File

@@ -142,7 +142,7 @@ pipe_loader_release(struct pipe_loader_device **devs, int ndev);
*/
bool
pipe_loader_sw_probe_dri(struct pipe_loader_device **devs,
struct drisw_loader_funcs *drisw_lf);
const struct drisw_loader_funcs *drisw_lf);
/**
* Initialize a kms backed sw device given an fd.

View File

@@ -132,7 +132,7 @@ pipe_loader_sw_probe_teardown_common(struct pipe_loader_sw_device *sdev)
#ifdef HAVE_PIPE_LOADER_DRI
bool
pipe_loader_sw_probe_dri(struct pipe_loader_device **devs, struct drisw_loader_funcs *drisw_lf)
pipe_loader_sw_probe_dri(struct pipe_loader_device **devs, const struct drisw_loader_funcs *drisw_lf)
{
struct pipe_loader_sw_device *sdev = CALLOC_STRUCT(pipe_loader_sw_device);
int i;

View File

@@ -600,25 +600,23 @@ static inline void
nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
unsigned nr, void **hwcso)
{
unsigned highest_found = 0;
unsigned i;
assert(nr <= PIPE_MAX_SAMPLERS);
for (i = 0; i < nr; ++i) {
struct nv50_tsc_entry *old = nv50->samplers[s][i];
if (hwcso[i])
highest_found = i;
nv50->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
if (old)
nv50_screen_tsc_unlock(nv50->screen, old);
}
assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS);
for (; i < nv50->num_samplers[s]; ++i) {
if (nv50->samplers[s][i]) {
nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]);
nv50->samplers[s][i] = NULL;
}
}
nv50->num_samplers[s] = nr;
if (nr >= nv50->num_samplers[s])
nv50->num_samplers[s] = highest_found + 1;
nv50->dirty_3d |= NV50_NEW_3D_SAMPLERS;
}

View File

@@ -464,11 +464,15 @@ nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
unsigned s,
unsigned nr, void **hwcso)
{
unsigned highest_found = 0;
unsigned i;
for (i = 0; i < nr; ++i) {
struct nv50_tsc_entry *old = nvc0->samplers[s][i];
if (hwcso[i])
highest_found = i;
if (hwcso[i] == old)
continue;
nvc0->samplers_dirty[s] |= 1 << i;
@@ -477,14 +481,8 @@ nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
if (old)
nvc0_screen_tsc_unlock(nvc0->screen, old);
}
for (; i < nvc0->num_samplers[s]; ++i) {
if (nvc0->samplers[s][i]) {
nvc0_screen_tsc_unlock(nvc0->screen, nvc0->samplers[s][i]);
nvc0->samplers[s][i] = NULL;
}
}
nvc0->num_samplers[s] = nr;
if (nr >= nvc0->num_samplers[s])
nvc0->num_samplers[s] = highest_found + 1;
}
static void

View File

@@ -1636,7 +1636,7 @@ static void r600_query_hw_get_result_resource(struct r600_common_context *rctx,
}
if (query->buffer.previous) {
u_suballocator_alloc(rctx->allocator_zeroed_memory, 16, 16,
u_suballocator_alloc(rctx->allocator_zeroed_memory, 16, 256,
&tmp_buffer_offset, &tmp_buffer);
if (!tmp_buffer)
return;

View File

@@ -106,7 +106,6 @@ static void virgl_buffer_transfer_unmap(struct pipe_context *ctx,
if (trans->base.usage & PIPE_TRANSFER_WRITE) {
if (!(transfer->usage & PIPE_TRANSFER_FLUSH_EXPLICIT)) {
struct virgl_screen *vs = virgl_screen(ctx->screen);
vbuf->base.clean = FALSE;
vctx->num_transfers++;
vs->vws->transfer_put(vs->vws, vbuf->base.hw_res,
&transfer->box, trans->base.stride, trans->base.layer_stride, trans->offset, transfer->level);

View File

@@ -61,6 +61,12 @@ static void virgl_encoder_write_res(struct virgl_context *ctx,
}
}
static void virgl_dirty_res(struct virgl_resource *res)
{
if (res)
res->clean = FALSE;
}
int virgl_encode_bind_object(struct virgl_context *ctx,
uint32_t handle, uint32_t object)
{
@@ -615,6 +621,7 @@ int virgl_encode_sampler_view(struct virgl_context *ctx,
if (res->u.b.target == PIPE_BUFFER) {
virgl_encoder_write_dword(ctx->cbuf, state->u.buf.offset / elem_size);
virgl_encoder_write_dword(ctx->cbuf, (state->u.buf.offset + state->u.buf.size) / elem_size - 1);
virgl_dirty_res(res);
} else {
virgl_encoder_write_dword(ctx->cbuf, state->u.tex.first_layer | state->u.tex.last_layer << 16);
virgl_encoder_write_dword(ctx->cbuf, state->u.tex.first_level | state->u.tex.last_level << 8);
@@ -949,6 +956,7 @@ int virgl_encode_set_shader_buffers(struct virgl_context *ctx,
virgl_encoder_write_dword(ctx->cbuf, buffers[i].buffer_offset);
virgl_encoder_write_dword(ctx->cbuf, buffers[i].buffer_size);
virgl_encoder_write_res(ctx, res);
virgl_dirty_res(res);
} else {
virgl_encoder_write_dword(ctx->cbuf, 0);
virgl_encoder_write_dword(ctx->cbuf, 0);
@@ -972,6 +980,7 @@ int virgl_encode_set_hw_atomic_buffers(struct virgl_context *ctx,
virgl_encoder_write_dword(ctx->cbuf, buffers[i].buffer_offset);
virgl_encoder_write_dword(ctx->cbuf, buffers[i].buffer_size);
virgl_encoder_write_res(ctx, res);
virgl_dirty_res(res);
} else {
virgl_encoder_write_dword(ctx->cbuf, 0);
virgl_encoder_write_dword(ctx->cbuf, 0);
@@ -999,6 +1008,7 @@ int virgl_encode_set_shader_images(struct virgl_context *ctx,
virgl_encoder_write_dword(ctx->cbuf, images[i].u.buf.offset);
virgl_encoder_write_dword(ctx->cbuf, images[i].u.buf.size);
virgl_encoder_write_res(ctx, res);
virgl_dirty_res(res);
} else {
virgl_encoder_write_dword(ctx->cbuf, 0);
virgl_encoder_write_dword(ctx->cbuf, 0);

View File

@@ -95,7 +95,11 @@ static void virgl_buffer_subdata(struct pipe_context *pipe,
usage |= PIPE_TRANSFER_DISCARD_RANGE;
u_box_1d(offset, size, &box);
virgl_transfer_inline_write(pipe, resource, 0, usage, &box, data, 0, 0);
if (size >= (VIRGL_MAX_CMDBUF_DWORDS * 4))
u_default_buffer_subdata(pipe, resource, usage, offset, size, data);
else
virgl_transfer_inline_write(pipe, resource, 0, usage, &box, data, 0, 0);
}
void virgl_init_context_resource_functions(struct pipe_context *ctx)

View File

@@ -31,7 +31,7 @@ struct pipe_fence_handle;
struct winsys_handle;
struct virgl_hw_res;
#define VIRGL_MAX_CMDBUF_DWORDS (16*1024)
#define VIRGL_MAX_CMDBUF_DWORDS (64 * 1024)
struct virgl_drm_caps {
union virgl_caps caps;

View File

@@ -421,12 +421,19 @@ static const __DRIextension *drisw_screen_extensions[] = {
NULL
};
static struct drisw_loader_funcs drisw_lf = {
static const struct drisw_loader_funcs drisw_lf = {
.get_image = drisw_get_image,
.put_image = drisw_put_image,
.put_image2 = drisw_put_image2
};
static const struct drisw_loader_funcs drisw_shm_lf = {
.get_image = drisw_get_image,
.put_image = drisw_put_image,
.put_image2 = drisw_put_image2,
.put_image_shm = drisw_put_image_shm
};
static const __DRIconfig **
drisw_init_screen(__DRIscreen * sPriv)
{
@@ -434,6 +441,7 @@ drisw_init_screen(__DRIscreen * sPriv)
const __DRIconfig **configs;
struct dri_screen *screen;
struct pipe_screen *pscreen = NULL;
const struct drisw_loader_funcs *lf = &drisw_lf;
screen = CALLOC_STRUCT(dri_screen);
if (!screen)
@@ -448,10 +456,10 @@ drisw_init_screen(__DRIscreen * sPriv)
sPriv->extensions = drisw_screen_extensions;
if (loader->base.version >= 4) {
if (loader->putImageShm)
drisw_lf.put_image_shm = drisw_put_image_shm;
lf = &drisw_shm_lf;
}
if (pipe_loader_sw_probe_dri(&screen->dev, &drisw_lf)) {
if (pipe_loader_sw_probe_dri(&screen->dev, lf)) {
dri_init_options(screen);
pscreen = pipe_loader_create_screen(screen->dev);

View File

@@ -91,6 +91,7 @@ xa_context_destroy(struct xa_context *r)
}
r->pipe->destroy(r->pipe);
free(r);
}
XA_EXPORT int

View File

@@ -27,6 +27,7 @@ AM_CFLAGS = \
$(GALLIUM_CFLAGS) \
$(VISIBILITY_CFLAGS) \
$(VL_CFLAGS) \
$(X11_INCLUDES) \
$(XCB_DRI3_CFLAGS) \
$(XVMC_CFLAGS)

View File

@@ -1310,6 +1310,12 @@ static struct pb_buffer *amdgpu_bo_from_handle(struct radeon_winsys *rws,
if (bo) {
p_atomic_inc(&bo->base.reference.count);
simple_mtx_unlock(&ws->bo_export_table_lock);
/* Release the buffer handle, because we don't need it anymore.
* This function is returning an existing buffer, which has its own
* handle.
*/
amdgpu_bo_free(result.buf_handle);
return &bo->base;
}

View File

@@ -280,6 +280,12 @@ amdgpu_winsys_create(int fd, const struct pipe_screen_config *config,
if (ws) {
pipe_reference(NULL, &ws->reference);
simple_mtx_unlock(&dev_tab_mutex);
/* Release the device handle, because we don't need it anymore.
* This function is returning an existing winsys instance, which
* has its own device handle.
*/
amdgpu_device_deinitialize(dev);
return &ws->base;
}

View File

@@ -1198,4 +1198,6 @@ void
vmw_ioctl_cleanup(struct vmw_winsys_screen *vws)
{
VMW_FUNC;
free(vws->ioctl.cap_3d);
}

View File

@@ -62,7 +62,7 @@ struct dri_sw_winsys
{
struct sw_winsys base;
struct drisw_loader_funcs *lf;
const struct drisw_loader_funcs *lf;
};
static inline struct dri_sw_displaytarget *
@@ -282,7 +282,7 @@ dri_destroy_sw_winsys(struct sw_winsys *winsys)
}
struct sw_winsys *
dri_create_sw_winsys(struct drisw_loader_funcs *lf)
dri_create_sw_winsys(const struct drisw_loader_funcs *lf)
{
struct dri_sw_winsys *ws;

View File

@@ -33,6 +33,6 @@
struct sw_winsys;
struct sw_winsys *dri_create_sw_winsys(struct drisw_loader_funcs *lf);
struct sw_winsys *dri_create_sw_winsys(const struct drisw_loader_funcs *lf);
#endif

View File

@@ -4804,7 +4804,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
break;
}
case nir_intrinsic_begin_fragment_shader_ordering:
case nir_intrinsic_begin_invocation_interlock: {
const fs_builder ubld = bld.group(8, 0);
const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);

View File

@@ -636,7 +636,7 @@ VkResult anv_CreateInstance(
}
if (instance->app_info.api_version == 0)
anv_EnumerateInstanceVersion(&instance->app_info.api_version);
instance->app_info.api_version = VK_API_VERSION_1_0;
instance->enabled_extensions = enabled_extensions;

View File

@@ -446,6 +446,9 @@ anv_pipeline_hash_graphics(struct anv_pipeline *pipeline,
if (layout)
_mesa_sha1_update(&ctx, layout->sha1, sizeof(layout->sha1));
const bool rba = pipeline->device->robust_buffer_access;
_mesa_sha1_update(&ctx, &rba, sizeof(rba));
for (unsigned s = 0; s < MESA_SHADER_STAGES; s++) {
if (stages[s].entrypoint)
anv_pipeline_hash_shader(&ctx, &stages[s]);
@@ -466,6 +469,9 @@ anv_pipeline_hash_compute(struct anv_pipeline *pipeline,
if (layout)
_mesa_sha1_update(&ctx, layout->sha1, sizeof(layout->sha1));
const bool rba = pipeline->device->robust_buffer_access;
_mesa_sha1_update(&ctx, &rba, sizeof(rba));
anv_pipeline_hash_shader(&ctx, stage);
_mesa_sha1_final(&ctx, sha1_out);

View File

@@ -1747,6 +1747,13 @@ enum anv_pipe_bits {
* we would have to CS stall on every flush which could be bad.
*/
ANV_PIPE_NEEDS_CS_STALL_BIT = (1 << 21),
/* This bit does not exist directly in PIPE_CONTROL. It means that render
* target operations are ongoing. Some operations like copies on the
* command streamer might need to be aware of this to trigger the
* appropriate stall before they can proceed with the copy.
*/
ANV_PIPE_RENDER_TARGET_WRITES = (1 << 22),
};
#define ANV_PIPE_FLUSH_BITS ( \

View File

@@ -263,4 +263,5 @@ genX(blorp_exec)(struct blorp_batch *batch,
cmd_buffer->state.gfx.vb_dirty = ~0;
cmd_buffer->state.gfx.dirty = ~0;
cmd_buffer->state.push_constants_dirty = ~0;
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES;
}

View File

@@ -1758,6 +1758,12 @@ genX(cmd_buffer_apply_pipe_flushes)(struct anv_cmd_buffer *cmd_buffer)
pipe.StallAtPixelScoreboard = true;
}
/* If a render target flush was emitted, then we can toggle off the bit
* saying that render target writes are ongoing.
*/
if (bits & ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT)
bits &= ~(ANV_PIPE_RENDER_TARGET_WRITES);
bits &= ~(ANV_PIPE_FLUSH_BITS | ANV_PIPE_CS_STALL_BIT);
}
@@ -2769,6 +2775,8 @@ void genX(CmdDraw)(
prim.StartInstanceLocation = firstInstance;
prim.BaseVertexLocation = 0;
}
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES;
}
void genX(CmdDrawIndexed)(
@@ -2808,6 +2816,8 @@ void genX(CmdDrawIndexed)(
prim.StartInstanceLocation = firstInstance;
prim.BaseVertexLocation = vertexOffset;
}
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES;
}
/* Auto-Draw / Indirect Registers */
@@ -2941,6 +2951,8 @@ void genX(CmdDrawIndirect)(
offset += stride;
}
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES;
}
void genX(CmdDrawIndexedIndirect)(
@@ -2980,6 +2992,8 @@ void genX(CmdDrawIndexedIndirect)(
offset += stride;
}
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES;
}
static VkResult

View File

@@ -302,4 +302,5 @@ genX(cmd_buffer_so_memcpy)(struct anv_cmd_buffer *cmd_buffer,
}
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES;
}

View File

@@ -729,11 +729,19 @@ void genX(CmdCopyQueryPoolResults)(
ANV_FROM_HANDLE(anv_query_pool, pool, queryPool);
ANV_FROM_HANDLE(anv_buffer, buffer, destBuffer);
if (flags & VK_QUERY_RESULT_WAIT_BIT) {
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.CommandStreamerStallEnable = true;
pc.StallAtPixelScoreboard = true;
}
/* If render target writes are ongoing, request a render target cache flush
* to ensure proper ordering of the commands from the 3d pipe and the
* command streamer.
*/
if (cmd_buffer->state.pending_pipe_bits & ANV_PIPE_RENDER_TARGET_WRITES) {
cmd_buffer->state.pending_pipe_bits |=
ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
}
if ((flags & VK_QUERY_RESULT_WAIT_BIT) ||
(cmd_buffer->state.pending_pipe_bits & ANV_PIPE_FLUSH_BITS)) {
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_CS_STALL_BIT;
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
}
struct anv_address dest_addr = anv_address_add(buffer->address, destOffset);

View File

@@ -40,7 +40,7 @@ libglapi = shared_library(
'glapi',
[files_mapi_glapi, files_mapi_util, shared_glapi_mapi_tmp_h],
c_args : [
c_msvc_compat_args, '-DMAPI_MODE_GLAPI',
c_msvc_compat_args, c_vis_args, '-DMAPI_MODE_GLAPI',
'-DMAPI_ABI_HEADER="@0@"'.format(shared_glapi_mapi_tmp_h.full_path()),
],
link_args : [ld_args_gc_sections],

View File

@@ -247,7 +247,6 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.OES_primitive_bounding_box = true;
ctx->Extensions.OES_texture_buffer = true;
ctx->Extensions.ARB_fragment_shader_interlock = true;
ctx->Extensions.INTEL_fragment_shader_ordering = true;
if (can_do_pipelined_register_writes(brw->screen)) {
ctx->Extensions.ARB_draw_indirect = true;

View File

@@ -317,7 +317,6 @@ EXT(IBM_texture_mirrored_repeat , dummy_true
EXT(INGR_blend_func_separate , EXT_blend_func_separate , GLL, x , x , x , 1999)
EXT(INTEL_conservative_rasterization , INTEL_conservative_rasterization , x , GLC, x , 31, 2013)
EXT(INTEL_fragment_shader_ordering , INTEL_fragment_shader_ordering , GLL, GLC, x , x , 2013)
EXT(INTEL_performance_query , INTEL_performance_query , GLL, GLC, x , ES2, 2013)
EXT(INTEL_shader_atomic_float_minmax , INTEL_shader_atomic_float_minmax , GLL, GLC, x , x , 2018)

View File

@@ -4296,7 +4296,6 @@ struct gl_extensions
GLboolean ATI_fragment_shader;
GLboolean GREMEDY_string_marker;
GLboolean INTEL_conservative_rasterization;
GLboolean INTEL_fragment_shader_ordering;
GLboolean INTEL_performance_query;
GLboolean INTEL_shader_atomic_float_minmax;
GLboolean KHR_blend_equation_advanced;

View File

@@ -900,8 +900,7 @@ select_tex_image(const struct gl_texture_object *texObj, GLenum target,
/**
* Error-check the offset and size arguments to
* glGet[Compressed]TextureSubImage(). Also checks if the specified
* texture image is missing.
* glGet[Compressed]TextureSubImage().
* \return true if error, false if no error.
*/
static bool
@@ -913,6 +912,7 @@ dimensions_error_check(struct gl_context *ctx,
const char *caller)
{
const struct gl_texture_image *texImage;
GLuint imageWidth = 0, imageHeight = 0, imageDepth = 0;
if (xoffset < 0) {
_mesa_error(ctx, GL_INVALID_VALUE, "%s(xoffset = %d)", caller, xoffset);
@@ -981,82 +981,44 @@ dimensions_error_check(struct gl_context *ctx,
"%s(zoffset + depth = %d)", caller, zoffset + depth);
return true;
}
/* According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"):
*
* "An INVALID_OPERATION error is generated by GetTextureImage if the
* effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY ,
* and the texture object is not cube complete or cube array complete,
* respectively."
*
* This applies also to GetTextureSubImage, GetCompressedTexImage,
* GetCompressedTextureImage, and GetnCompressedTexImage.
*/
if (!_mesa_cube_complete(texObj)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"%s(cube incomplete)", caller);
return true;
}
break;
default:
; /* nothing */
}
texImage = select_tex_image(texObj, target, level, zoffset);
if (!texImage) {
/* Trying to return a non-defined level is a valid operation per se, as
* OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries") does not
* handle this case as an error.
*
* Rather, we need to look at section 8.22 ("Texture State and Proxy
* State"):
*
* "Each initial texture image is null. It has zero width, height, and
* depth, internal format RGBA, or R8 for buffer textures, component
* sizes set to zero and component types set to NONE, the compressed
* flag set to FALSE, a zero compressed size, and the bound buffer
* object name is zero."
*
* This means we need to assume the image for the non-defined level is
* an empty image. With this assumption, we can go back to section
* 8.11.4 and checking again the errors:
*
* "An INVALID_VALUE error is generated if xoffset + width is greater
* than the textures width, yoffset + height is greater than the
* textures height, or zoffset + depth is greater than the textures
* depth."
*
* Thus why we return INVALID_VALUE.
*/
_mesa_error(ctx, GL_INVALID_VALUE, "%s(missing image)", caller);
return true;
if (texImage) {
imageWidth = texImage->Width;
imageHeight = texImage->Height;
imageDepth = texImage->Depth;
}
if (xoffset + width > texImage->Width) {
if (xoffset + width > imageWidth) {
_mesa_error(ctx, GL_INVALID_VALUE,
"%s(xoffset %d + width %d > %u)",
caller, xoffset, width, texImage->Width);
caller, xoffset, width, imageWidth);
return true;
}
if (yoffset + height > texImage->Height) {
if (yoffset + height > imageHeight) {
_mesa_error(ctx, GL_INVALID_VALUE,
"%s(yoffset %d + height %d > %u)",
caller, yoffset, height, texImage->Height);
caller, yoffset, height, imageHeight);
return true;
}
if (target != GL_TEXTURE_CUBE_MAP) {
/* Cube map error checking was done above */
if (zoffset + depth > texImage->Depth) {
if (zoffset + depth > imageDepth) {
_mesa_error(ctx, GL_INVALID_VALUE,
"%s(zoffset %d + depth %d > %u)",
caller, zoffset, depth, texImage->Depth);
caller, zoffset, depth, imageDepth);
return true;
}
}
/* Extra checks for compressed textures */
{
if (texImage) {
GLuint bw, bh, bd;
_mesa_get_format_block_size_3d(texImage->TexFormat, &bw, &bh, &bd);
if (bw > 1 || bh > 1 || bd > 1) {
@@ -1162,53 +1124,15 @@ pbo_error_check(struct gl_context *ctx, GLenum target,
/**
* Do error checking for all (non-compressed) get-texture-image functions.
* \return true if any error, false if no errors.
* Do teximage-related error checking for getting uncompressed images.
* \return true if there was an error
*/
static bool
getteximage_error_check(struct gl_context *ctx,
struct gl_texture_object *texObj,
GLenum target, GLint level,
GLint xoffset, GLint yoffset, GLint zoffset,
GLsizei width, GLsizei height, GLsizei depth,
GLenum format, GLenum type, GLsizei bufSize,
GLvoid *pixels, const char *caller)
teximage_error_check(struct gl_context *ctx,
struct gl_texture_image *texImage,
GLenum format, const char *caller)
{
struct gl_texture_image *texImage;
GLenum baseFormat, err;
GLint maxLevels;
assert(texObj);
if (texObj->Target == 0) {
_mesa_error(ctx, GL_INVALID_OPERATION, "%s(invalid texture)", caller);
return true;
}
maxLevels = _mesa_max_texture_levels(ctx, target);
if (level < 0 || level >= maxLevels) {
_mesa_error(ctx, GL_INVALID_VALUE, "%s(level = %d)", caller, level);
return true;
}
err = _mesa_error_check_format_and_type(ctx, format, type);
if (err != GL_NO_ERROR) {
_mesa_error(ctx, err, "%s(format/type)", caller);
return true;
}
if (dimensions_error_check(ctx, texObj, target, level,
xoffset, yoffset, zoffset,
width, height, depth, caller)) {
return true;
}
if (pbo_error_check(ctx, target, width, height, depth,
format, type, bufSize, pixels, caller)) {
return true;
}
texImage = select_tex_image(texObj, target, level, zoffset);
GLenum baseFormat;
assert(texImage);
/*
@@ -1241,8 +1165,8 @@ getteximage_error_check(struct gl_context *ctx,
return true;
}
else if (_mesa_is_stencil_format(format)
&& !_mesa_is_depthstencil_format(baseFormat)
&& !_mesa_is_stencil_format(baseFormat)) {
&& !_mesa_is_depthstencil_format(baseFormat)
&& !_mesa_is_stencil_format(baseFormat)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"%s(format mismatch)", caller);
return true;
@@ -1271,6 +1195,142 @@ getteximage_error_check(struct gl_context *ctx,
}
/**
* Do common teximage-related error checking for getting uncompressed images.
* \return true if there was an error
*/
static bool
common_error_check(struct gl_context *ctx,
struct gl_texture_object *texObj,
GLenum target, GLint level,
GLsizei width, GLsizei height, GLsizei depth,
GLenum format, GLenum type, GLsizei bufSize,
GLvoid *pixels, const char *caller)
{
GLenum err;
GLint maxLevels;
if (texObj->Target == 0) {
_mesa_error(ctx, GL_INVALID_OPERATION, "%s(invalid texture)", caller);
return true;
}
maxLevels = _mesa_max_texture_levels(ctx, target);
if (level < 0 || level >= maxLevels) {
_mesa_error(ctx, GL_INVALID_VALUE, "%s(level = %d)", caller, level);
return true;
}
err = _mesa_error_check_format_and_type(ctx, format, type);
if (err != GL_NO_ERROR) {
_mesa_error(ctx, err, "%s(format/type)", caller);
return true;
}
/* According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"):
*
* "An INVALID_OPERATION error is generated by GetTextureImage if the
* effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY ,
* and the texture object is not cube complete or cube array complete,
* respectively."
*
* This applies also to GetTextureSubImage, GetCompressedTexImage,
* GetCompressedTextureImage, and GetnCompressedTexImage.
*/
if (target == GL_TEXTURE_CUBE_MAP && !_mesa_cube_complete(texObj)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"%s(cube incomplete)", caller);
return true;
}
return false;
}
/**
* Do error checking for all (non-compressed) get-texture-image functions.
* \return true if any error, false if no errors.
*/
static bool
getteximage_error_check(struct gl_context *ctx,
struct gl_texture_object *texObj,
GLenum target, GLint level,
GLsizei width, GLsizei height, GLsizei depth,
GLenum format, GLenum type, GLsizei bufSize,
GLvoid *pixels, const char *caller)
{
struct gl_texture_image *texImage;
assert(texObj);
if (common_error_check(ctx, texObj, target, level, width, height, depth,
format, type, bufSize, pixels, caller)) {
return true;
}
if (width == 0 || height == 0 || depth == 0) {
/* Not an error, but nothing to do. Return 'true' so that the
* caller simply returns.
*/
return true;
}
if (pbo_error_check(ctx, target, width, height, depth,
format, type, bufSize, pixels, caller)) {
return true;
}
texImage = select_tex_image(texObj, target, level, 0);
if (teximage_error_check(ctx, texImage, format, caller)) {
return true;
}
return false;
}
/**
* Do error checking for all (non-compressed) get-texture-image functions.
* \return true if any error, false if no errors.
*/
static bool
gettexsubimage_error_check(struct gl_context *ctx,
struct gl_texture_object *texObj,
GLenum target, GLint level,
GLint xoffset, GLint yoffset, GLint zoffset,
GLsizei width, GLsizei height, GLsizei depth,
GLenum format, GLenum type, GLsizei bufSize,
GLvoid *pixels, const char *caller)
{
struct gl_texture_image *texImage;
assert(texObj);
if (common_error_check(ctx, texObj, target, level, width, height, depth,
format, type, bufSize, pixels, caller)) {
return true;
}
if (dimensions_error_check(ctx, texObj, target, level,
xoffset, yoffset, zoffset,
width, height, depth, caller)) {
return true;
}
if (pbo_error_check(ctx, target, width, height, depth,
format, type, bufSize, pixels, caller)) {
return true;
}
texImage = select_tex_image(texObj, target, level, zoffset);
if (teximage_error_check(ctx, texImage, format, caller)) {
return true;
}
return false;
}
/**
* Return the width, height and depth of a texture image.
* This function must be resilient to bad parameter values since
@@ -1399,7 +1459,7 @@ _mesa_GetnTexImageARB(GLenum target, GLint level, GLenum format, GLenum type,
get_texture_image_dims(texObj, target, level, &width, &height, &depth);
if (getteximage_error_check(ctx, texObj, target, level,
0, 0, 0, width, height, depth,
width, height, depth,
format, type, bufSize, pixels, caller)) {
return;
}
@@ -1430,7 +1490,7 @@ _mesa_GetTexImage(GLenum target, GLint level, GLenum format, GLenum type,
get_texture_image_dims(texObj, target, level, &width, &height, &depth);
if (getteximage_error_check(ctx, texObj, target, level,
0, 0, 0, width, height, depth,
width, height, depth,
format, type, INT_MAX, pixels, caller)) {
return;
}
@@ -1464,7 +1524,7 @@ _mesa_GetTextureImage(GLuint texture, GLint level, GLenum format, GLenum type,
&width, &height, &depth);
if (getteximage_error_check(ctx, texObj, texObj->Target, level,
0, 0, 0, width, height, depth,
width, height, depth,
format, type, bufSize, pixels, caller)) {
return;
}
@@ -1497,9 +1557,10 @@ _mesa_GetTextureSubImage(GLuint texture, GLint level,
return;
}
if (getteximage_error_check(ctx, texObj, texObj->Target, level,
xoffset, yoffset, zoffset, width, height, depth,
format, type, bufSize, pixels, caller)) {
if (gettexsubimage_error_check(ctx, texObj, texObj->Target, level,
xoffset, yoffset, zoffset,
width, height, depth,
format, type, bufSize, pixels, caller)) {
return;
}

View File

@@ -4072,7 +4072,6 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
case ir_intrinsic_generic_atomic_comp_swap:
case ir_intrinsic_begin_invocation_interlock:
case ir_intrinsic_end_invocation_interlock:
case ir_intrinsic_begin_fragment_shader_ordering:
unreachable("Invalid intrinsic");
}
}

View File

@@ -954,8 +954,8 @@ wsi_common_queue_present(const struct wsi_device *wsi,
/* We only need/want to wait on semaphores once. After that, we're
* guaranteed ordering since it all happens on the same queue.
*/
submit_info.waitSemaphoreCount = pPresentInfo->waitSemaphoreCount,
submit_info.pWaitSemaphores = pPresentInfo->pWaitSemaphores,
submit_info.waitSemaphoreCount = pPresentInfo->waitSemaphoreCount;
submit_info.pWaitSemaphores = pPresentInfo->pWaitSemaphores;
/* Set up the pWaitDstStageMasks */
stage_flags = vk_alloc(&swapchain->alloc,

View File

@@ -1062,6 +1062,8 @@ wsi_display_swapchain_destroy(struct wsi_swapchain *drv_chain,
for (uint32_t i = 0; i < chain->base.image_count; i++)
wsi_display_image_finish(drv_chain, allocator, &chain->images[i]);
wsi_swapchain_finish(&chain->base);
vk_free(allocator, chain);
return VK_SUCCESS;
}