Compare commits


101 Commits

Author SHA1 Message Date
Emil Velikov
1e1734634b Update version to 18.0.0-rc4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 02:15:14 +00:00
Roland Scheidegger
0c0d6d7751 r600: don't do stack workarounds for hemlock
By the looks of it, hemlock is treated separately from cypress, but it
certainly won't need the stack workarounds that cedar/redwood (and
seemingly every other EG chip except cypress/juniper) need.
(Discovered by accident.)

Acked-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c2f0e08857)
2018-02-05 19:06:03 +00:00
Jon Turney
f90ba6c1e0 travis: add osx autotools build
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b3a1d9588e)
2018-02-05 19:06:03 +00:00
Jon Turney
f009ba1fd7 travis: pip -> pip2
On travis, for OSX, python2 from homebrew is pre-installed. per [1]:

 python points to the macOS system Python (with no manual PATH modification)
 python2 points to Homebrew’s Python 2.7.x (if installed)
 python3 points to Homebrew’s Python 3.x (if installed)
 pip doesn't exist
 pip2 points to Homebrew’s Python 2.7.x’s pip (if installed)
 pip3 points to Homebrew’s Python 3.x’s pip (if installed)

We will end up using 'python2' for building mesa.

Just use 'pip2' instead of 'pip', as that seems to work for all platforms on
travis.

[1] https://docs.brew.sh/Homebrew-and-Python.html

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4701379d96)
2018-02-05 19:06:03 +00:00
Jon Turney
331bea12db travis: conditionalize building of prerequisites on if OS=linux
Use a '|' YAML literal block to avoid the convoluted syntax needed to put
the entire conditional on a single line.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7d1ec6d6a9)
2018-02-05 19:06:03 +00:00
Jon Turney
937b151e4f glx/test: fix building for osx
An additional stub for applegl_create_context() is needed
Cannot test indirect API as it's not built on osx, currently

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 63041ba613)
2018-02-05 19:06:02 +00:00
Jon Turney
9d7a80f4ae glx/apple: locate dispatch table functions to wrap by name
Avoid reaching into the dispatch table internals (and thus having to deal
with the complexities of remap etc.) by identifying functions to wrap by
name.

See:
https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq.
https://bugs.freedesktop.org/show_bug.cgi?id=90311

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d3540b405b)
2018-02-05 19:06:02 +00:00
Jon Turney
ac08cc6873 glx/apple: include util/debug.h for env_var_as_boolean prototype
mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b37b7b42dc)
2018-02-05 19:06:02 +00:00
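For context, a minimal C sketch of what the fix above amounts to; the environment-variable name is made up, but the env_var_as_boolean() signature matches Mesa's util/debug.h:

    #include <stdbool.h>
    #include "util/debug.h"   /* bool env_var_as_boolean(const char *, bool); */

    static bool
    dri3_is_disabled(void)
    {
       /* Without the include above this call becomes an implicit declaration,
        * which -Werror,-Wimplicit-function-declaration turns into the errors
        * quoted in the commit message. */
       return env_var_as_boolean("EXAMPLE_DISABLE_DRI3", false);
    }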
Jon Turney
53f8d524a0 osx: ld doesn't support --build-id
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f8ed9f24d5)
2018-02-05 19:06:02 +00:00
Jon Turney
eeee001d78 configure: Default to gbm=no on osx
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7ad7a07c88)
2018-02-05 19:06:02 +00:00
Eric Anholt
6fb0121e2c mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.
swapBytes operates on bytes, not 4-bit channels, so you can't just take
non-swapBytes cases and flip the REV flag.

Avoids piglit texture-packed-formats regressions when enabling the
ABGR4444 format.

Fixes: c5a5c9a7db ("mesa/formats: add new mesa formats and their pack/unpack functions.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1429cd74c2)
2018-02-05 19:06:02 +00:00
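A small standalone C illustration of the point above (values chosen arbitrarily): byte-swapping a packed 4:4:4:4 texel is not the same as reversing its component order.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
       /* GL_UNSIGNED_SHORT_4_4_4_4 stores RGBA as nibbles R,G,B,A, high to low. */
       uint16_t rgba = 0x1234;                               /* R=1 G=2 B=3 A=4 */
       uint16_t swapped = (uint16_t)((rgba >> 8) | (rgba << 8));
       printf("byte-swapped: %#06x\n", swapped);             /* 0x3412 -> B,A,R,G */
       /* The same colour stored as ..._4_4_4_4_REV would be 0x4321 (A,B,G,R),
        * so swapBytes cannot be emulated by just flipping the REV flag. */
       return 0;
    }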
Dylan Baker
7769005a12 meson: Check for actual LLVM required versions
Currently we always check for 3.9.0, which is pretty safe since
everything except radv works with >= 3.9 and 3.9 is pretty old at this
point. However, radv actually requires 4.0, and there is a patch for
radeonsi to do the same.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c75a4e5b46)
2018-02-05 19:06:02 +00:00
Dylan Baker
2a99c5211b meson: Don't confuse the install and search paths for dri drivers
Currently there is not a separate option for setting the search path of
DRI drivers in meson, like there is in scons and autotools. This is an
oversight and needs to be fixed. This adds an extra option
`dri-search-path`, which will default to the value of
`dri-drivers-path`, like autotools does.

v2: - Split input list before joining.
v3: - use : instead of ; as the delimiter. The autotools help string
      incorrectly says ; but the code uses :
v4: - Take list in pre : delimited form (Ilia)
    - Ensure that the dri-search-path is absolute when using
      dri_drivers_path

Fixes: db9788420d ("meson: Add support for configuring dri drivers directory.")
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)
(cherry picked from commit d7235ef83b)
2018-02-05 19:06:02 +00:00
Kenneth Graunke
d6a8939225 i965: Bump official kernel requirement to Linux v3.9.
In commit 3f353342a6 (present in 17.3.0)
we started unconditionally using I915_EXEC_NO_RELOC, which was
introduced in Linux v3.9.  ChromeOS kernel 3.8 has backported this,
so it should work too.

Running on older kernels would likely result in every single batch
being rejected by the kernel, which is pretty catastrophic.  Yet, it
appears that nobody noticed.  So, let's just bump the official
requirement and move forward ever so slowly.

Fixes: 3f353342a6 ("i965: Use I915_EXEC_NO_RELOC")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c3cd2aac27)
2018-02-05 19:06:02 +00:00
Marc Dietrich
a445cba84d meson: don't install windows headers on non-windows platforms
Only dive into the windows subdir if windows platform is selected.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Fixes: 5ef75cb02b "meson: build src/glx/windows"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 4c5f0b4fd4)
2018-02-05 19:06:02 +00:00
Andres Gomez
f4ac792671 i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.

In 61a8a55f55 ("i965/gen8: Fix vertex attrib upload for dvec3/4
shader inputs"), for gen8+ we needed to determine if the attrib was
dual slot to emit 128 or 256-bit, independently of the VAO size.

Similarly, for gen < 8 we also need to determine whether the attrib is
dual slot to force the emission of 256-bits through 2 uploads.

Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
second upload to fill these unspecified components with zeros, as we
also do for gen8+.

Fixes the following test on Haswell:
KHR-GL46.vertex_attrib_binding.basic-inputL-case1

v2: Added more inline comments to explain why we are using
    ISL_FORMAT_R32_FLOAT and its consequences, as requested by
    Alejandro and Antía.

Fixes: 75968a668e ("i965/gen7: expose OpenGL 4.2 on Haswell when
supported")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Antia Puentes <apuentes@igalia.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5a7aba2e0a)
2018-02-05 19:06:02 +00:00
Michel Dänzer
25583470fc winsys/radeon: Compute is_displayable in surf_drm_to_winsys
It was always 0, breaking (at least) DRI3 with Xwayland.

Bugzilla: https://bugs.freedesktop.org/104306
Fixes: 5f2073be32 ("ac/surface: add ac_surface::is_displayable")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1cf1bf32ef)
2018-02-05 19:06:02 +00:00
Matthew Nicholls
7eaa4049f1 radv: remove predication on cache flushes
This can lead to a situation where cache flushes could get conditionally
disabled while still clearing the flush_bits, and thus flushes due to
application pipeline barriers may never get executed.

Fixes: a6c2001ace (radv: add support for cmd predication.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ef272b161e)
[Emil Velikov: trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_cmd_buffer.c
2018-02-05 19:06:02 +00:00
Dave Airlie
15ef35052c virgl: also remove dimension on indirect.
This fixes some dEQP tests that generated bad shaders.

Fixes: b6f6ead19 (virgl: drop const dimensions on first block.)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 49c61d8b84)
2018-02-05 19:06:02 +00:00
Dave Airlie
1c68826323 radv/gfx9: fix block compression texture views. (v2)
This ports a fix from amdvlk, to fix the sizing for mip levels
when block compressed images are viewed using uncompressed views.

My original fix didn't apply the clamping, but it looks like
the clamping is required to stop the sizing going too large.

Fixes:
dEQP-VK.image.texel_view_compatible.graphic.extended*bc*
Doesn't crash DOW3 anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f6cc15dccd)
2018-02-05 19:06:02 +00:00
Bas Nieuwenhuizen
0a26b54725 radv: Signal fence correctly after sparse binding.
It did not signal syncobjs in the fence, and also signalled too early
if there was work on the queue already, as we have to wait till that
work is done.

Fixes: d27aaae4d2 "radv: Add external fence support."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0347a83bbf)
2018-02-05 19:06:02 +00:00
Jon Turney
8c827600ed meson: libdrm shouldn't appear in Requires.private: if it wasn't found
Otherwise, using pkg-config to retrieve flags will fail, e.g.

$ pkg-config gl --cflags
Package libdrm was not found in the pkg-config search path.
Perhaps you should add the directory containing `libdrm.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libdrm', required by 'gl', not found

Fixes: 3218056e0e ("meson: Build i965 and dri stack")

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
(cherry picked from commit 4a0bab1d7f)
2018-02-05 19:06:01 +00:00
Timothy Arceri
c369ec95d9 st/shader_cache: restore num_tgsi_tokens when loading from cache
Without this we will fail to correctly serialise programs when
using glGetProgramBinary() if the program was retrieved from
the disk cache rather than freshly compiled.

Fixes: c69b0dd681 "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program"

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104762
(cherry picked from commit 041b18cf23)
2018-02-05 19:06:01 +00:00
Rafael Antognolli
626c84edb3 i965/gen10: Use CS Stall instead of WriteImmediate.
Fixes: ca19ee33d7
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 131e871385)
2018-02-05 19:06:01 +00:00
Rafael Antognolli
657817030b anv/gen10: Emit CS stall and mark push constants dirty.
I got reviews and fixed the patches locally, but ended up merging the
ones that I sent originally to the list. This patch fixes those
mistakes.

Fixes: 78c125af39
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 20578f81a6)
2018-02-05 19:06:01 +00:00
Stephan Gerhold
e6018eceb9 util/build-id: Fix address comparison for binaries with LOAD vaddr > 0
build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD
segment has a virtual address other than 0x0.

For most shared libraries, the first LOAD segment has vaddr=0x0:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    LOAD           0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000
    LOAD           0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW  0x1000

However, compiling the Intel Vulkan driver as 32-bit binary on Android produces
the following ELF header with vaddr=0x8000 instead:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    PHDR           0x000034 0x00008034 0x00008034 0x00100 0x00100 R   0x4
    LOAD           0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000
    LOAD           0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW  0x1000

build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr()
and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a
different memory address, e.g.:

    dli_fbase=0xd8395000 (offset 0x8000)
    dlpi_addr=0xd838d000

At least on glibc and bionic (Android) dli_fbase refers to the address where
the shared object is mapped into the process space, whereas dlpi_addr is just
the base address for the vaddrs declared in the ELF header.

To compare them correctly, we need to calculate the start of the mapping
by adding the vaddr of the first LOAD segment to the base address.

Note: musl users will need the following patch.
https://git.musl-libc.org/cgit/musl/commit/?id=b3ae7beabb9f0c219bb8a8b63567a01c6530c1ac

Cc: Chad Versace <chadversary@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642
Fixes: 5c98d38 "util: Query build-id by symbol address, not library name"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 02e2009b92)
2018-02-05 19:06:01 +00:00
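A hedged C sketch of the comparison the commit above describes, using only the <link.h> API (the surrounding Mesa build-id code is omitted): the object matches when the load bias plus the vaddr of the first PT_LOAD segment equals the dli_fbase reported by dladdr().

    #define _GNU_SOURCE
    #include <link.h>          /* dl_iterate_phdr(), struct dl_phdr_info */
    #include <stddef.h>

    struct search { const void *dli_fbase; int found; };

    static int
    match_object(struct dl_phdr_info *info, size_t size, void *data)
    {
       struct search *s = data;
       (void)size;

       for (unsigned i = 0; i < info->dlpi_phnum; i++) {
          if (info->dlpi_phdr[i].p_type != PT_LOAD)
             continue;
          /* dlpi_addr is only the load bias; the object is mapped at
           * bias + vaddr of the first LOAD segment, which is what dladdr()
           * reports in dli_fbase (0x8000 apart in the Android example). */
          const void *map_start =
             (const char *)info->dlpi_addr + info->dlpi_phdr[i].p_vaddr;
          if (map_start == s->dli_fbase)
             s->found = 1;     /* this object's PT_NOTEs hold the build-id */
          break;               /* only the first PT_LOAD matters */
       }
       return s->found;        /* non-zero stops the iteration */
    }
    /* usage: struct search s = { dli_fbase_from_dladdr, 0 };
     *        dl_iterate_phdr(match_object, &s); */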
Jordan Justen
227e0fb0a4 i965: Create new program cache bo when clearing the program cache
When the disk shader cache CI testing was enabled, we started noticing
occasional failures on deqp test runs. (Mainly SNB, rarely HSW)

Before this change, when we cleared the (in memory) program cache we
reused the same bo. Since the disk shader cache quickly restores
programs, it appears that this would lead to overwrites of the older
program binaries in the in memory program cache that apparently were
still executing in some cases. If these programs were still executing,
this could cause a GPU hang.

This issue is probably not disk shader cache specific, but may have
been hidden due to the compiler taking time to recompile programs
after the cache was cleared.

v2:
 * Don't add `copy` param to brw_cache_new_bo (Ken)
 * Call from brw_program_cache_check_size (Ken)

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 83e60ce927)
2018-02-05 19:06:01 +00:00
George Kyriazis
28097758a8 meson/swr: Updated copyright dates
cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit bbef9474fa)
2018-02-05 19:06:01 +00:00
George Kyriazis
5afeb68c7e meson/swr: re-shuffle generated files
Move generated files from codegen/meson.build to other directories, in order
to satisfy generated include file dependencies

Add correct file lists for architecture-specific libraries.

cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 16bf813830)
2018-02-05 19:06:01 +00:00
Jason Ekstrand
c4d9ceecf8 i965: Call prepare_external after implicit window-system MSAA resolves
This fixes some rendering corruption in a couple of Android apps that
use window-system MSAA.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 2f7205be47)
2018-02-05 19:06:01 +00:00
Emil Velikov
133aa8c9f7 cherry-ignore: radv: Don't expose VK_KHX_multiview on android.
stable: The KHX extension is disabled altogether in the stable
branches.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:04:56 +00:00
Emil Velikov
a307f06969 radv: Stop advertising VK_KHX_multiview
We don't want to advertise experimental extensions in actual releases.
However, there's no harm in leaving the code lying around in the tree.

[Emil Velikov: port from equivalent ANV commit]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:01:44 +00:00
Jason Ekstrand
d50d11f84b anv: Stop advertising VK_KHX_multiview
We don't want to advertise experimental extensions in actual releases.
However, there's no harm in leaving the code lying around in the tree.

(cherry picked from commit e4371d14f1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/vulkan/anv_device.c
2018-02-05 18:57:38 +00:00
Lucas Stach
0c2caeb441 renderonly: fix dumb BO allocation for non 32bpp formats
Take into account the resource format, instead of applying a hardcoded
32bpp. This not only over-allocates 16bpp formats, but also results in
a wrong stride being filled into the handle.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 0c71a19fe4)
2018-02-05 16:41:19 +00:00
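A sketch of the idea under the commit above, assuming gallium's util_format_get_blocksizebits() and libdrm's dumb-buffer ioctl; it is not the literal renderonly code:

    #include <stdint.h>
    #include <xf86drm.h>            /* drmIoctl(), DRM_IOCTL_MODE_CREATE_DUMB */
    #include "util/u_format.h"      /* util_format_get_blocksizebits() */

    static int
    create_dumb_bo(int fd, unsigned width, unsigned height,
                   enum pipe_format format, uint32_t *handle, uint32_t *stride)
    {
       struct drm_mode_create_dumb create = {
          .width  = width,
          .height = height,
          /* previously hardcoded to 32, which over-allocates 16bpp formats
           * and yields the wrong stride */
          .bpp    = util_format_get_blocksizebits(format),
       };

       if (drmIoctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &create))
          return -1;

       *handle = create.handle;
       *stride = create.pitch;      /* the stride the kernel actually used */
       return 0;
    }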
Jason Ekstrand
b3cfa244e1 anv/cmd_buffer: Re-emit the pipeline at every subpass
If we ever hit this edge-case, it can theoretically cause problem for
CNL because we could end up changing render targets without re-emitting
3DSTATE_MULTISAMPLE which is part of the pipeline.  Just get rid of the
edge case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 97938dac36)
2018-02-05 16:41:15 +00:00
Dave Airlie
446187287a r600/sb: insert the else clause when we might depart from a loop
If there is a break inside the else clause and this means we
are breaking from a loop, the loop finalise will want to insert
the LOOP_BREAK/CONTINUE instruction. However, if we don't emit
the else, there is nowhere for these to end up, so they will end
up in the wrong place.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101442
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8d633f067b)
2018-02-05 16:41:12 +00:00
Tapani Pälli
af748138eb nir: mark unused space in packed_tex_data
This change cleans up the following scary warnings in the valgrind
output when the disk cache is being written:

   ==6532== Uninitialised byte(s) found during client check request
   ==6532==    at 0x14423FAD: blob_write_bytes (blob.c:152)
   ==6532==    by 0x144240FB: blob_write_uint32 (blob.c:194)
   ==6532==    by 0x144001A5: write_tex (nir_serialize.c:613)

and later (loads of):

   ==6532== Use of uninitialised value of size 8
   ==6532==    at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11)
   ==6532==    by 0x13F65014: util_hash_crc32 (crc32.c:127)
   ==6532==    by 0x13F5DABA: cache_put (disk_cache.c:947)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d0343bef66)
2018-02-05 16:41:09 +00:00
Tapani Pälli
753e9d6dd2 i965: fix disk_cache leak when destroying context
==2780== 1,024 bytes in 1 blocks are possibly lost in loss record 180 of 205
   ==2780==    at 0x4C31A1E: calloc (vg_replace_malloc.c:711)
   ==2780==    by 0x13F6467E: util_queue_init (u_queue.c:309)
   ==2780==    by 0x13F5C9F6: disk_cache_create (disk_cache.c:369)
   ==2780==    by 0x13F05406: brw_disk_cache_init (brw_disk_cache.c:428)
   ==2780==    by 0x13F01E78: brwCreateContext (brw_context.c:1068)

Fixes: 1a61a8b9a7 ("i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit b99c88037b)
2018-02-05 16:41:05 +00:00
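A minimal sketch of the kind of teardown this implies, assuming the context keeps its disk_cache pointer in ctx.Cache (the exact brw_context field and call site may differ):

    #include "brw_context.h"        /* Mesa-internal; provides struct brw_context */
    #include "util/disk_cache.h"    /* disk_cache_destroy() */

    static void
    brw_destroy_disk_cache(struct brw_context *brw)
    {
       /* Frees the queue and allocations made by disk_cache_create(),
        * which valgrind reported as possibly lost. */
       if (brw->ctx.Cache) {
          disk_cache_destroy(brw->ctx.Cache);
          brw->ctx.Cache = NULL;
       }
    }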
Tapani Pälli
62e8b651b1 i965: fix prog_data leak in brw_disk_cache
==25481== 576 bytes in 1 blocks are definitely lost in loss record 179 of 208
   ==25481==    at 0x4C2FB6B: malloc (vg_replace_malloc.c:299)
   ==25481==    by 0x1404E2CC: ralloc_size (ralloc.c:121)
   ==25481==    by 0x14119F82: read_and_upload (brw_disk_cache.c:176)
   ==25481==    by 0x1411A5C9: brw_disk_cache_upload_program (brw_disk_cache.c:271)
   ==25481==    by 0x1412FCA4: brw_upload_wm_prog (brw_wm.c:597)

Fixes: 516d50db31 ("i965: add initial implementation of on disk shader cache")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 28db950b51)
2018-02-05 16:41:02 +00:00
Dave Airlie
412c850120 r600/eg: construct proper rat mask for image/buffers.
If the image/buffer bindings had a gap, this produced the wrong values;
this should fix that and generate the correct rat mask for mixes of
images/buffers/cbs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e0e23ea69c)
2018-02-05 16:40:59 +00:00
Marek Olšák
61c42583d9 winsys/amdgpu: fix assertion failure with UVD and VCE rings
Cc: 18.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 17423c993d)
2018-02-05 16:40:56 +00:00
Emil Velikov
17c0e248d7 Update version to 18.0.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 11:27:18 +00:00
Emil Velikov
92a332ed1a cherry-ignore: add patches picked without -x
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-26 19:53:02 +00:00
Maxin B. John
74b39c0bbf anv_icd.py: improve reproducible builds
Sort the output to ensure build reproducibility

Signed-off-by: Maxin B. John <maxin.john@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 0ab04ba979 ("anv: Use python to generate ICD json files")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8116b9170b)
2018-01-26 19:53:02 +00:00
Bas Nieuwenhuizen
a5bdf2abf9 radeonsi: Export signalled sync file instead of -1.
-1 is considered an error for EGL_ANDROID_native_fence_sync, so
we need to actually create a sync file.

Fixes: f536f45250 "radeonsi: implement sync_file import/export"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5a3404d443)
2018-01-26 19:53:02 +00:00
Dave Airlie
305b0b1356 radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)
This seems to be broken, at least the cts tests fail.

This fixes:
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8

2 samples seem to pass fine; amdvlk doesn't appear to enable TC here,
possibly for some other reasons.

This is most likely a hack.

v1.1: add a bit of explanation text. (Samuel)
Fixes: ad3d98da9 (radv: enable tc compatible htile for d32s8 also.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f4c534ef68)
2018-01-26 19:53:02 +00:00
Emil Velikov
28680e72b8 configure.ac: correct driglx-direct help text
The default was toggled a while back, but the text wasn't updated.

Fixes: bd526ec9e1 ("configure: Always default to
--enable-driglx-direct")
Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 6aeef54644)
2018-01-26 19:53:02 +00:00
Roland Scheidegger
32b2c0da59 gallivm: fix crash with seamless cube filtering with different min/mag filter
We are not allowed to modify the incoming coords values, or things may
crash (as we may be inside a llvm conditional and the values may be used
in another branch).
I recently broke this when fixing an issue with NaNs and seamless cube
map filtering, and it causes crashes when doing cubemap filtering
if the min and mag filters are different.
Add const to the pointers passed in to prevent this mishap in the future.

Fixes: a485ad0bcd ("gallivm: fix an issue with NaNs with seamless cube filtering")

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4fe662c58f)
2018-01-26 19:53:02 +00:00
Greg V
b01ea9701e meson: handle LLVM 'x.x.xgit-revision' versions
When LLVM is built inside of a git repo (even way below, e.g. /usr/ports/.git
exists, and LLVM is built in /usr/ports/devel/llvm50/work), its version
becomes something like 5.0.0git-f8ab206b2176.

New meson versions already handle this, but we support older versions too.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 8fae5eddd9)
2018-01-26 19:53:02 +00:00
Greg V
bf22d563f5 meson: fix getting cflags from pkg-config
get_pkgconfig_variable('cflags') always returns an empty list; it's a
function for getting *custom* variables.

Meson does not yet support asking for cflags, so explicitly invoke
pkg-config for now.

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Fixes: a817af8a89eb ("meson: build gallium xvmc state tracker")
Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit 53f9131205)
2018-01-26 19:53:02 +00:00
Greg V
af8c66ba6b meson: fix missing dependencies
Fixes: 66f97f6640 ("meson: build radeonsi")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit 7c8cfe2d59)
2018-01-26 19:53:02 +00:00
Dylan Baker
a807ad2f7c meson: correctly set SYSCONFDIR for loading dirrc
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 5781c3d1db)
2018-01-26 19:53:01 +00:00
Dave Airlie
90e4f15053 radv: move spi_baryc_cntl to pipeline
We need to enable the pos float location 2 mode anytime we have
persample, not just when forced by the frag shader.

This fixes:
dEQP-VK.pipeline.multisample.min_sample_shading*

Fixes: 58c97a079 (radv: enable location at sample when persample is forced.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 298554541d)
2018-01-26 19:53:01 +00:00
Scott D Phillips
12afb389d6 meson: Fix define for USE_SSE41
Before we were adding -DHAVE_SSE41 which isn't what the code is
looking for, so some uses of the sse4.1 code were always being
skipped.

v2: Don't add any compile check for the quite old -msse4.1 option (Dylan)

Fixes: 84486f6462 ("meson: Enable SSE4.1 optimizations")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 0b8d38bd48)
2018-01-26 19:53:01 +00:00
Brian Paul
2d7035ee48 vbo: fix incorrect min/max_index values in display list draw call
This fixes another regression from commit 8e4efdc895 ("vbo: optimize
some display list drawing").  The problem was the min_index, max_index
values passed to the vbo drawing function were not computed to compensate
for the biased prim::start values.

https://bugs.freedesktop.org/show_bug.cgi?id=104746
https://bugs.freedesktop.org/show_bug.cgi?id=104742
https://bugs.freedesktop.org/show_bug.cgi?id=104690
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Fixes: 8e4efdc895 ("vbo: optimize some display list drawing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
(cherry picked from commit 365a48abdd)
2018-01-26 19:53:01 +00:00
Dave Airlie
80ca933e68 radv: fix sample_mask_in loading. (v3.1)
This is ported from radeonsi and fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_*

v2: don't call this path for radeonsi, it does it in the epilog.
use the radeonsi code path.
v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel)
v3.1: set ps_iter_samples default to 1 (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: bdcbe7c76 (radv: add sample mask input support)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 766589d89a)
2018-01-26 19:53:01 +00:00
Dave Airlie
05e6e669bd radv: don't use hw resolves for r16g16 norm formats.
radeonsi has a workaround for this, but it uses an R16A16 format,
which vulkan doesn't have; we could probably come up with a workaround,
but for now just avoid hw resolves.

Fixes:
dEQP-VK.renderpass.suballocation.multisample.r16g16_*norm*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c727ea9370)
2018-01-26 19:53:01 +00:00
Dave Airlie
62803e022e radv: don't use hw resolve for integer image formats
From reading AMDVLK, it currently never uses hw resolve paths.

This patch takes the approach from radeonsi, which doesn't use hw resolve
for integer formats, and does the same for radv.

This fixes:
dEQP-VK.renderpass.suballocation.multisample*uint tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4df414bbd2)
2018-01-26 19:53:01 +00:00
Dave Airlie
e76f0abed8 radv: add fs_key meta format support to resolve passes.
Some of the hw resolve passes need the SPI color format set up
correctly.

This fixes lots of 16-bit and 32-bit format tests in
dEQP-VK.renderpass.suballocation.multisample*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 316d762186)
2018-01-26 19:53:01 +00:00
Christoph Haag
eaf9500651 meson: remove lib prefix from libd3dadapter9.so
Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit 4b4d929c27)
2018-01-26 19:53:01 +00:00
Eric Engestrom
3ca5ace19d radeon: remove left over dead code
Fixes: 4e0d99a635 "r100: Use shared debug code"
Cc: Pauli Nieminen <suokkos@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit eee8dd7c33)
2018-01-26 19:53:01 +00:00
Rafael Antognolli
e1ac54507e i965/gen10: Re-enable push constants.
The GPU hang caused by push constants is apparently fixed, so let's
enable them again.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bcfd78e448)
2018-01-26 19:53:01 +00:00
Rafael Antognolli
dcdeb6a33e anv/gen10: Ignore push constant packets during context restore.
Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a
context restore.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 78c125af39)
2018-01-26 19:53:01 +00:00
Rafael Antognolli
8452d0f466 i965/gen10: Ignore push constant packets during context restore.
These packets were causing GPU hangs when the context was restored,
possibly because they were pointing to BO's that were already
unreferenced. So we tell the hardware to ignore such packets after the
batch buffer ends, since we know those BO's are not around anymore.

This change fixes GPU hangs on CNL. The (partial) solution to this
problem so far was to entirely disable push constants on this platform.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ca19ee33d7)
2018-01-26 19:53:01 +00:00
Eleni Maria Stea
123a39cd6a mesa: Fix function pointers initialization in state tracker
We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 8096b558a7)
2018-01-26 19:53:01 +00:00
Samuel Pitoiset
639d95e93f ac/nir: set amdgpu.uniform and invariant.load for UBOs
UBOs are constant buffers.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 41c36c45 ("amd/common: use ac_build_buffer_load() for emitting UBO loads")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 49b0a140a7)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
70814af14f anv/pipeline: Don't look at blend state unless we have an attachment
Without this, we may end up dereferencing blend before we check for
binding->index != UINT32_MAX.  However, Vulkan allows the blend state to
be NULL so long as you don't have any color attachments.  This fixes a
segfault when running The Talos Principle.

Fixes: 12f4e00b69
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit c8949e2498)
2018-01-26 19:53:01 +00:00
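A self-contained C sketch of the guard described above; binding->index becoming UINT32_MAX for "no colour attachment" follows the commit text, while the helper itself is made up:

    #include <stdint.h>
    #include <vulkan/vulkan.h>

    static VkBool32
    attachment_blend_enabled(const VkGraphicsPipelineCreateInfo *pCreateInfo,
                             uint32_t binding_index)
    {
       const VkPipelineColorBlendStateCreateInfo *blend =
          pCreateInfo->pColorBlendState;

       /* pColorBlendState may legitimately be NULL when there are no colour
        * attachments, so check the index before dereferencing it. */
       if (binding_index == UINT32_MAX || blend == NULL)
          return VK_FALSE;

       return blend->pAttachments[binding_index].blendEnable;
    }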
Jason Ekstrand
ca6942c672 i965/fs: Reset the register file to VGRF in lower_integer_multiplication
18fde36ced changed the way temporary
registers were allocated in lower_integer_multiplication so that we
allocate regs_written(inst) space and keep the stride of the original
destination register.  This was to ensure that any MUL which originally
followed the CHV/BXT integer multiply regioning restrictions would
continue to follow those restrictions even after lowering.  This works
fine except that I forgot to reset the register file to VGRF so, even
though they were assigned a number from alloc.allocate(), they had the
wrong register file.  This caused some GLES 3.0 CTS tests to start
failing on Sandy Bridge due to attempted reads from the MRF:

    ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64

This commit remedies this problem by, instead of copying inst->dst and
overwriting nr, just making a new register and setting the region to
match inst->dst.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
Fixes: 18fde36ced
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit db682b8f0e)
2018-01-26 19:53:01 +00:00
Chuck Atkins
9550852086 configure.ac: add missing llvm dependencies to .pc files
v2: Only add as dependencies for gallium-osmesa and gallium-xlib

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6ac5e851f1)
2018-01-26 19:53:01 +00:00
George Kyriazis
2594045132 swr/rast: support llvm 3.9 type declarations
LLVM 3.9 was not taken into account in initial check-in.

Fixes: 01ab218bbc ("swr/rast: Initial work for debugging support.")
cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 0e879aad2f)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
b62cefdef8 i965/draw: Set NEW_AUX_STATE when draw aux changes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 20f70ae385)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
521d5b4dcc i965: Replace draw_aux_buffer_disabled with draw_aux_usage
Instead of keeping an array of booleans, we now hang onto an array of
isl_aux_usage enums.  This means that the thing we are passing from
brw_draw.c to surface state setup is the thing that surface state setup
actually needs instead of an input to compute what it needs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e52a9f18d6)
2018-01-26 19:53:01 +00:00
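A rough before/after sketch of the state change described above; MAX_DRAW_BUFFERS (main/config.h) and enum isl_aux_usage (isl) are real names, everything else is illustrative:

    /* before: only a flag recording that aux had been force-disabled */
    bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];

    /* after: record the aux usage that surface-state setup actually needs,
     * so it no longer has to re-derive it from the boolean */
    enum isl_aux_usage draw_aux_usage[MAX_DRAW_BUFFERS];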
Jason Ekstrand
85c18bb410 i965/surface_state: Drop brw_aux_surface_disabled
The only purpose of this function is to disable aux on texture surfaces
when the corresponding renderbuffer has aux disabled.  However, the act
of disabling aux on the renderbuffer will cause it to be resolved and
intel_miptree_texture_aux_usage will already check the resolved status
of a texture and return ISL_AUX_USAGE_NONE for it.  Even if we used CCS
for it, that wouldn't really be a problem because the CCS will be in the
pass-through state and so it would effectively be ignored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 468ea3cc45)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
d2e9fe8351 i965/miptree: Add an aux_disabled parameter to render_aux_usage
Only one of the callers of intel_miptree_render_aux_usage actually took
brw->draw_aux_buffer_disabled into account.  This was causing us to
ignore draw_aux_buffer_disabled for the intel_miptree_prepare_render.
This isn't a problem because the draw_aux_buffer_disabled entry was set
during texture preparation and we already did the resolve at that time.
However, this also meant that the aux_usage we were passing to
brw_cache_flush_for_render and brw_render_cache_add_bo was wrong so our
automatic cache flushing around aux_usage changes wasn't happening.
This was causing GPU hangs in Oxenfree.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d38ec24f53)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
a6f4d96a1a i965/miptree: Take an aux_usage in prepare/finish_render
Both callers of intel_miptree_prepare/finish_render have to call
intel_miptree_render_aux_usage anyway for other reasons.  They may as
well pass the result in instead of us calling it again.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dfe0217905)
2018-01-26 19:53:01 +00:00
Greg V
658e9e442c meson: fix BSD build
CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c38c60a63c)
2018-01-26 19:53:01 +00:00
Marek Olšák
48510dccc4 radeonsi: don't ignore pitch for imported textures
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 022c5b22fe)
2018-01-26 19:53:00 +00:00
Topi Pohjolainen
f6f43e6a4c i965: Don't try to disable render aux buffers for compute
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104546
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit ec4bb693a0)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
90b00bf766 anv/cmd_buffer: Move gen7 index buffer state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4064fe59e7)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
19b3e2b781 anv/cmd_buffer: Move num_workgroups to compute state
While we're here, make it an anv_address.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 38ec78049f)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
81a740b941 anv/cmd_buffer: Move dynamic state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 95ff232294)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
f310f42ed3 anv/cmd_buffer: Use a temporary variable for dynamic state
We were already doing this for some packets to keep the lines shorter.
We may as well just do it for all of them.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 24caee8975)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
8c93db854c anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
Vertex buffers are entirely a graphics pipeline thing.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8bd5ec5b86)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
76e7324b79 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e85aaec148)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
66d17b545f anv: Separate compute and graphics descriptor sets
The Vulkan spec says:

    "pipelineBindPoint is a VkPipelineBindPoint indicating whether the
    descriptors will be used by graphics pipelines or compute pipelines.
    There is a separate set of bind points for each of graphics and
    compute, so binding one does not disturb the other."

Up until now, we've been ignoring the pipeline bind point and had just
one bind point for everything.  This commit separates things out into
separate bind points.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102897
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 97f96610c8)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
144a300204 anv/cmd_buffer: Use anv_descriptor_for_binding for samplers
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 31b2144c83)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
2dec9ce687 anv/cmd_buffer: Add a helper for binding descriptor sets
This lets us unify some code between push descriptors and regular
descriptors.  It doesn't do much for us yet but it will.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9e1ca16f8)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
bde35c09de anv/cmd_buffer: Refactor ensure_push_descriptor_set
It's now a function which returns the push descriptor set.  Since we set
the error on the command buffer, returning the error is a little
redundant.  Returning the descriptor set (or NULL on error) is more
convenient.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 90cceaa9dd)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
064fbf7180 anv: Remove semicolons from vk_error[f] definitions
With the semicolons, they can't be used in a function argument without
throwing syntax errors.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d5592e2fda)
2018-01-26 19:53:00 +00:00
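A standalone C example of the problem fixed above; report_error() stands in for anv's real error helper, only the macro shape matters:

    #include <stdio.h>

    static int report_error(int err) { fprintf(stderr, "err %d\n", err); return err; }

    #define vk_error_bad(err)  report_error(err);   /* note the trailing ';' */
    #define vk_error_good(err) report_error(err)

    static void consume(int v) { (void)v; }

    int main(void)
    {
       consume(vk_error_good(-1));   /* fine */
       /* consume(vk_error_bad(-1)); would expand to consume(report_error(-1););
        * which is a syntax error -- hence dropping the semicolons. */
       return 0;
    }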
Jason Ekstrand
cb5abcd715 anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute
Initially, these just contain the pipeline in a base struct.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9af5379228)
2018-01-26 19:53:00 +00:00
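A simplified sketch of the shape this series builds toward ("the pipeline in a base struct", one copy per bind point); the real anv structs wrap this in anv_cmd_graphics_state/anv_cmd_compute_state and carry far more state:

    struct anv_pipeline;                         /* forward-declared for the sketch */

    struct anv_cmd_pipeline_state {
       struct anv_pipeline *pipeline;            /* the shared base member */
       /* descriptor sets, push constants, ... migrate here in later patches */
    };

    struct anv_cmd_state {                       /* simplified */
       struct anv_cmd_pipeline_state gfx;        /* graphics bind point */
       struct anv_cmd_pipeline_state compute;    /* compute bind point */
       /* ... remaining command-buffer state ... */
    };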
Jason Ekstrand
f0a1c2c69e anv/cmd_buffer: Use some pre-existing pipeline temporaries
There are several places where we'd already saved the pipeline off to a
temporary variable but, due to an artifact of history, weren't actually
using that temporary everywhere.  No functional change.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ddc2d28548)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
3c85e8c8e3 anv/cmd_buffer: Rework anv_cmd_state_reset
This splits anv_cmd_state_reset into separate init and finish functions.
This lets us share init code with cmd_buffer_create.  This potentially
fixes subtle bugs where we may have missed some bit of state that needs
to get initialized on command buffer creation.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd3feea745)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
2ecc2f85fe anv/cmd_buffer: Get rid of the meta query workaround
Meta has been gone for a long time.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d6c9a89d13)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
f4f0838d31 anv/cmd_state: Drop the scratch_size field
This is a legacy left-over from the mechanism we used to use to handle
scratch.  The new (and better) mechanism doesn't use this.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bc0a21e348)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
44b15816bb anv/pipeline: Don't assert on more than 32 samplers
This prevents an assert when running one unreleased Vulkan game.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b69ba3817)
2018-01-26 19:53:00 +00:00
Marc Dietrich
4d0b43117d meson: fix some defines misspelled errors in meson.build
Defines
- HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
- HAVE_FUNC_ATTRIBUTE_VISIBILITY
were misspelled.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 911ca587f8)
2018-01-26 19:53:00 +00:00
Emil Velikov
99a48002a2 Update version to 18.0.0-rc2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 18:07:37 +00:00
Emil Velikov
e91e68d6a8 Update version to 18.0.0-rc1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 16:39:33 +00:00
Emil Velikov
a9db8ac935 automake: small cleanup after the meson.build inclusion
Namely, extend the EXTRA_DIST list instead of re-assigning it, and bring
back a file dropped by mistake.

Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 14:25:34 +00:00
Emil Velikov
41e48eac87 automake: anv: ship anv_extensions_gen.py in the tarball
Fixes: dd088d4bec ("anv/extensions: Generate a header file with
extension tables")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 14:25:34 +00:00
Emil Velikov
90002ba41e automake: vc5: remove non-applicable v3dx_simulator.h
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 14:25:32 +00:00
93 changed files with 1381 additions and 731 deletions

View File

@@ -396,9 +396,39 @@ matrix:
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="macOS make"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make check"
- DRI_LOADERS="--with-platforms=x11 --disable-egl"
os: osx
before_install:
- |
if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext
# Set PATH for homebrew pip3 installs
PATH="$HOME/Library/Python/3.6/bin:${PATH}"
# Set PKG_CONFIG_PATH for keg-only expat
PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"
# Set PATH for keg-only gettext
PATH="/usr/local/opt/gettext/bin:${PATH}"
# Install xquartz for prereqs ...
XQUARTZ_VERSION="2.7.11"
wget -nv https://dl.bintray.com/xquartz/downloads/XQuartz-${XQUARTZ_VERSION}.dmg
hdiutil attach XQuartz-${XQUARTZ_VERSION}.dmg
sudo installer -pkg /Volumes/XQuartz-${XQUARTZ_VERSION}/XQuartz.pkg -target /
hdiutil detach /Volumes/XQuartz-${XQUARTZ_VERSION}
# ... and set paths
PATH="/opt/X11/bin:${PATH}"
PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"
ACLOCAL="aclocal -I /opt/X11/share/aclocal -I /usr/local/share/aclocal"
fi
install:
- pip install --user mako
- pip2 install --user mako
# Install the latest meson from pip, since the version in the ubuntu repos is
# often quite old.
@@ -419,62 +449,64 @@ install:
# Install dependencies where we require specific versions (or where
# disallowed by Travis CI's package whitelisting).
- wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
- tar -jxvf $XORGMACROS_VERSION.tar.bz2
- (cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- |
if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then
wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
tar -jxvf $XORGMACROS_VERSION.tar.bz2
(cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
- tar -jxvf $GLPROTO_VERSION.tar.bz2
- (cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
tar -jxvf $GLPROTO_VERSION.tar.bz2
(cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
- tar -jxvf $DRI2PROTO_VERSION.tar.bz2
- (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
tar -jxvf $DRI2PROTO_VERSION.tar.bz2
(cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
- tar -jxvf $XCBPROTO_VERSION.tar.bz2
- (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
tar -jxvf $XCBPROTO_VERSION.tar.bz2
(cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
- tar -jxvf $LIBXCB_VERSION.tar.bz2
- (cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
tar -jxvf $LIBXCB_VERSION.tar.bz2
(cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
- tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2
- (cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2
(cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
- tar -jxvf $LIBDRM_VERSION.tar.bz2
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
tar -jxvf $LIBDRM_VERSION.tar.bz2
(cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
- wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
- tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
- (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
(cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
- tar -jxvf $LIBVDPAU_VERSION.tar.bz2
- (cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget http://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
tar -jxvf $LIBVDPAU_VERSION.tar.bz2
(cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
- tar -jxvf $LIBVA_VERSION.tar.bz2
- (cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
wget http://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
tar -jxvf $LIBVA_VERSION.tar.bz2
(cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
- wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
- tar -axvf $LIBWAYLAND_VERSION.tar.xz
- (cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
tar -axvf $LIBWAYLAND_VERSION.tar.xz
(cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
- wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
- tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
- (cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
(cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Meson requires ninja >= 1.6, but trusty has 1.3.x
- wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip;
- unzip ninja-linux.zip
- mv ninja $HOME/prefix/bin/
# Meson requires ninja >= 1.6, but trusty has 1.3.x
wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip
unzip ninja-linux.zip
mv ninja $HOME/prefix/bin/
# Generate the header since one is missing on the Travis instance
- mkdir -p linux
- printf "%s\n" \
# Generate this header since one is missing on the Travis instance
mkdir -p linux
printf "%s\n" \
"#ifndef _LINUX_MEMFD_H" \
"#define _LINUX_MEMFD_H" \
"" \
@@ -485,6 +517,7 @@ install:
"#define MFD_ALLOW_SEALING 0x0002U" \
"" \
"#endif /* _LINUX_MEMFD_H */" > linux/memfd.h
fi
script:
- if test "x$BUILD" = xmake; then

View File

@@ -1 +1 @@
17.4.0-devel
18.0.0-rc4

bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,6 @@
# fixes: The following commits were applied without the "cherry-picked from" tag
50265cd9ee4caffee853700bdcd75b92eedc0e7b automake: anv: ship anv_extensions_gen.py in the tarball
ac4437b20b87c7285b89466f05b51518ae616873 automake: small cleanup after the meson.build inclusion
# stable: The KHX extension is disabled all together in the stable branches.
bee9270853c34aa8e4b3d19a125608ee67c87b86 radv: Don't expose VK_KHX_multiview on android.

View File

@@ -685,6 +685,19 @@ AC_LINK_IFELSE(
LDFLAGS=$save_LDFLAGS
AM_CONDITIONAL(HAVE_LD_DYNAMIC_LIST, test "$have_ld_dynamic_list" = "yes")
dnl
dnl OSX linker does not support build-id
dnl
case "$host_os" in
darwin*)
LD_BUILD_ID=""
;;
*)
LD_BUILD_ID="-Wl,--build-id=sha1"
;;
esac
AC_SUBST([LD_BUILD_ID])
dnl
dnl compatibility symlinks
dnl
@@ -1270,10 +1283,10 @@ AC_ARG_ENABLE([xa],
[enable_xa=no])
AC_ARG_ENABLE([gbm],
[AS_HELP_STRING([--enable-gbm],
[enable gbm library @<:@default=yes except cygwin@:>@])],
[enable gbm library @<:@default=yes except cygwin and macOS@:>@])],
[enable_gbm="$enableval"],
[case "$host_os" in
cygwin*)
cygwin* | darwin*)
enable_gbm=no
;;
*)
@@ -1598,7 +1611,7 @@ fi
AC_ARG_ENABLE([driglx-direct],
[AS_HELP_STRING([--disable-driglx-direct],
[disable direct rendering in GLX and EGL for DRI \
@<:@default=auto@:>@])],
@<:@default=enabled@:>@])],
[driglx_direct="$enableval"],
[driglx_direct="yes"])
@@ -2780,6 +2793,18 @@ if test "x$enable_llvm" = xyes; then
fi
fi
fi
dnl The gallium-xlib GLX and gallium OSMesa targets directly embed the
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl the LLVM library propagated in the Libs.private of the respective .pc
dnl file which ensures complete dependency information when statically
dnl linking.
if test "x$enable_glx" == xgallium-xlib; then
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
fi
if test "x$enable_gallium_osmesa" = xyes; then
OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
fi
fi
AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)

View File

@@ -57,6 +57,10 @@ dri_drivers_path = get_option('dri-drivers-path')
if dri_drivers_path == ''
dri_drivers_path = join_paths(get_option('libdir'), 'dri')
endif
dri_search_path = get_option('dri-search-path')
if dri_search_path == ''
dri_search_path = join_paths(get_option('prefix'), dri_drivers_path)
endif
with_gles1 = get_option('gles1')
with_gles2 = get_option('gles2')
@@ -202,18 +206,20 @@ if with_dri_i915 or with_gallium_i915
dep_libdrm_intel = dependency('libdrm_intel', version : '>= 2.4.75')
endif
system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'dragonfly', 'linux'].contains(host_machine.system())
if host_machine.system() == 'darwin'
with_dri_platform = 'apple'
elif ['windows', 'cygwin'].contains(host_machine.system())
with_dri_platform = 'windows'
elif host_machine.system() == 'linux'
# FIXME: This should include BSD and possibly other systems
elif system_has_kms_drm
with_dri_platform = 'drm'
else
# FIXME: haiku doesn't use dri, and xlib doesn't use dri, probably should
# assert here that one of those cases has been met.
# FIXME: GNU (hurd) ends up here as well, but meson doesn't officially
# support Hurd at time of writing (2017/11)
# FIXME: illumos ends up here as well
with_dri_platform = 'none'
endif
@@ -225,7 +231,7 @@ with_platform_surfaceless = false
egl_native_platform = ''
_platforms = get_option('platforms')
if _platforms == 'auto'
if ['linux'].contains(host_machine.system())
if system_has_kms_drm
_platforms = 'x11,wayland,drm,surfaceless'
else
error('Unknown OS, no platforms enabled. Patches gladly accepted to fix this.')
@@ -272,10 +278,10 @@ endif
with_gbm = get_option('gbm')
if with_gbm == 'auto' and with_dri # TODO: or gallium
with_gbm = host_machine.system() == 'linux'
with_gbm = system_has_kms_drm
elif with_gbm == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('GBM only supports unix-like platforms')
if not system_has_kms_drm
error('GBM only supports DRM/KMS platforms')
endif
with_gbm = true
else
@@ -351,7 +357,7 @@ endif
with_dri2 = (with_dri or with_any_vk) and with_dri_platform == 'drm'
with_dri3 = get_option('dri3')
if with_dri3 == 'auto'
if host_machine.system() == 'linux' and with_dri2
if system_has_kms_drm and with_dri2
with_dri3 = true
else
with_dri3 = false
@@ -371,10 +377,12 @@ if with_dri or with_gallium
endif
endif
prog_pkgconfig = find_program('pkg-config')
dep_vdpau = []
_vdpau = get_option('gallium-vdpau')
if _vdpau == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_vdpau = false
elif not with_platform_x11
with_gallium_vdpau = false
@@ -386,8 +394,8 @@ if _vdpau == 'auto'
with_gallium_vdpau = dep_vdpau.found()
endif
elif _vdpau == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('VDPAU state tracker can only be build on unix-like OSes.')
if not system_has_kms_drm
error('VDPAU state tracker can only be build on DRM/KMS OSes.')
elif not with_platform_x11
error('VDPAU state tracker requires X11 support.')
with_gallium_vdpau = false
@@ -402,7 +410,7 @@ else
endif
if with_gallium_vdpau
dep_vdpau = declare_dependency(
compile_args : dep_vdpau.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['vdpau', '--cflags']).stdout().split()
)
endif
@@ -417,7 +425,7 @@ endif
dep_xvmc = []
_xvmc = get_option('gallium-xvmc')
if _xvmc == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_xvmc = false
elif not with_platform_x11
with_gallium_xvmc = false
@@ -428,8 +436,8 @@ if _xvmc == 'auto'
with_gallium_xvmc = dep_xvmc.found()
endif
elif _xvmc == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('XVMC state tracker can only be build on unix-like OSes.')
if not system_has_kms_drm
error('XVMC state tracker can only be build on DRM/KMS OSes.')
elif not with_platform_x11
error('XVMC state tracker requires X11 support.')
with_gallium_xvmc = false
@@ -443,7 +451,7 @@ else
endif
if with_gallium_xvmc
dep_xvmc = declare_dependency(
compile_args : dep_xvmc.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['xvmc', '--cflags']).stdout().split()
)
endif
@@ -455,7 +463,7 @@ endif
dep_omx = []
_omx = get_option('gallium-omx')
if _omx == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_omx = false
elif not with_platform_x11
with_gallium_omx = false
@@ -466,8 +474,8 @@ if _omx == 'auto'
with_gallium_omx = dep_omx.found()
endif
elif _omx == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('OMX state tracker can only be built on unix-like OSes.')
if not system_has_kms_drm
error('OMX state tracker can only be built on DRM/KMS OSes.')
elif not (with_platform_x11 or with_platform_drm)
error('OMX state tracker requires X11 or drm platform support.')
with_gallium_omx = false
@@ -506,14 +514,14 @@ if with_gallium_omx
endif
if with_gallium_omx
dep_omx = declare_dependency(
compile_args : dep_omx.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['libomxil-bellagio', '--cflags']).stdout().split()
)
endif
dep_va = []
_va = get_option('gallium-va')
if _va == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_va = false
elif not with_platform_x11
with_gallium_va = false
@@ -524,8 +532,8 @@ if _va == 'auto'
with_gallium_va = dep_va.found()
endif
elif _va == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('VA state tracker can only be built on unix-like OSes.')
if not system_has_kms_drm
error('VA state tracker can only be built on DRM/KMS OSes.')
elif not (with_platform_x11 or with_platform_drm)
error('VA state tracker requires X11 or drm or wayland platform support.')
with_gallium_va = false
@@ -539,7 +547,7 @@ else
endif
if with_gallium_va
dep_va = declare_dependency(
compile_args : dep_va.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['libva', '--cflags']).stdout().split()
)
endif
@@ -550,7 +558,7 @@ endif
_xa = get_option('gallium-xa')
if _xa == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_xa = false
elif not (with_gallium_nouveau or with_gallium_freedreno or with_gallium_i915
or with_gallium_svga)
@@ -559,8 +567,8 @@ if _xa == 'auto'
with_gallium_xa = true
endif
elif _xa == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('XA state tracker can only be built on unix-like OSes.')
if not system_has_kms_drm
error('XA state tracker can only be built on DRM/KMS OSes.')
elif not (with_gallium_nouveau or with_gallium_freedreno or with_gallium_i915
or with_gallium_svga)
error('XA state tracker requires at least one of the following gallium drivers: nouveau, freedreno, i915, svga.')
@@ -692,14 +700,14 @@ if cc.compiles('struct __attribute__((packed)) foo { int bar; };',
endif
if cc.compiles('int *foo(void) __attribute__((returns_nonnull));',
name : '__attribute__((returns_nonnull))')
pre_args += '-DHAVE_FUNC_ATTRIBUTE_NONNULL'
pre_args += '-DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL'
endif
if cc.compiles('''int foo_def(void) __attribute__((visibility("default")));
int foo_hid(void) __attribute__((visibility("hidden")));
int foo_int(void) __attribute__((visibility("internal")));
int foo_pro(void) __attribute__((visibility("protected")));''',
name : '__attribute__((visibility(...)))')
pre_args += '-DHAVE_FUNC_ATTRIBUTE_VISBILITY'
pre_args += '-DHAVE_FUNC_ATTRIBUTE_VISIBILITY'
endif
if cc.compiles('int foo(void) { return 0; } int bar(void) __attribute__((alias("foo")));',
name : '__attribute__((alias(...)))')
@@ -772,7 +780,7 @@ foreach a : ['-Werror=pointer-arith', '-Werror=vla']
endforeach
if host_machine.cpu_family().startswith('x86')
pre_args += '-DHAVE_SSE41'
pre_args += '-DUSE_SSE41'
with_sse41 = true
sse41_args = ['-msse4.1']
@@ -820,23 +828,23 @@ with_asm_arch = ''
if with_asm
# TODO: SPARC and PPC
if host_machine.cpu_family() == 'x86'
if ['linux', 'bsd'].contains(host_machine.system()) # FIXME: hurd?
if system_has_kms_drm
with_asm_arch = 'x86'
pre_args += ['-DUSE_X86_ASM', '-DUSE_MMX_ASM', '-DUSE_3DNOW_ASM',
'-DUSE_SSE_ASM']
endif
elif host_machine.cpu_family() == 'x86_64'
if host_machine.system() == 'linux'
if system_has_kms_drm
with_asm_arch = 'x86_64'
pre_args += ['-DUSE_X86_64_ASM']
endif
elif host_machine.cpu_family() == 'arm'
if host_machine.system() == 'linux'
if system_has_kms_drm
with_asm_arch = 'arm'
pre_args += ['-DUSE_ARM_ASM']
endif
elif host_machine.cpu_family() == 'aarch64'
if host_machine.system() == 'linux'
if system_has_kms_drm
with_asm_arch = 'aarch64'
pre_args += ['-DUSE_AARCH64_ASM']
endif
@@ -1002,15 +1010,23 @@ if with_gallium_opencl
# TODO: optional modules
endif
if with_amd_vk
_llvm_version = '>= 4.0.0'
elif with_gallium_opencl or with_gallium_swr or with_gallium_r600 or with_gallium_radeonsi
_llvm_version = '>= 3.9.0'
else
_llvm_version = '>= 3.3.0'
endif
_llvm = get_option('llvm')
if _llvm == 'auto'
dep_llvm = dependency(
'llvm', version : '>= 3.9.0', modules : llvm_modules,
'llvm', version : _llvm_version, modules : llvm_modules,
required : with_amd_vk or with_gallium_radeonsi or with_gallium_swr or with_gallium_opencl,
)
with_llvm = dep_llvm.found()
elif _llvm == 'true'
dep_llvm = dependency('llvm', version : '>= 3.9.0', modules : llvm_modules)
dep_llvm = dependency('llvm', version : _llvm_version, modules : llvm_modules)
with_llvm = true
else
dep_llvm = []
@@ -1018,11 +1034,15 @@ else
endif
if with_llvm
_llvm_version = dep_llvm.version().split('.')
# Development versions of LLVM have an 'svn' suffix, we don't want that for
# our version checks.
# Development versions of LLVM have an 'svn' or 'git' suffix, we don't want
# that for our version checks.
# svn suffixes are stripped by meson as of 0.43, and git suffixes are
# stripped as of 0.44, but we support older meson versions.
_llvm_patch = _llvm_version[2]
if _llvm_patch.endswith('svn')
_llvm_patch = _llvm_patch.split('s')[0]
elif _llvm_patch.contains('git')
_llvm_patch = _llvm_patch.split('g')[0]
endif
pre_args += [
'-DHAVE_LLVM=0x0@0@@1@@2@'.format(_llvm_version[0], _llvm_version[1], _llvm_patch),
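For readers unfamiliar with the packed HAVE_LLVM value, a small C sketch (purely illustrative; the real logic is the meson code above) mirrors why a trailing 'svn'/'git' suffix on the patch component has to be dropped before the version is packed:
/* Sketch only: strips a development suffix from the patch component and
 * packs an LLVM version the way the meson logic above does. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    const char *version = "5.0.1svn";           /* illustrative dev version */
    char buf[32];
    strncpy(buf, version, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    char *saveptr;
    int major = atoi(strtok_r(buf, ".", &saveptr));
    int minor = atoi(strtok_r(NULL, ".", &saveptr));
    int patch = atoi(strtok_r(NULL, ".", &saveptr)); /* atoi stops at 's'/'g' */
    printf("-DHAVE_LLVM=0x0%d%d%d\n", major, minor, patch);  /* 0x0501 */
    return 0;
}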
@@ -1211,8 +1231,10 @@ inc_include = include_directories('include')
gl_priv_reqs = [
'x11', 'xext', 'xdamage >= 1.1', 'xfixes', 'x11-xcb', 'xcb',
'xcb-glx >= 1.8.1', 'libdrm >= 2.4.75',
]
'xcb-glx >= 1.8.1']
if dep_libdrm.found()
gl_priv_reqs += 'libdrm >= 2.4.75'
endif
if dep_xxf86vm != [] and dep_xxf86vm.found()
gl_priv_reqs += 'xxf86vm'
endif

View File

@@ -41,7 +41,13 @@ option(
'dri-drivers-path',
type : 'string',
value : '',
description : 'Location of dri drivers. Default: $libdir/dri.'
description : 'Location to install dri drivers. Default: $libdir/dri.'
)
option(
'dri-search-path',
type : 'string',
value : '',
description : 'Locations to search for dri drivers, passed as colon separated list. Default: dri-drivers-path.'
)
option(
'gallium-drivers',

View File

@@ -4049,6 +4049,30 @@ static LLVMValueRef load_sample_pos(struct ac_nir_context *ctx)
return ac_build_gather_values(&ctx->ac, values, 2);
}
static LLVMValueRef load_sample_mask_in(struct ac_nir_context *ctx)
{
uint8_t log2_ps_iter_samples = ctx->nctx->shader_info->info.ps.force_persample ? ctx->nctx->options->key.fs.log2_num_samples : ctx->nctx->options->key.fs.log2_ps_iter_samples;
/* The bit pattern matches that used by fixed function fragment
* processing. */
static const uint16_t ps_iter_masks[] = {
0xffff, /* not used */
0x5555,
0x1111,
0x0101,
0x0001,
};
assert(log2_ps_iter_samples < ARRAY_SIZE(ps_iter_masks));
uint32_t ps_iter_mask = ps_iter_masks[log2_ps_iter_samples];
LLVMValueRef result, sample_id;
sample_id = unpack_param(&ctx->ac, ctx->abi->ancillary, 8, 4);
sample_id = LLVMBuildShl(ctx->ac.builder, LLVMConstInt(ctx->ac.i32, ps_iter_mask, false), sample_id, "");
result = LLVMBuildAnd(ctx->ac.builder, sample_id, ctx->abi->sample_coverage, "");
return result;
}
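A minimal standalone sketch (plain integers instead of LLVM IR; all values are illustrative) shows what the mask table, shift and AND above compute for one fragment shader invocation:
/* Sketch only: mirrors the ps_iter mask selection, the shift by the
 * hardware-provided sample id, and the AND with the rasterizer coverage. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
int main(void)
{
    static const uint16_t ps_iter_masks[] = {
        0xffff, /* not used */
        0x5555,
        0x1111,
        0x0101,
        0x0001,
    };
    unsigned log2_ps_iter_samples = 2; /* e.g. 4 iterated samples */
    unsigned sample_id = 1;            /* from the ancillary VGPR, bits 8..11 */
    uint32_t sample_coverage = 0x00ff; /* example coverage from the rasterizer */
    assert(log2_ps_iter_samples < sizeof(ps_iter_masks) / sizeof(ps_iter_masks[0]));
    uint32_t mask = (uint32_t)ps_iter_masks[log2_ps_iter_samples] << sample_id;
    uint32_t sample_mask_in = mask & sample_coverage;
    /* 0x1111 << 1 = 0x2222; AND 0x00ff keeps samples 1 and 5 -> 0x0022. */
    printf("gl_SampleMaskIn = 0x%04x\n", sample_mask_in);
    return 0;
}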
static LLVMValueRef visit_interp(struct nir_to_llvm_context *ctx,
const nir_intrinsic_instr *instr)
{
@@ -4353,7 +4377,10 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
result = load_sample_pos(ctx);
break;
case nir_intrinsic_load_sample_mask_in:
result = ctx->abi->sample_coverage;
if (ctx->nctx)
result = load_sample_mask_in(ctx);
else
result = ctx->abi->sample_coverage;
break;
case nir_intrinsic_load_frag_coord: {
LLVMValueRef values[4] = {
@@ -4532,8 +4559,14 @@ static LLVMValueRef radv_load_ssbo(struct ac_shader_abi *abi,
static LLVMValueRef radv_load_ubo(struct ac_shader_abi *abi, LLVMValueRef buffer_ptr)
{
struct nir_to_llvm_context *ctx = nir_to_llvm_context_from_abi(abi);
LLVMValueRef result;
return LLVMBuildLoad(ctx->builder, buffer_ptr, "");
LLVMSetMetadata(buffer_ptr, ctx->ac.uniform_md_kind, ctx->ac.empty_md);
result = LLVMBuildLoad(ctx->builder, buffer_ptr, "");
LLVMSetMetadata(result, ctx->ac.invariant_load_md_kind, ctx->ac.empty_md);
return result;
}
static LLVMValueRef radv_get_sampler_desc(struct ac_shader_abi *abi,

View File

@@ -60,6 +60,8 @@ struct ac_tcs_variant_key {
struct ac_fs_variant_key {
uint32_t col_format;
uint8_t log2_ps_iter_samples;
uint8_t log2_num_samples;
uint32_t is_int8;
uint32_t is_int10;
uint32_t multisample : 1;

View File

@@ -438,7 +438,7 @@ radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer)
flags = RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
si_cs_emit_cache_flush(cmd_buffer->cs, false,
si_cs_emit_cache_flush(cmd_buffer->cs,
cmd_buffer->device->physical_device->rad_info.chip_class,
NULL, 0,
radv_cmd_buffer_uses_mec(cmd_buffer),
@@ -990,7 +990,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
{
struct radv_shader_variant *ps;
uint64_t va;
unsigned spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
struct radv_blend_state *blend = &pipeline->graphics.blend;
assert (pipeline->shaders[MESA_SHADER_FRAGMENT]);
@@ -1012,13 +1011,10 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR,
ps->config.spi_ps_input_addr);
if (ps->info.info.ps.force_persample)
spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL,
S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, pipeline->graphics.spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_028710_SPI_SHADER_Z_FORMAT,
pipeline->graphics.shader_z_format);

View File

@@ -1771,7 +1771,6 @@ radv_get_preamble_cs(struct radv_queue *queue,
if (i == 0) {
si_cs_emit_cache_flush(cs,
false,
queue->device->physical_device->rad_info.chip_class,
NULL, 0,
queue->queue_family_index == RING_COMPUTE &&
@@ -1783,7 +1782,6 @@ radv_get_preamble_cs(struct radv_queue *queue,
RADV_CMD_FLAG_INV_GLOBAL_L2);
} else if (i == 1) {
si_cs_emit_cache_flush(cs,
false,
queue->device->physical_device->rad_info.chip_class,
NULL, 0,
queue->queue_family_index == RING_COMPUTE &&
@@ -1996,6 +1994,32 @@ VkResult radv_alloc_sem_info(struct radv_winsys_sem_info *sem_info,
return ret;
}
/* Signals fence as soon as all the work currently put on queue is done. */
static VkResult radv_signal_fence(struct radv_queue *queue,
struct radv_fence *fence)
{
int ret;
VkResult result;
struct radv_winsys_sem_info sem_info;
result = radv_alloc_sem_info(&sem_info, 0, NULL, 0, NULL,
radv_fence_to_handle(fence));
if (result != VK_SUCCESS)
return result;
ret = queue->device->ws->cs_submit(queue->hw_ctx, queue->queue_idx,
&queue->device->empty_cs[queue->queue_family_index],
1, NULL, NULL, &sem_info,
false, fence->fence);
radv_free_sem_info(&sem_info);
/* TODO: find a better error */
if (ret)
return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
return VK_SUCCESS;
}
VkResult radv_QueueSubmit(
VkQueue _queue,
uint32_t submitCount,
@@ -2124,18 +2148,7 @@ VkResult radv_QueueSubmit(
if (fence) {
if (!fence_emitted) {
struct radv_winsys_sem_info sem_info;
result = radv_alloc_sem_info(&sem_info, 0, NULL, 0, NULL,
_fence);
if (result != VK_SUCCESS)
return result;
ret = queue->device->ws->cs_submit(ctx, queue->queue_idx,
&queue->device->empty_cs[queue->queue_family_index],
1, NULL, NULL, &sem_info,
false, base_fence);
radv_free_sem_info(&sem_info);
radv_signal_fence(queue, fence);
}
fence->submitted = true;
}
@@ -2656,8 +2669,11 @@ radv_sparse_image_opaque_bind_memory(struct radv_device *device,
}
if (fence && !fence_emitted) {
fence->signalled = true;
if (fence) {
if (!fence_emitted) {
radv_signal_fence(queue, fence);
}
fence->submitted = true;
}
return VK_SUCCESS;

View File

@@ -81,7 +81,7 @@ EXTENSIONS = [
Extension('VK_KHR_wayland_surface', 6, 'VK_USE_PLATFORM_WAYLAND_KHR'),
Extension('VK_KHR_xcb_surface', 6, 'VK_USE_PLATFORM_XCB_KHR'),
Extension('VK_KHR_xlib_surface', 6, 'VK_USE_PLATFORM_XLIB_KHR'),
Extension('VK_KHX_multiview', 1, True),
Extension('VK_KHX_multiview', 1, False),
Extension('VK_EXT_debug_report', 9, True),
Extension('VK_EXT_discard_rectangles', 1, True),
Extension('VK_EXT_external_memory_dma_buf', 1, True),

View File

@@ -116,7 +116,8 @@ radv_init_surface(struct radv_device *device,
pCreateInfo->mipLevels <= 1 &&
device->physical_device->rad_info.chip_class >= VI &&
((pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT) ||
/* for some reason TC compat with 4/8 samples breaks some cts tests - disable for now */
(pCreateInfo->samples < 4 && pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)) ||
(device->physical_device->rad_info.chip_class >= GFX9 &&
pCreateInfo->format == VK_FORMAT_D16_UNORM)))
surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;
@@ -1068,10 +1069,55 @@ radv_image_view_init(struct radv_image_view *iview,
}
if (iview->vk_format != image->vk_format) {
iview->extent.width = round_up_u32(iview->extent.width * vk_format_get_blockwidth(iview->vk_format),
vk_format_get_blockwidth(image->vk_format));
iview->extent.height = round_up_u32(iview->extent.height * vk_format_get_blockheight(iview->vk_format),
vk_format_get_blockheight(image->vk_format));
unsigned view_bw = vk_format_get_blockwidth(iview->vk_format);
unsigned view_bh = vk_format_get_blockheight(iview->vk_format);
unsigned img_bw = vk_format_get_blockwidth(image->vk_format);
unsigned img_bh = vk_format_get_blockheight(image->vk_format);
iview->extent.width = round_up_u32(iview->extent.width * view_bw, img_bw);
iview->extent.height = round_up_u32(iview->extent.height * view_bh, img_bh);
/* Comment ported from amdvlk -
* If we have the following image:
* Uncompressed pixels Compressed block sizes (4x4)
* mip0: 22 x 22 6 x 6
* mip1: 11 x 11 3 x 3
* mip2: 5 x 5 2 x 2
* mip3: 2 x 2 1 x 1
* mip4: 1 x 1 1 x 1
*
* On GFX9 the descriptor is always programmed with the WIDTH and HEIGHT of the base level and the HW is
* calculating the degradation of the block sizes down the mip-chain as follows (straight-up
* divide-by-two integer math):
* mip0: 6x6
* mip1: 3x3
* mip2: 1x1
* mip3: 1x1
*
* This means that mip2 will be missing texels.
*
* Fix this by calculating the base mip's width and height, then convert that, and round it
* back up to get the level 0 size.
* Clamp the converted size between the original values, and next power of two, which
* means we don't oversize the image.
*/
if (device->physical_device->rad_info.chip_class >= GFX9 &&
vk_format_is_compressed(image->vk_format) &&
!vk_format_is_compressed(iview->vk_format)) {
unsigned rounded_img_w = util_next_power_of_two(iview->extent.width);
unsigned rounded_img_h = util_next_power_of_two(iview->extent.height);
unsigned lvl_width = radv_minify(image->info.width , range->baseMipLevel);
unsigned lvl_height = radv_minify(image->info.height, range->baseMipLevel);
lvl_width = round_up_u32(lvl_width * view_bw, img_bw);
lvl_height = round_up_u32(lvl_height * view_bh, img_bh);
lvl_width <<= range->baseMipLevel;
lvl_height <<= range->baseMipLevel;
iview->extent.width = CLAMP(lvl_width, iview->extent.width, rounded_img_w);
iview->extent.height = CLAMP(lvl_height, iview->extent.height, rounded_img_h);
}
}
iview->base_layer = range->baseArrayLayer;
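The comment above can be checked with plain arithmetic. A standalone sketch, assuming a 1x1-block uncompressed view of a 4x4-block-compressed 22x22 image with baseMipLevel 2, and with small stand-ins for radv_minify(), round_up_u32() and util_next_power_of_two():
/* Sketch only: reproduces the level-0 extent fix-up using the numbers from
 * the comment above. */
#include <stdio.h>
static unsigned minify(unsigned v, unsigned levels)   { unsigned r = v >> levels; return r ? r : 1; }
static unsigned round_up(unsigned a, unsigned b)      { return (a + b - 1) / b; }
static unsigned next_pow2(unsigned v)                 { unsigned p = 1; while (p < v) p <<= 1; return p; }
static unsigned clamp(unsigned x, unsigned lo, unsigned hi) { return x < lo ? lo : (x > hi ? hi : x); }
int main(void)
{
    unsigned img_w = 22, view_bw = 1, img_bw = 4, base_mip = 2;
    unsigned naive   = round_up(img_w * view_bw, img_bw);   /* 6 blocks at level 0 */
    unsigned rounded = next_pow2(naive);                    /* 8 */
    unsigned lvl = round_up(minify(img_w, base_mip) * view_bw, img_bw); /* ceil(5/4) = 2 */
    lvl <<= base_mip;                                       /* back to a level-0 size: 8 */
    unsigned width = clamp(lvl, naive, rounded);            /* 8 */
    /* With a descriptor width of 8, the HW derives mip2 as 8 >> 2 = 2 blocks,
     * matching the 2x2 blocks the 5x5-texel mip actually needs. */
    printf("descriptor level-0 width = %u\n", width);
    return 0;
}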

View File

@@ -26,6 +26,7 @@
#include "radv_meta.h"
#include "radv_private.h"
#include "vk_format.h"
#include "nir/nir_builder.h"
#include "sid.h"
@@ -50,7 +51,7 @@ build_nir_fs(void)
}
static VkResult
create_pass(struct radv_device *device)
create_pass(struct radv_device *device, VkFormat vk_format, VkRenderPass *pass)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -59,7 +60,7 @@ create_pass(struct radv_device *device)
int i;
for (i = 0; i < 2; i++) {
attachments[i].format = VK_FORMAT_UNDEFINED;
attachments[i].format = vk_format;
attachments[i].samples = 1;
attachments[i].loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachments[i].storeOp = VK_ATTACHMENT_STORE_OP_STORE;
@@ -99,14 +100,16 @@ create_pass(struct radv_device *device)
.dependencyCount = 0,
},
alloc,
&device->meta_state.resolve.pass);
pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
VkShaderModule vs_module_h,
VkPipeline *pipeline,
VkRenderPass pass)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -129,12 +132,14 @@ create_pipeline(struct radv_device *device,
.pPushConstantRanges = NULL,
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve.p_layout);
if (result != VK_SUCCESS)
goto cleanup;
if (!device->meta_state.resolve.p_layout) {
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve.p_layout);
if (result != VK_SUCCESS)
goto cleanup;
}
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
@@ -212,15 +217,14 @@ create_pipeline(struct radv_device *device,
},
},
.layout = device->meta_state.resolve.p_layout,
.renderPass = device->meta_state.resolve.pass,
.renderPass = pass,
.subpass = 0,
},
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.custom_blend_mode = V_028808_CB_RESOLVE,
},
&device->meta_state.alloc,
&device->meta_state.resolve.pipeline);
&device->meta_state.alloc, pipeline);
if (result != VK_SUCCESS)
goto cleanup;
@@ -236,19 +240,37 @@ radv_device_finish_meta_resolve_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass, &state->alloc);
for (uint32_t j = 0; j < NUM_META_FS_KEYS; j++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass[j], &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline[j], &state->alloc);
}
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->resolve.p_layout, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline, &state->alloc);
}
static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,
VK_FORMAT_R16G16B16A16_SINT,
VK_FORMAT_R32_SFLOAT,
VK_FORMAT_R32G32_SFLOAT,
VK_FORMAT_R32G32B32A32_SFLOAT
};
VkResult
radv_device_init_meta_resolve_state(struct radv_device *device)
{
VkResult res = VK_SUCCESS;
struct radv_meta_state *state = &device->meta_state;
struct radv_shader_module vs_module = { .nir = radv_meta_build_nir_vs_generate_vertices() };
if (!vs_module.nir) {
/* XXX: Need more accurate error */
@@ -256,14 +278,19 @@ radv_device_init_meta_resolve_state(struct radv_device *device)
goto fail;
}
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
for (uint32_t i = 0; i < ARRAY_SIZE(pipeline_formats); ++i) {
VkFormat format = pipeline_formats[i];
unsigned fs_key = radv_format_meta_fs_key(format);
res = create_pass(device, format, &state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h,
&state->resolve.pipeline[fs_key], state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
}
goto cleanup;
@@ -278,16 +305,18 @@ cleanup:
static void
emit_resolve(struct radv_cmd_buffer *cmd_buffer,
VkFormat vk_format,
const VkOffset2D *dest_offset,
const VkExtent2D *resolve_extent)
{
struct radv_device *device = cmd_buffer->device;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
unsigned fs_key = radv_format_meta_fs_key(vk_format);
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.resolve.pipeline);
device->meta_state.resolve.pipeline[fs_key]);
radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkViewport) {
.x = dest_offset->x,
@@ -323,6 +352,13 @@ static void radv_pick_resolve_method_images(struct radv_image *src_image,
uint32_t queue_mask = radv_image_queue_family_mask(dest_image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
if (src_image->vk_format == VK_FORMAT_R16G16_UNORM ||
src_image->vk_format == VK_FORMAT_R16G16_SNORM)
*method = RESOLVE_COMPUTE;
else if (vk_format_is_int(src_image->vk_format))
*method = RESOLVE_COMPUTE;
if (radv_layout_dcc_compressed(dest_image, dest_image_layout, queue_mask)) {
*method = RESOLVE_FRAGMENT;
} else if (dest_image->surface.micro_tile_mode != src_image->surface.micro_tile_mode) {
@@ -413,6 +449,7 @@ void radv_CmdResolveImage(
if (dest_image->surface.dcc_size) {
radv_initialize_dcc(cmd_buffer, dest_image, 0xffffffff);
}
unsigned fs_key = radv_format_meta_fs_key(dest_image->vk_format);
for (uint32_t r = 0; r < region_count; ++r) {
const VkImageResolve *region = &regions[r];
@@ -512,7 +549,7 @@ void radv_CmdResolveImage(
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.resolve.pass,
.renderPass = device->meta_state.resolve.pass[fs_key],
.framebuffer = fb_h,
.renderArea = {
.offset = {
@@ -530,6 +567,7 @@ void radv_CmdResolveImage(
VK_SUBPASS_CONTENTS_INLINE);
emit_resolve(cmd_buffer,
dest_iview.vk_format,
&(VkOffset2D) {
.x = dstOffset.x,
.y = dstOffset.y,
@@ -624,6 +662,7 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
emit_resolve(cmd_buffer,
dst_img->vk_format,
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });
}

View File

@@ -798,6 +798,18 @@ radv_pipeline_init_raster_state(struct radv_pipeline *pipeline,
}
static uint8_t radv_pipeline_get_ps_iter_samples(const VkPipelineMultisampleStateCreateInfo *vkms)
{
uint32_t num_samples = vkms->rasterizationSamples;
uint32_t ps_iter_samples = 1;
if (vkms->sampleShadingEnable) {
ps_iter_samples = ceil(vkms->minSampleShading * num_samples);
ps_iter_samples = util_next_power_of_two(ps_iter_samples);
}
return ps_iter_samples;
}
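A small standalone check of that helper's arithmetic (illustrative sample counts and minSampleShading values only):
/* Sketch only: ps_iter_samples is ceil(minSampleShading * rasterizationSamples)
 * rounded up to a power of two, as in the helper above. */
#include <math.h>
#include <stdio.h>
static unsigned next_pow2(unsigned v) { unsigned p = 1; while (p < v) p <<= 1; return p; }
static unsigned log2u(unsigned v)     { unsigned l = 0; while (v > 1) { v >>= 1; l++; } return l; }
int main(void)
{
    unsigned num_samples = 8;
    float shading[] = { 0.25f, 0.3f, 1.0f };
    for (int i = 0; i < 3; i++) {
        unsigned ps_iter = next_pow2((unsigned)ceilf(shading[i] * num_samples));
        /* 0.25 -> 2; 0.3 -> ceil(2.4) = 3 -> 4; 1.0 -> 8 */
        printf("minSampleShading %.2f -> ps_iter_samples %u (log2 = %u)\n",
               shading[i], ps_iter, log2u(ps_iter));
    }
    return 0;
}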
static void
radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
const VkGraphicsPipelineCreateInfo *pCreateInfo)
@@ -813,9 +825,9 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
else
ms->num_samples = 1;
if (vkms && vkms->sampleShadingEnable) {
ps_iter_samples = ceil(vkms->minSampleShading * ms->num_samples);
} else if (pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.force_persample) {
if (vkms)
ps_iter_samples = radv_pipeline_get_ps_iter_samples(vkms);
if (vkms && !vkms->sampleShadingEnable && pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.force_persample) {
ps_iter_samples = ms->num_samples;
}
@@ -838,7 +850,7 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
if (ms->num_samples > 1) {
unsigned log_samples = util_logbase2(ms->num_samples);
unsigned log_ps_iter_samples = util_logbase2(util_next_power_of_two(ps_iter_samples));
unsigned log_ps_iter_samples = util_logbase2(ps_iter_samples);
ms->pa_sc_mode_cntl_0 |= S_028A48_MSAA_ENABLE(1);
ms->pa_sc_line_cntl |= S_028BDC_EXPAND_LINE_WIDTH(1); /* CM_R_028BDC_PA_SC_LINE_CNTL */
ms->db_eqaa |= S_028804_MAX_ANCHOR_SAMPLES(log_samples) |
@@ -849,6 +861,8 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
S_028BE0_MAX_SAMPLE_DIST(radv_cayman_get_maxdist(log_samples)) |
S_028BE0_MSAA_EXPOSED_SAMPLES(log_samples); /* CM_R_028BE0_PA_SC_AA_CONFIG */
ms->pa_sc_mode_cntl_1 |= S_028A4C_PS_ITER_SAMPLE(ps_iter_samples > 1);
if (ps_iter_samples > 1)
pipeline->graphics.spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
}
const struct VkPipelineRasterizationStateRasterizationOrderAMD *raster_order =
@@ -1745,8 +1759,13 @@ radv_generate_graphics_pipeline_key(struct radv_pipeline *pipeline,
if (pCreateInfo->pMultisampleState &&
pCreateInfo->pMultisampleState->rasterizationSamples > 1)
pCreateInfo->pMultisampleState->rasterizationSamples > 1) {
uint32_t num_samples = pCreateInfo->pMultisampleState->rasterizationSamples;
uint32_t ps_iter_samples = radv_pipeline_get_ps_iter_samples(pCreateInfo->pMultisampleState);
key.multisample = true;
key.log2_num_samples = util_logbase2(num_samples);
key.log2_ps_iter_samples = util_logbase2(ps_iter_samples);
}
key.col_format = pipeline->graphics.blend.spi_shader_col_format;
if (pipeline->device->physical_device->rad_info.chip_class < VI)
@@ -1784,6 +1803,8 @@ radv_fill_shader_keys(struct ac_shader_variant_key *keys,
keys[MESA_SHADER_FRAGMENT].fs.col_format = key->col_format;
keys[MESA_SHADER_FRAGMENT].fs.is_int8 = key->is_int8;
keys[MESA_SHADER_FRAGMENT].fs.is_int10 = key->is_int10;
keys[MESA_SHADER_FRAGMENT].fs.log2_ps_iter_samples = key->log2_ps_iter_samples;
keys[MESA_SHADER_FRAGMENT].fs.log2_num_samples = key->log2_num_samples;
}
static void
@@ -2430,6 +2451,7 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
radv_generate_graphics_pipeline_key(pipeline, pCreateInfo, has_view_index),
pStages);
pipeline->graphics.spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
radv_pipeline_init_depth_stencil_state(pipeline, pCreateInfo, extra);
radv_pipeline_init_raster_state(pipeline, pCreateInfo);
radv_pipeline_init_multisample_state(pipeline, pCreateInfo);

View File

@@ -331,6 +331,8 @@ struct radv_pipeline_key {
uint32_t col_format;
uint32_t is_int8;
uint32_t is_int10;
uint8_t log2_ps_iter_samples;
uint8_t log2_num_samples;
uint32_t multisample : 1;
uint32_t has_multiview_view_index : 1;
};
@@ -478,8 +480,8 @@ struct radv_meta_state {
struct {
VkPipelineLayout p_layout;
VkPipeline pipeline;
VkRenderPass pass;
VkPipeline pipeline[NUM_META_FS_KEYS];
VkRenderPass pass[NUM_META_FS_KEYS];
} resolve;
struct {
@@ -1019,7 +1021,6 @@ void si_emit_wait_fence(struct radeon_winsys_cs *cs,
uint64_t va, uint32_t ref,
uint32_t mask);
void si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
bool predicated,
enum chip_class chip_class,
uint32_t *fence_ptr, uint64_t va,
bool is_mec,
@@ -1237,6 +1238,7 @@ struct radv_pipeline {
struct radv_binning_state bin;
uint32_t db_shader_control;
uint32_t shader_z_format;
uint32_t spi_baryc_cntl;
unsigned prim;
unsigned gs_out;
uint32_t vgt_gs_mode;

View File

@@ -917,7 +917,6 @@ si_emit_acquire_mem(struct radeon_winsys_cs *cs,
void
si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
bool predicated,
enum chip_class chip_class,
uint32_t *flush_cnt,
uint64_t flush_va,
@@ -948,7 +947,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
/* Necessary for DCC */
if (chip_class >= VI) {
si_cs_emit_write_event_eop(cs,
predicated,
false,
chip_class,
is_mec,
V_028A90_FLUSH_AND_INV_CB_DATA_TS,
@@ -962,12 +961,12 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_FLUSH_AND_INV_CB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_CB_META) | EVENT_INDEX(0));
}
if (flush_bits & RADV_CMD_FLAG_FLUSH_AND_INV_DB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_DB_META) | EVENT_INDEX(0));
}
@@ -980,7 +979,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_CS_PARTIAL_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_CS_PARTIAL_FLUSH) | EVENT_INDEX(4));
}
@@ -1037,14 +1036,14 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
assert(flush_cnt);
uint32_t old_fence = (*flush_cnt)++;
si_cs_emit_write_event_eop(cs, predicated, chip_class, false, cb_db_event, tc_flags, 1,
si_cs_emit_write_event_eop(cs, false, chip_class, false, cb_db_event, tc_flags, 1,
flush_va, old_fence, *flush_cnt);
si_emit_wait_fence(cs, predicated, flush_va, *flush_cnt, 0xffffffff);
si_emit_wait_fence(cs, false, flush_va, *flush_cnt, 0xffffffff);
}
/* VGT state sync */
if (flush_bits & RADV_CMD_FLAG_VGT_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | EVENT_INDEX(0));
}
@@ -1057,13 +1056,13 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
RADV_CMD_FLAG_INV_GLOBAL_L2 |
RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2))) &&
!is_mec) {
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, predicated));
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
radeon_emit(cs, 0);
}
if ((flush_bits & RADV_CMD_FLAG_INV_GLOBAL_L2) ||
(chip_class <= CIK && (flush_bits & RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2))) {
si_emit_acquire_mem(cs, is_mec, predicated, chip_class >= GFX9,
si_emit_acquire_mem(cs, is_mec, false, chip_class >= GFX9,
cp_coher_cntl |
S_0085F0_TC_ACTION_ENA(1) |
S_0085F0_TCL1_ACTION_ENA(1) |
@@ -1077,7 +1076,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
*
* WB doesn't work without NC.
*/
si_emit_acquire_mem(cs, is_mec, predicated,
si_emit_acquire_mem(cs, is_mec, false,
chip_class >= GFX9,
cp_coher_cntl |
S_0301F0_TC_WB_ACTION_ENA(1) |
@@ -1086,7 +1085,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_INV_VMEM_L1) {
si_emit_acquire_mem(cs, is_mec,
predicated, chip_class >= GFX9,
false, chip_class >= GFX9,
cp_coher_cntl |
S_0085F0_TCL1_ACTION_ENA(1));
cp_coher_cntl = 0;
@@ -1097,7 +1096,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
* Therefore, it should be last. Done in PFP.
*/
if (cp_coher_cntl)
si_emit_acquire_mem(cs, is_mec, predicated, chip_class >= GFX9, cp_coher_cntl);
si_emit_acquire_mem(cs, is_mec, false, chip_class >= GFX9, cp_coher_cntl);
}
void
@@ -1127,7 +1126,6 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
ptr = &cmd_buffer->gfx9_fence_idx;
}
si_cs_emit_cache_flush(cmd_buffer->cs,
cmd_buffer->state.predicating,
cmd_buffer->device->physical_device->rad_info.chip_class,
ptr, va,
radv_cmd_buffer_uses_mec(cmd_buffer),

View File

@@ -64,7 +64,6 @@ EXTRA_DIST = \
glsl/meson.build \
nir/meson.build \
meson.build
MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)

View File

@@ -585,6 +585,7 @@ union packed_tex_data {
unsigned component:2;
unsigned has_texture_deref:1;
unsigned has_sampler_deref:1;
unsigned unused:10; /* Mark unused for valgrind. */
} u;
};

View File

@@ -160,7 +160,7 @@ libegl = shared_library(
c_args : [
c_vis_args,
c_args_for_egl,
'-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_driver_dir),
'-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_search_path),
'-D_EGL_BUILT_IN_DRIVER_DRI2',
'-D_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_@0@'.format(egl_native_platform.to_upper()),
],

View File

@@ -857,7 +857,7 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
LLVMValueRef img_stride_vec,
LLVMValueRef data_ptr,
LLVMValueRef mipoffsets,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef colors_out[4])
{
@@ -1004,7 +1004,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
LLVMValueRef img_stride_vec,
LLVMValueRef data_ptr,
LLVMValueRef mipoffsets,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef colors_out[4])
{
@@ -1106,7 +1106,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
struct lp_build_if_state edge_if;
LLVMTypeRef int1t;
LLVMValueRef new_faces[4], new_xcoords[4][2], new_ycoords[4][2];
LLVMValueRef coord, have_edge, have_corner;
LLVMValueRef coord0, coord1, have_edge, have_corner;
LLVMValueRef fall_off_ym_notxm, fall_off_ym_notxp, fall_off_x, fall_off_y;
LLVMValueRef fall_off_yp_notxm, fall_off_yp_notxp;
LLVMValueRef x0, x1, y0, y1, y0_clamped, y1_clamped;
@@ -1130,20 +1130,20 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
* other values might be bogus in the end too).
* So kill off the NaNs here.
*/
coords[0] = lp_build_max_ext(coord_bld, coords[0], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coords[1] = lp_build_max_ext(coord_bld, coords[1], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coord = lp_build_mul(coord_bld, coords[0], flt_width_vec);
coord0 = lp_build_max_ext(coord_bld, coords[0], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coord0 = lp_build_mul(coord_bld, coord0, flt_width_vec);
/* instead of clamp, build mask if overflowed */
coord = lp_build_sub(coord_bld, coord, half);
coord0 = lp_build_sub(coord_bld, coord0, half);
/* convert to int, compute lerp weight */
/* not ideal with AVX (and no AVX2) */
lp_build_ifloor_fract(coord_bld, coord, &x0, &s_fpart);
lp_build_ifloor_fract(coord_bld, coord0, &x0, &s_fpart);
x1 = lp_build_add(ivec_bld, x0, ivec_bld->one);
coord = lp_build_mul(coord_bld, coords[1], flt_height_vec);
coord = lp_build_sub(coord_bld, coord, half);
lp_build_ifloor_fract(coord_bld, coord, &y0, &t_fpart);
coord1 = lp_build_max_ext(coord_bld, coords[1], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coord1 = lp_build_mul(coord_bld, coord1, flt_height_vec);
coord1 = lp_build_sub(coord_bld, coord1, half);
lp_build_ifloor_fract(coord_bld, coord1, &y0, &t_fpart);
y1 = lp_build_add(ivec_bld, y0, ivec_bld->one);
fall_off[0] = lp_build_cmp(ivec_bld, PIPE_FUNC_LESS, x0, ivec_bld->zero);
@@ -1747,7 +1747,7 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
unsigned img_filter,
unsigned mip_filter,
boolean is_gather,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef ilevel0,
LLVMValueRef ilevel1,
@@ -1820,6 +1820,7 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
PIPE_FUNC_GREATER,
lod_fpart, bld->lodf_bld.zero);
need_lerp = lp_build_any_true_range(&bld->lodi_bld, bld->num_lods, need_lerp);
lp_build_name(need_lerp, "need_lerp");
}
lp_build_if(&if_ctx, bld->gallivm, need_lerp);
@@ -1888,7 +1889,7 @@ static void
lp_build_sample_mipmap_both(struct lp_build_sample_context *bld,
LLVMValueRef linear_mask,
unsigned mip_filter,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef ilevel0,
LLVMValueRef ilevel1,
@@ -1945,6 +1946,7 @@ lp_build_sample_mipmap_both(struct lp_build_sample_context *bld,
* should be able to merge the branches in this case.
*/
need_lerp = lp_build_any_true_range(&bld->lodi_bld, bld->num_lods, lod_positive);
lp_build_name(need_lerp, "need_lerp");
lp_build_if(&if_ctx, bld->gallivm, need_lerp);
{
@@ -2422,7 +2424,7 @@ static void
lp_build_sample_general(struct lp_build_sample_context *bld,
unsigned sampler_unit,
boolean is_gather,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef lod_positive,
LLVMValueRef lod_fpart,
@@ -2483,7 +2485,8 @@ lp_build_sample_general(struct lp_build_sample_context *bld,
struct lp_build_if_state if_ctx;
lod_positive = LLVMBuildTrunc(builder, lod_positive,
LLVMInt1TypeInContext(bld->gallivm->context), "");
LLVMInt1TypeInContext(bld->gallivm->context),
"lod_pos");
lp_build_if(&if_ctx, bld->gallivm, lod_positive);
{
@@ -2519,6 +2522,7 @@ lp_build_sample_general(struct lp_build_sample_context *bld,
}
need_linear = lp_build_any_true_range(&bld->lodi_bld, bld->num_lods,
linear_mask);
lp_build_name(need_linear, "need_linear");
if (bld->num_lods != bld->coord_type.length) {
linear_mask = lp_build_unpack_broadcast_aos_scalars(bld->gallivm,

View File

@@ -33,6 +33,7 @@
#include "state_tracker/drm_driver.h"
#include "pipe/p_screen.h"
#include "util/u_format.h"
#include "util/u_inlines.h"
#include "util/u_memory.h"
@@ -73,7 +74,7 @@ renderonly_create_kms_dumb_buffer_for_resource(struct pipe_resource *rsc,
struct drm_mode_create_dumb create_dumb = {
.width = rsc->width0,
.height = rsc->height0,
.bpp = 32,
.bpp = util_format_get_blocksizebits(rsc->format),
};
struct drm_mode_destroy_dumb destroy_dumb = { };
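A tiny sketch (hypothetical formats and sizes, not the gallium helper itself) of why the dumb-buffer bpp should follow the resource format rather than being hardcoded to 32:
/* Sketch only: with a hardcoded bpp of 32, a 16-bit-per-pixel resource gets a
 * dumb buffer with twice the required pitch; deriving bpp from the format
 * (what util_format_get_blocksizebits() reports) sizes it correctly. */
#include <stdio.h>
int main(void)
{
    unsigned width = 1920, height = 1080;
    unsigned bpp_hardcoded = 32;
    unsigned bpp_rgb565    = 16;   /* hypothetical 16bpp scanout format */
    printf("hardcoded bpp:  %u bytes\n", width * height * bpp_hardcoded / 8);
    printf("format-derived: %u bytes\n", width * height * bpp_rgb565 / 8);
    return 0;
}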

View File

@@ -46,4 +46,4 @@ ir3_compiler_LDADD = \
$(GALLIUM_COMMON_LIB_DEPS) \
$(FREEDRENO_LIBS)
EXTRA_DIST = meson.build
EXTRA_DIST += meson.build

View File

@@ -30,4 +30,4 @@ noinst_LTLIBRARIES = libi915.la
libi915_la_SOURCES = $(C_SOURCES)
EXTRA_DIST = meson.build
EXTRA_DIST = TODO meson.build

View File

@@ -766,7 +766,7 @@ static void compute_emit_cs(struct r600_context *rctx,
} else {
uint32_t rat_mask;
rat_mask = ((1ULL << (((unsigned)rctx->cb_misc_state.nr_image_rats + rctx->cb_misc_state.nr_buffer_rats) * 4)) - 1);
rat_mask = evergreen_construct_rat_mask(rctx, &rctx->cb_misc_state, 0);
radeon_compute_set_context_reg(cs, R_028238_CB_TARGET_MASK,
rat_mask);
}

View File

@@ -1998,13 +1998,31 @@ static void evergreen_emit_polygon_offset(struct r600_context *rctx, struct r600
pa_su_poly_offset_db_fmt_cntl);
}
uint32_t evergreen_construct_rat_mask(struct r600_context *rctx, struct r600_cb_misc_state *a,
unsigned nr_cbufs)
{
unsigned base_mask = 0;
unsigned dirty_mask = a->image_rat_enabled_mask;
while (dirty_mask) {
unsigned idx = u_bit_scan(&dirty_mask);
base_mask |= (0xf << (idx * 4));
}
unsigned offset = util_last_bit(a->image_rat_enabled_mask);
dirty_mask = a->buffer_rat_enabled_mask;
while (dirty_mask) {
unsigned idx = u_bit_scan(&dirty_mask);
base_mask |= (0xf << (idx + offset) * 4);
}
return base_mask << (nr_cbufs * 4);
}
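The mask construction can be exercised in isolation. A standalone sketch with local stand-ins for u_bit_scan() and util_last_bit(), and made-up enable masks:
/* Sketch only: builds the RAT write mask the same way as
 * evergreen_construct_rat_mask() with illustrative inputs. */
#include <stdio.h>
static unsigned bit_scan(unsigned *mask)   /* like u_bit_scan() */
{
    unsigned i = __builtin_ctz(*mask);
    *mask &= *mask - 1;
    return i;
}
static unsigned last_bit(unsigned v)       /* like util_last_bit() */
{
    return v ? 32 - __builtin_clz(v) : 0;
}
int main(void)
{
    unsigned image_rat_enabled_mask  = 0x5; /* images 0 and 2 bound */
    unsigned buffer_rat_enabled_mask = 0x1; /* buffer 0 bound */
    unsigned nr_cbufs = 1;
    unsigned base_mask = 0;
    unsigned dirty = image_rat_enabled_mask;
    while (dirty)
        base_mask |= 0xf << (bit_scan(&dirty) * 4);
    unsigned offset = last_bit(image_rat_enabled_mask);
    dirty = buffer_rat_enabled_mask;
    while (dirty)
        base_mask |= 0xf << ((bit_scan(&dirty) + offset) * 4);
    /* images occupy nibbles 0 and 2 -> 0x0f0f; buffer 0 lands after the last
     * image slot (offset 3) -> 0xf000; the result then shifts past nr_cbufs
     * colorbuffers, giving 0x000ff0f0 here. */
    printf("CB_TARGET_MASK rat bits = 0x%08x\n", base_mask << (nr_cbufs * 4));
    return 0;
}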
static void evergreen_emit_cb_misc_state(struct r600_context *rctx, struct r600_atom *atom)
{
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
struct r600_cb_misc_state *a = (struct r600_cb_misc_state*)atom;
unsigned fb_colormask = (1ULL << ((unsigned)a->nr_cbufs * 4)) - 1;
unsigned ps_colormask = (1ULL << ((unsigned)a->nr_ps_color_outputs * 4)) - 1;
unsigned rat_colormask = ((1ULL << ((unsigned)(a->nr_image_rats + a->nr_buffer_rats) * 4)) - 1) << (a->nr_cbufs * 4);
unsigned rat_colormask = evergreen_construct_rat_mask(rctx, a, a->nr_cbufs);
radeon_set_context_reg_seq(cs, R_028238_CB_TARGET_MASK, 2);
radeon_emit(cs, (a->blend_colormask & fb_colormask) | rat_colormask); /* R_028238_CB_TARGET_MASK */
/* This must match the used export instructions exactly.
@@ -4032,8 +4050,9 @@ static void evergreen_set_shader_buffers(struct pipe_context *ctx,
if (old_mask != istate->enabled_mask)
r600_mark_atom_dirty(rctx, &rctx->framebuffer.atom);
if (rctx->cb_misc_state.nr_buffer_rats != util_bitcount(istate->enabled_mask)) {
rctx->cb_misc_state.nr_buffer_rats = util_bitcount(istate->enabled_mask);
/* construct the target mask */
if (rctx->cb_misc_state.buffer_rat_enabled_mask != istate->enabled_mask) {
rctx->cb_misc_state.buffer_rat_enabled_mask = istate->enabled_mask;
r600_mark_atom_dirty(rctx, &rctx->cb_misc_state.atom);
}
@@ -4208,8 +4227,8 @@ static void evergreen_set_shader_images(struct pipe_context *ctx,
if (old_mask != istate->enabled_mask)
r600_mark_atom_dirty(rctx, &rctx->framebuffer.atom);
if (rctx->cb_misc_state.nr_image_rats != util_bitcount(istate->enabled_mask)) {
rctx->cb_misc_state.nr_image_rats = util_bitcount(istate->enabled_mask);
if (rctx->cb_misc_state.image_rat_enabled_mask != istate->enabled_mask) {
rctx->cb_misc_state.image_rat_enabled_mask = istate->enabled_mask;
r600_mark_atom_dirty(rctx, &rctx->cb_misc_state.atom);
}

View File

@@ -152,8 +152,8 @@ struct r600_cb_misc_state {
unsigned blend_colormask; /* 8*4 bits for 8 RGBA colorbuffers */
unsigned nr_cbufs;
unsigned nr_ps_color_outputs;
unsigned nr_image_rats;
unsigned nr_buffer_rats;
unsigned image_rat_enabled_mask;
unsigned buffer_rat_enabled_mask;
bool multiwrite;
bool dual_src_blend;
};
@@ -700,6 +700,9 @@ void evergreen_init_color_surface_rat(struct r600_context *rctx,
struct r600_surface *surf);
void evergreen_update_db_shader_control(struct r600_context * rctx);
bool evergreen_adjust_gprs(struct r600_context *rctx);
uint32_t evergreen_construct_rat_mask(struct r600_context *rctx, struct r600_cb_misc_state *a,
unsigned nr_cbufs);
/* r600_blit.c */
void r600_init_blit_functions(struct r600_context *rctx);
void r600_decompress_depth_textures(struct r600_context *rctx,

View File

@@ -665,6 +665,7 @@ public:
return false;
switch (hw_chip) {
case HW_CHIP_HEMLOCK:
case HW_CHIP_CYPRESS:
case HW_CHIP_JUNIPER:
return false;

View File

@@ -208,8 +208,25 @@ void bc_finalizer::finalize_if(region_node* r) {
r->push_front(if_jump);
r->push_back(if_pop);
/* the depart/repeat 1 is actually part of the "else" code.
* if it's a depart for an outer loop region it will want to
* insert a LOOP_BREAK or LOOP_CONTINUE in here, so we need
* to emit the else clause.
*/
bool has_else = n_if->next;
if (repdep1->is_depart()) {
depart_node *dep1 = static_cast<depart_node*>(repdep1);
if (dep1->target != r && dep1->target->is_loop())
has_else = true;
}
if (repdep1->is_repeat()) {
repeat_node *rep1 = static_cast<repeat_node*>(repdep1);
if (rep1->target != r && rep1->target->is_loop())
has_else = true;
}
if (has_else) {
cf_node *nelse = sh.create_cf(CF_OP_ELSE);
n_if->insert_after(nelse);

View File

@@ -298,11 +298,21 @@ static int r600_init_surface(struct si_screen *sscreen,
return r;
}
unsigned pitch = pitch_in_bytes_override / bpe;
if (sscreen->info.chip_class >= GFX9) {
assert(!pitch_in_bytes_override ||
pitch_in_bytes_override == surface->u.gfx9.surf_pitch * bpe);
if (pitch) {
surface->u.gfx9.surf_pitch = pitch;
surface->u.gfx9.surf_slice_size =
(uint64_t)pitch * surface->u.gfx9.surf_height * bpe;
}
surface->u.gfx9.surf_offset = offset;
} else {
if (pitch) {
surface->u.legacy.level[0].nblk_x = pitch;
surface->u.legacy.level[0].slice_size_dw =
((uint64_t)pitch * surface->u.legacy.level[0].nblk_y * bpe) / 4;
}
if (offset) {
for (i = 0; i < ARRAY_SIZE(surface->u.legacy.level); ++i)
surface->u.legacy.level[i].offset += offset;

View File

@@ -610,6 +610,11 @@ struct radeon_winsys {
int (*fence_export_sync_file)(struct radeon_winsys *ws,
struct pipe_fence_handle *fence);
/**
* Return a sync file FD that is already signalled.
*/
int (*export_signalled_sync_file)(struct radeon_winsys *ws);
/**
* Initialize surface
*

View File

@@ -77,7 +77,7 @@ libradeonsi = static_library(
],
c_args : [c_vis_args],
cpp_args : [cpp_vis_args],
dependencies : [dep_llvm, idep_nir_headers],
dependencies : [dep_llvm, dep_libdrm_radeon, idep_nir_headers],
)
driver_radeonsi = declare_dependency(

View File

@@ -356,6 +356,8 @@ static int si_fence_get_fd(struct pipe_screen *screen,
/* If we don't have FDs at this point, it means we don't have fences
* either. */
if (sdma_fd == -1 && gfx_fd == -1)
return ws->export_signalled_sync_file(ws);
if (sdma_fd == -1)
return gfx_fd;
if (gfx_fd == -1)

View File

@@ -1,4 +1,4 @@
# Copyright © 2017 Intel Corporation
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -149,7 +149,22 @@ files_swr_arch = files(
swr_context_files = files('swr_context.h')
swr_state_files = files('rasterizer/core/state.h')
swr_event_proto_files = files('rasterizer/archrast/events.proto')
swr_gen_backend_files = files('rasterizer/codegen/templates/gen_backend.cpp')
swr_gen_rasterizer_files = files('rasterizer/codegen/templates/gen_rasterizer.cpp')
swr_gen_header_init_files = files('rasterizer/codegen/templates/gen_header_init.hpp')
swr_gen_llvm_ir_macros_py = files('rasterizer/codegen/gen_llvm_ir_macros.py')
swr_gen_backends_py = files('rasterizer/codegen/gen_backends.py')
swr_gen_builder_depends = files(
'rasterizer/codegen/templates/gen_builder.hpp',
'rasterizer/codegen/gen_common.py'
)
subdir('rasterizer/jitter')
subdir('rasterizer/codegen')
subdir('rasterizer/core/backends')
swr_incs = include_directories(
'rasterizer/codegen', 'rasterizer/core', 'rasterizer/jitter',
@@ -178,7 +193,7 @@ if with_swr_arches.contains('avx')
swr_arch_defines += '-DHAVE_SWR_AVX'
swr_arch_libs += shared_library(
'swrAVX',
files_swr_common,
[files_swr_common, files_swr_arch],
cpp_args : [swr_cpp_args, swr_avx_args, '-DKNOB_ARCH=KNOB_ARCH_AVX'],
link_args : [ld_args_gc_sections],
include_directories : [swr_incs],
@@ -210,7 +225,7 @@ if with_swr_arches.contains('avx2')
swr_arch_defines += '-DHAVE_SWR_AVX2'
swr_arch_libs += shared_library(
'swrAVX2',
files_swr_common,
[files_swr_common, files_swr_arch],
cpp_args : [swr_cpp_args, swr_avx2_args, '-DKNOB_ARCH=KNOB_ARCH_AVX2'],
link_args : [ld_args_gc_sections],
include_directories : [swr_incs],
@@ -234,7 +249,7 @@ if with_swr_arches.contains('knl')
swr_arch_defines += '-DHAVE_SWR_KNL'
swr_arch_libs += shared_library(
'swrKNL',
files_swr_common,
[files_swr_common, files_swr_arch],
cpp_args : [
swr_cpp_args, swr_knl_args, '-DKNOB_ARCH=KNOB_ARCH_AVX512',
'-DKNOB_ARCH_KNIGHTS',
@@ -261,7 +276,7 @@ if with_swr_arches.contains('skx')
swr_arch_defines += '-DHAVE_SWR_SKX'
swr_arch_libs += shared_library(
'swrSKX',
files_swr_common,
[files_swr_common, files_swr_arch],
cpp_args : [swr_cpp_args, swr_skx_args, '-DKNOB_ARCH=KNOB_ARCH_AVX512'],
link_args : [ld_args_gc_sections],
include_directories : [swr_incs],

View File

@@ -1,4 +1,4 @@
# Copyright © 2017 Intel Corporation
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -40,40 +40,6 @@ gen_knobs_h = custom_target(
),
)
gen_builder_hpp = custom_target(
'gen_builder.hpp',
input : [
'gen_llvm_ir_macros.py',
join_paths(
dep_llvm.get_configtool_variable('includedir'), 'llvm', 'IR',
'IRBuilder.h'
)
],
output : 'gen_builder.hpp',
command : [
prog_python2, '@INPUT0@', '--input', '@INPUT1@', '--output', '@OUTPUT@',
'--gen_h', '--output-dir', meson.current_build_dir()
],
depend_files : files(
'templates/gen_builder.hpp',
'gen_common.py',
),
build_by_default : true,
)
gen_builder_x86_hpp = custom_target(
'gen_builder_x86.hpp',
input : 'gen_llvm_ir_macros.py',
output : 'gen_builder_x86.hpp',
command : [
prog_python2, '@INPUT0@', '--gen_x86_h', '--output', '@OUTPUT@',
'--output-dir', meson.current_build_dir()
],
depend_files : files(
'templates/gen_builder.hpp',
'gen_common.py',
),
)
# The generators above this are needed individually, while the below generators
# are all inputs to the same lib, so they don't need unique names.
@@ -114,45 +80,3 @@ foreach x : [['gen_ar_event.hpp', '--gen_event_hpp'],
)
endforeach
files_swr_common += custom_target(
'gen_backend_pixel',
input : 'gen_backends.py',
output : [
'gen_BackendPixelRate0.cpp', 'gen_BackendPixelRate1.cpp',
'gen_BackendPixelRate2.cpp', 'gen_BackendPixelRate3.cpp',
'gen_BackendPixelRate.hpp',
],
command : [
prog_python2, '@INPUT@',
'--outdir', meson.current_build_dir(),
'--dim', '5', '2', '3', '2', '2', '2',
'--numfiles', '4',
'--cpp', '--hpp',
],
depend_files : files(
'templates/gen_backend.cpp',
'templates/gen_header_init.hpp',
),
)
files_swr_common += custom_target(
'gen_backend_raster',
input : 'gen_backends.py',
output : [
'gen_rasterizer0.cpp', 'gen_rasterizer1.cpp',
'gen_rasterizer2.cpp', 'gen_rasterizer3.cpp',
'gen_rasterizer.hpp',
],
command : [
prog_python2, '@INPUT@',
'--outdir', meson.current_build_dir(),
'--rast',
'--dim', '5', '2', '2', '3', '5', '2',
'--numfiles', '4',
'--cpp', '--hpp',
],
depend_files : files(
'templates/gen_rasterizer.cpp',
'templates/gen_header_init.hpp',
),
)

View File

@@ -0,0 +1,57 @@
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
files_swr_common += custom_target(
'gen_backend_pixel',
input : swr_gen_backends_py,
output : [
'gen_BackendPixelRate0.cpp', 'gen_BackendPixelRate1.cpp',
'gen_BackendPixelRate2.cpp', 'gen_BackendPixelRate3.cpp',
'gen_BackendPixelRate.hpp',
],
command : [
prog_python2, '@INPUT@',
'--outdir', '@OUTDIR@',
'--dim', '5', '2', '3', '2', '2', '2',
'--numfiles', '4',
'--cpp', '--hpp',
],
depend_files : [ swr_gen_backend_files, swr_gen_header_init_files ],
)
files_swr_common += custom_target(
'gen_backend_raster',
input : swr_gen_backends_py,
output : [
'gen_rasterizer0.cpp', 'gen_rasterizer1.cpp',
'gen_rasterizer2.cpp', 'gen_rasterizer3.cpp',
'gen_rasterizer.hpp',
],
command : [
prog_python2, '@INPUT@',
'--outdir', '@OUTDIR@',
'--rast',
'--dim', '5', '2', '2', '3', '5', '2',
'--numfiles', '4',
'--cpp', '--hpp',
],
depend_files : [ swr_gen_rasterizer_files, swr_gen_header_init_files ],
)

@@ -249,9 +249,15 @@ DIType* JitManager::GetDebugType(Type* pTy)
switch (id)
{
case Type::VoidTyID: return builder.createUnspecifiedType("void"); break;
#if LLVM_VERSION_MAJOR >= 4
case Type::HalfTyID: return builder.createBasicType("float16", 16, dwarf::DW_ATE_float); break;
case Type::FloatTyID: return builder.createBasicType("float", 32, dwarf::DW_ATE_float); break;
case Type::DoubleTyID: return builder.createBasicType("double", 64, dwarf::DW_ATE_float); break;
#else
case Type::HalfTyID: return builder.createBasicType("float16", 16, 0, dwarf::DW_ATE_float); break;
case Type::FloatTyID: return builder.createBasicType("float", 32, 0, dwarf::DW_ATE_float); break;
case Type::DoubleTyID: return builder.createBasicType("double", 64, 0, dwarf::DW_ATE_float); break;
#endif
case Type::IntegerTyID: return GetDebugIntegerType(pTy); break;
case Type::StructTyID: return GetDebugStructType(pTy); break;
case Type::ArrayTyID: return GetDebugArrayType(pTy); break;
@@ -288,11 +294,19 @@ DIType* JitManager::GetDebugIntegerType(Type* pTy)
IntegerType* pIntTy = cast<IntegerType>(pTy);
switch (pIntTy->getBitWidth())
{
#if LLVM_VERSION_MAJOR >= 4
case 1: return builder.createBasicType("int1", 1, dwarf::DW_ATE_unsigned); break;
case 8: return builder.createBasicType("int8", 8, dwarf::DW_ATE_signed); break;
case 16: return builder.createBasicType("int16", 16, dwarf::DW_ATE_signed); break;
case 32: return builder.createBasicType("int", 32, dwarf::DW_ATE_signed); break;
case 64: return builder.createBasicType("int64", 64, dwarf::DW_ATE_signed); break;
#else
case 1: return builder.createBasicType("int1", 1, 0, dwarf::DW_ATE_unsigned); break;
case 8: return builder.createBasicType("int8", 8, 0, dwarf::DW_ATE_signed); break;
case 16: return builder.createBasicType("int16", 16, 0, dwarf::DW_ATE_signed); break;
case 32: return builder.createBasicType("int", 32, 0, dwarf::DW_ATE_signed); break;
case 64: return builder.createBasicType("int64", 64, 0, dwarf::DW_ATE_signed); break;
#endif
default: SWR_ASSERT(false, "Unimplemented integer bit width");
}
return nullptr;

@@ -0,0 +1,50 @@
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
gen_builder_hpp = custom_target(
'gen_builder.hpp',
input : [
swr_gen_llvm_ir_macros_py,
join_paths(
dep_llvm.get_configtool_variable('includedir'), 'llvm', 'IR',
'IRBuilder.h'
)
],
output : 'gen_builder.hpp',
command : [
prog_python2, '@INPUT0@', '--input', '@INPUT1@', '--output', '@OUTPUT@',
'--gen_h', '--output-dir', '@OUTDIR@'
],
depend_files : swr_gen_builder_depends,
build_by_default : true,
)
gen_builder_x86_hpp = custom_target(
'gen_builder_x86.hpp',
input : '../codegen/gen_llvm_ir_macros.py',
output : 'gen_builder_x86.hpp',
command : [
prog_python2, '@INPUT0@', '--gen_x86_h', '--output', '@OUTPUT@',
'--output-dir', '@OUTDIR@'
],
depend_files : swr_gen_builder_depends,
)

@@ -29,7 +29,6 @@ VC5_PER_VERSION_SOURCES = \
v3dx_context.h \
v3dx_format_table.c \
v3dx_simulator.c \
v3dx_simulator.h \
vc5_draw.c \
vc5_emit.c \
vc5_rcl.c \

@@ -76,7 +76,6 @@ virgl_tgsi_transform_instruction(struct tgsi_transform_context *ctx,
for (unsigned i = 0; i < inst->Instruction.NumSrcRegs; i++) {
if (inst->Src[i].Register.File == TGSI_FILE_CONSTANT &&
inst->Src[i].Register.Dimension &&
!inst->Src[i].Register.Indirect &&
inst->Src[i].Dimension.Index == 0)
inst->Src[i].Register.Dimension = 0;
}

@@ -68,6 +68,7 @@ libgallium_nine = shared_library(
driver_swrast, driver_r300, driver_r600, driver_radeonsi, driver_nouveau,
driver_i915, driver_svga,
],
name_prefix : '',
version : '.'.join(nine_version),
install : true,
install_dir : d3d_drivers_path,

@@ -114,6 +114,27 @@ static int amdgpu_fence_export_sync_file(struct radeon_winsys *rws,
return fd;
}
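/* Create a temporary DRM syncobj that starts out signalled, export it as a
 * sync_file fd for the caller, then destroy the syncobj. Returns the fd, or
 * -1 on any failure. */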
static int amdgpu_export_signalled_sync_file(struct radeon_winsys *rws)
{
struct amdgpu_winsys *ws = amdgpu_winsys(rws);
uint32_t syncobj;
int fd = -1;
int r = amdgpu_cs_create_syncobj2(ws->dev, DRM_SYNCOBJ_CREATE_SIGNALED,
&syncobj);
if (r) {
return -1;
}
r = amdgpu_cs_syncobj_export_sync_file(ws->dev, syncobj, &fd);
if (r) {
fd = -1;
}
amdgpu_cs_destroy_syncobj(ws->dev, syncobj);
return fd;
}
static void amdgpu_fence_submitted(struct pipe_fence_handle *fence,
uint64_t seq_no,
uint64_t *user_fence_cpu_address)
@@ -649,11 +670,10 @@ static bool amdgpu_ib_new_buffer(struct amdgpu_winsys *ws, struct amdgpu_ib *ib,
ws->info.gart_page_size,
RADEON_DOMAIN_GTT,
RADEON_FLAG_NO_INTERPROCESS_SHARING |
RADEON_FLAG_READ_ONLY |
(ring_type == RING_GFX ||
ring_type == RING_COMPUTE ||
ring_type == RING_DMA ?
RADEON_FLAG_GTT_WC : 0));
RADEON_FLAG_READ_ONLY | RADEON_FLAG_GTT_WC : 0));
if (!pb)
return false;
@@ -1560,4 +1580,5 @@ void amdgpu_cs_init_functions(struct amdgpu_winsys *ws)
ws->base.fence_reference = amdgpu_fence_reference;
ws->base.fence_import_sync_file = amdgpu_fence_import_sync_file;
ws->base.fence_export_sync_file = amdgpu_fence_export_sync_file;
ws->base.export_signalled_sync_file = amdgpu_export_signalled_sync_file;
}
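
For reference, the same signalled-fence export can be expressed with the generic libdrm syncobj helpers. The sketch below is illustrative only (the device fd, the function name and the error handling are assumptions, not part of the change above), but it mirrors what amdgpu_export_signalled_sync_file does with the amdgpu-specific wrappers.

#include <xf86drm.h>

/* Illustrative sketch: create a syncobj that is already signalled, export it
 * as a sync_file fd, and drop the temporary syncobj. 'drm_fd' is assumed to
 * be an open DRM device fd. */
static int export_signalled_sync_file(int drm_fd)
{
   uint32_t syncobj;
   int fd = -1;

   if (drmSyncobjCreate(drm_fd, DRM_SYNCOBJ_CREATE_SIGNALED, &syncobj))
      return -1;

   if (drmSyncobjExportSyncFile(drm_fd, syncobj, &fd))
      fd = -1;

   drmSyncobjDestroy(drm_fd, syncobj);
   return fd;
}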

@@ -215,6 +215,9 @@ static void surf_drm_to_winsys(struct radeon_drm_winsys *ws,
}
set_micro_tile_mode(surf_ws, &ws->info);
surf_ws->is_displayable = surf_ws->is_linear ||
surf_ws->micro_tile_mode == RADEON_MICRO_MODE_DISPLAY ||
surf_ws->micro_tile_mode == RADEON_MICRO_MODE_ROTATED;
}
static int radeon_winsys_surface_init(struct radeon_winsys *rws,

@@ -38,7 +38,7 @@ incs_gbm = [
if with_dri2
files_gbm += files('backends/dri/gbm_dri.c', 'backends/dri/gbm_driint.h')
deps_gbm += dep_libdrm # TODO: pthread-stubs
args_gbm += '-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_driver_dir)
args_gbm += '-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_search_path)
endif
if with_platform_wayland
deps_gbm += dep_wayland_server

@@ -41,7 +41,6 @@
#include "main/glheader.h"
#include "glapi.h"
#include "glapitable.h"
#include "main/dispatch.h"
#include "apple_glx.h"
#include "apple_xgl_api.h"
@@ -61,12 +60,11 @@ static void _apple_glapi_create_table(void) {
assert(__applegl_api);
memcpy(__applegl_api, __ogl_framework_api, sizeof(struct _glapi_table));
SET_ReadPixels(__applegl_api, __applegl_glReadPixels);
SET_CopyPixels(__applegl_api, __applegl_glCopyPixels);
SET_CopyColorTable(__applegl_api, __applegl_glCopyColorTable);
SET_DrawBuffer(__applegl_api, __applegl_glDrawBuffer);
SET_DrawBuffers(__applegl_api, __applegl_glDrawBuffers);
SET_Viewport(__applegl_api, __applegl_glViewport);
_glapi_table_patch(__applegl_api, "ReadPixels", __applegl_glReadPixels);
_glapi_table_patch(__applegl_api, "CopyPixels", __applegl_glCopyPixels);
_glapi_table_patch(__applegl_api, "CopyColorTable", __applegl_glCopyColorTable);
_glapi_table_patch(__applegl_api, "DrawBuffers", __applegl_glDrawBuffer);
_glapi_table_patch(__applegl_api, "Viewport", __applegl_glViewport);
}
void apple_glapi_set_dispatch(void) {

@@ -32,6 +32,7 @@
#include <stdlib.h>
#include <assert.h>
#include <GL/gl.h>
#include <util/debug.h>
/* <rdar://problem/6953344> */
#define glTexImage1D glTexImage1D_OSX

@@ -43,6 +43,7 @@
#ifdef GLX_USE_APPLEGL
#include "apple/apple_glx_context.h"
#include "apple/apple_glx.h"
#include "util/debug.h"
#else
#include <sys/time.h>
#ifdef XF86VIDMODE

@@ -18,7 +18,9 @@
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
subdir('windows')
if with_dri_platform == 'windows'
subdir('windows')
endif
files_libglx = files(
'clientattrib.c',
@@ -111,7 +113,6 @@ elif with_dri_platform == 'windows'
extra_ld_args_libgl = '-Wl,--disable-stdcall-fixup'
endif
dri_driver_dir = join_paths(get_option('prefix'), dri_drivers_path)
if not with_glvnd
gl_lib_name = 'GL'
gl_lib_version = '1.2.0'
@@ -128,7 +129,8 @@ else
endif
gl_lib_cargs = [
'-D_REENTRANT', '-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_driver_dir),
'-D_REENTRANT',
'-DDEFAULT_DRIVER_DIR="@0@"'.format(dri_search_path),
]
if dep_xxf86vm != [] and dep_xxf86vm.found()

@@ -75,6 +75,18 @@ indirect_create_context_attribs(struct glx_screen *base,
return indirect_create_context(base, config_base, shareList, 0);
}
#ifdef GLX_USE_APPLEGL
#warning Indirect GLX tests are not built
extern "C" struct glx_context *
applegl_create_context(struct glx_screen *base,
struct glx_config *config_base,
struct glx_context *shareList,
int renderType)
{
return indirect_create_context(base, config_base, shareList, renderType);
}
#endif
/* This is necessary so that we don't have to link with glxcurrent.c
* which would require us to link with X libraries and what not.
*/

@@ -705,6 +705,8 @@ void __indirect_glFramebufferTextureLayer(void) { }
}
/*@}*/
#ifndef GLX_USE_APPLEGL
class IndirectAPI : public ::testing::Test {
public:
virtual void SetUp();
@@ -1518,3 +1520,5 @@ TEST_F(IndirectAPI, EXT_texture_array)
{
EXPECT_EQ((_glapi_proc) __indirect_glFramebufferTextureLayer, table[_glapi_get_proc_offset("glFramebufferTextureLayerEXT")]);
}
#endif

@@ -65,6 +65,7 @@ CLEANFILES += \
EXTRA_DIST += \
$(top_srcdir)/include/vulkan/vk_icd.h \
vulkan/anv_entrypoints_gen.py \
vulkan/anv_extensions_gen.py \
vulkan/anv_extensions.py \
vulkan/anv_icd.py \
vulkan/TODO

@@ -2096,15 +2096,6 @@ fs_visitor::assign_constant_locations()
if (subgroup_id_index >= 0)
max_push_components--; /* Save a slot for the thread ID */
/* FIXME: We currently have some GPU hangs that happen apparently when using
* push constants. Since we have no solution for such hangs yet, just
* go ahead and use pull constants for now.
*/
if (devinfo->gen == 10 && compiler->supports_pull_constants) {
compiler->shader_perf_log(log_data, "Disabling push constants.");
max_push_components = 0;
}
/* We push small arrays, but no bigger than 16 floats. This is big enough
* for a vec4 but hopefully not large enough to push out other stuff. We
* should probably use a better heuristic at some point.
@@ -3640,13 +3631,18 @@ fs_visitor::lower_integer_multiplication()
regions_overlap(inst->dst, inst->size_written,
inst->src[1], inst->size_read(1))) {
needs_mov = true;
low.nr = alloc.allocate(regs_written(inst));
low.offset = low.offset % REG_SIZE;
/* Get a new VGRF but keep the same stride as inst->dst */
low = fs_reg(VGRF, alloc.allocate(regs_written(inst)),
inst->dst.type);
low.stride = inst->dst.stride;
low.offset = inst->dst.offset % REG_SIZE;
}
fs_reg high = inst->dst;
high.nr = alloc.allocate(regs_written(inst));
high.offset = high.offset % REG_SIZE;
/* Get a new VGRF but keep the same stride as inst->dst */
fs_reg high(VGRF, alloc.allocate(regs_written(inst)),
inst->dst.type);
high.stride = inst->dst.stride;
high.offset = inst->dst.offset % REG_SIZE;
if (devinfo->gen >= 7) {
if (inst->src[1].file == IMM) {

@@ -113,47 +113,43 @@ anv_dynamic_state_copy(struct anv_dynamic_state *dest,
}
static void
anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
anv_cmd_state_init(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_cmd_state *state = &cmd_buffer->state;
cmd_buffer->batch.status = VK_SUCCESS;
memset(state, 0, sizeof(*state));
memset(&state->descriptors, 0, sizeof(state->descriptors));
for (uint32_t i = 0; i < ARRAY_SIZE(state->push_descriptors); i++) {
vk_free(&cmd_buffer->pool->alloc, state->push_descriptors[i]);
state->push_descriptors[i] = NULL;
}
for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
state->push_constants[i] = NULL;
}
memset(state->binding_tables, 0, sizeof(state->binding_tables));
memset(state->samplers, 0, sizeof(state->samplers));
/* 0 isn't a valid config. This ensures that we always configure L3$. */
cmd_buffer->state.current_l3_config = 0;
state->dirty = 0;
state->vb_dirty = 0;
state->pending_pipe_bits = 0;
state->descriptors_dirty = 0;
state->push_constants_dirty = 0;
state->pipeline = NULL;
state->framebuffer = NULL;
state->pass = NULL;
state->subpass = NULL;
state->push_constant_stages = 0;
state->restart_index = UINT32_MAX;
state->dynamic = default_dynamic_state;
state->need_query_wa = true;
state->pma_fix_enabled = false;
state->hiz_enabled = false;
state->gfx.dynamic = default_dynamic_state;
}
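/* Tear down the resources owned by one bind point's pipeline state; today
 * that is only the lazily allocated push descriptor sets. */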
static void
anv_cmd_pipeline_state_finish(struct anv_cmd_buffer *cmd_buffer,
struct anv_cmd_pipeline_state *pipe_state)
{
for (uint32_t i = 0; i < ARRAY_SIZE(pipe_state->push_descriptors); i++)
vk_free(&cmd_buffer->pool->alloc, pipe_state->push_descriptors[i]);
}
static void
anv_cmd_state_finish(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_cmd_state *state = &cmd_buffer->state;
anv_cmd_pipeline_state_finish(cmd_buffer, &state->gfx.base);
anv_cmd_pipeline_state_finish(cmd_buffer, &state->compute.base);
for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++)
vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
vk_free(&cmd_buffer->pool->alloc, state->attachments);
state->attachments = NULL;
}
state->gen7.index_buffer = NULL;
static void
anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
{
anv_cmd_state_finish(cmd_buffer);
anv_cmd_state_init(cmd_buffer);
}
VkResult
@@ -198,14 +194,10 @@ static VkResult anv_create_cmd_buffer(
cmd_buffer->batch.status = VK_SUCCESS;
for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
cmd_buffer->state.push_constants[i] = NULL;
}
cmd_buffer->_loader_data.loaderMagic = ICD_LOADER_MAGIC;
cmd_buffer->device = device;
cmd_buffer->pool = pool;
cmd_buffer->level = level;
cmd_buffer->state.attachments = NULL;
result = anv_cmd_buffer_init_batch_bo_chain(cmd_buffer);
if (result != VK_SUCCESS)
@@ -216,8 +208,7 @@ static VkResult anv_create_cmd_buffer(
anv_state_stream_init(&cmd_buffer->dynamic_state_stream,
&device->dynamic_state_pool, 16384);
memset(cmd_buffer->state.push_descriptors, 0,
sizeof(cmd_buffer->state.push_descriptors));
anv_cmd_state_init(cmd_buffer);
if (pool) {
list_addtail(&cmd_buffer->pool_link, &pool->cmd_buffers);
@@ -276,7 +267,7 @@ anv_cmd_buffer_destroy(struct anv_cmd_buffer *cmd_buffer)
anv_state_stream_finish(&cmd_buffer->surface_state_stream);
anv_state_stream_finish(&cmd_buffer->dynamic_state_stream);
anv_cmd_state_reset(cmd_buffer);
anv_cmd_state_finish(cmd_buffer);
vk_free(&cmd_buffer->pool->alloc, cmd_buffer);
}
@@ -353,22 +344,22 @@ void anv_CmdBindPipeline(
switch (pipelineBindPoint) {
case VK_PIPELINE_BIND_POINT_COMPUTE:
cmd_buffer->state.compute_pipeline = pipeline;
cmd_buffer->state.compute_dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.compute.base.pipeline = pipeline;
cmd_buffer->state.compute.pipeline_dirty = true;
cmd_buffer->state.push_constants_dirty |= VK_SHADER_STAGE_COMPUTE_BIT;
cmd_buffer->state.descriptors_dirty |= VK_SHADER_STAGE_COMPUTE_BIT;
break;
case VK_PIPELINE_BIND_POINT_GRAPHICS:
cmd_buffer->state.pipeline = pipeline;
cmd_buffer->state.vb_dirty |= pipeline->vb_used;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.gfx.base.pipeline = pipeline;
cmd_buffer->state.gfx.vb_dirty |= pipeline->vb_used;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.push_constants_dirty |= pipeline->active_stages;
cmd_buffer->state.descriptors_dirty |= pipeline->active_stages;
/* Apply the dynamic state from the pipeline */
cmd_buffer->state.dirty |= pipeline->dynamic_state_mask;
anv_dynamic_state_copy(&cmd_buffer->state.dynamic,
cmd_buffer->state.gfx.dirty |= pipeline->dynamic_state_mask;
anv_dynamic_state_copy(&cmd_buffer->state.gfx.dynamic,
&pipeline->dynamic_state,
pipeline->dynamic_state_mask);
break;
@@ -388,13 +379,13 @@ void anv_CmdSetViewport(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
const uint32_t total_count = firstViewport + viewportCount;
if (cmd_buffer->state.dynamic.viewport.count < total_count)
cmd_buffer->state.dynamic.viewport.count = total_count;
if (cmd_buffer->state.gfx.dynamic.viewport.count < total_count)
cmd_buffer->state.gfx.dynamic.viewport.count = total_count;
memcpy(cmd_buffer->state.dynamic.viewport.viewports + firstViewport,
memcpy(cmd_buffer->state.gfx.dynamic.viewport.viewports + firstViewport,
pViewports, viewportCount * sizeof(*pViewports));
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_VIEWPORT;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_VIEWPORT;
}
void anv_CmdSetScissor(
@@ -406,13 +397,13 @@ void anv_CmdSetScissor(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
const uint32_t total_count = firstScissor + scissorCount;
if (cmd_buffer->state.dynamic.scissor.count < total_count)
cmd_buffer->state.dynamic.scissor.count = total_count;
if (cmd_buffer->state.gfx.dynamic.scissor.count < total_count)
cmd_buffer->state.gfx.dynamic.scissor.count = total_count;
memcpy(cmd_buffer->state.dynamic.scissor.scissors + firstScissor,
memcpy(cmd_buffer->state.gfx.dynamic.scissor.scissors + firstScissor,
pScissors, scissorCount * sizeof(*pScissors));
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_SCISSOR;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_SCISSOR;
}
void anv_CmdSetLineWidth(
@@ -421,8 +412,8 @@ void anv_CmdSetLineWidth(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
cmd_buffer->state.dynamic.line_width = lineWidth;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH;
cmd_buffer->state.gfx.dynamic.line_width = lineWidth;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH;
}
void anv_CmdSetDepthBias(
@@ -433,11 +424,11 @@ void anv_CmdSetDepthBias(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
cmd_buffer->state.dynamic.depth_bias.bias = depthBiasConstantFactor;
cmd_buffer->state.dynamic.depth_bias.clamp = depthBiasClamp;
cmd_buffer->state.dynamic.depth_bias.slope = depthBiasSlopeFactor;
cmd_buffer->state.gfx.dynamic.depth_bias.bias = depthBiasConstantFactor;
cmd_buffer->state.gfx.dynamic.depth_bias.clamp = depthBiasClamp;
cmd_buffer->state.gfx.dynamic.depth_bias.slope = depthBiasSlopeFactor;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS;
}
void anv_CmdSetBlendConstants(
@@ -446,10 +437,10 @@ void anv_CmdSetBlendConstants(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
memcpy(cmd_buffer->state.dynamic.blend_constants,
memcpy(cmd_buffer->state.gfx.dynamic.blend_constants,
blendConstants, sizeof(float) * 4);
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS;
}
void anv_CmdSetDepthBounds(
@@ -459,10 +450,10 @@ void anv_CmdSetDepthBounds(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
cmd_buffer->state.dynamic.depth_bounds.min = minDepthBounds;
cmd_buffer->state.dynamic.depth_bounds.max = maxDepthBounds;
cmd_buffer->state.gfx.dynamic.depth_bounds.min = minDepthBounds;
cmd_buffer->state.gfx.dynamic.depth_bounds.max = maxDepthBounds;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BOUNDS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BOUNDS;
}
void anv_CmdSetStencilCompareMask(
@@ -473,11 +464,11 @@ void anv_CmdSetStencilCompareMask(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
if (faceMask & VK_STENCIL_FACE_FRONT_BIT)
cmd_buffer->state.dynamic.stencil_compare_mask.front = compareMask;
cmd_buffer->state.gfx.dynamic.stencil_compare_mask.front = compareMask;
if (faceMask & VK_STENCIL_FACE_BACK_BIT)
cmd_buffer->state.dynamic.stencil_compare_mask.back = compareMask;
cmd_buffer->state.gfx.dynamic.stencil_compare_mask.back = compareMask;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK;
}
void anv_CmdSetStencilWriteMask(
@@ -488,11 +479,11 @@ void anv_CmdSetStencilWriteMask(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
if (faceMask & VK_STENCIL_FACE_FRONT_BIT)
cmd_buffer->state.dynamic.stencil_write_mask.front = writeMask;
cmd_buffer->state.gfx.dynamic.stencil_write_mask.front = writeMask;
if (faceMask & VK_STENCIL_FACE_BACK_BIT)
cmd_buffer->state.dynamic.stencil_write_mask.back = writeMask;
cmd_buffer->state.gfx.dynamic.stencil_write_mask.back = writeMask;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK;
}
void anv_CmdSetStencilReference(
@@ -503,11 +494,59 @@ void anv_CmdSetStencilReference(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
if (faceMask & VK_STENCIL_FACE_FRONT_BIT)
cmd_buffer->state.dynamic.stencil_reference.front = reference;
cmd_buffer->state.gfx.dynamic.stencil_reference.front = reference;
if (faceMask & VK_STENCIL_FACE_BACK_BIT)
cmd_buffer->state.dynamic.stencil_reference.back = reference;
cmd_buffer->state.gfx.dynamic.stencil_reference.back = reference;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
}
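/* Shared helper for vkCmdBindDescriptorSets and the push-descriptor paths:
 * record the set on the chosen bind point's pipeline state, copy any dynamic
 * offsets into place, and mark the affected shader stages dirty. */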
static void
anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
struct anv_pipeline_layout *layout,
uint32_t set_index,
struct anv_descriptor_set *set,
uint32_t *dynamic_offset_count,
const uint32_t **dynamic_offsets)
{
struct anv_descriptor_set_layout *set_layout =
layout->set[set_index].layout;
struct anv_cmd_pipeline_state *pipe_state;
if (bind_point == VK_PIPELINE_BIND_POINT_COMPUTE) {
pipe_state = &cmd_buffer->state.compute.base;
} else {
assert(bind_point == VK_PIPELINE_BIND_POINT_GRAPHICS);
pipe_state = &cmd_buffer->state.gfx.base;
}
pipe_state->descriptors[set_index] = set;
if (dynamic_offsets) {
if (set_layout->dynamic_offset_count > 0) {
uint32_t dynamic_offset_start =
layout->set[set_index].dynamic_offset_start;
/* Assert that everything is in range */
assert(set_layout->dynamic_offset_count <= *dynamic_offset_count);
assert(dynamic_offset_start + set_layout->dynamic_offset_count <=
ARRAY_SIZE(pipe_state->dynamic_offsets));
typed_memcpy(&pipe_state->dynamic_offsets[dynamic_offset_start],
*dynamic_offsets, set_layout->dynamic_offset_count);
*dynamic_offsets += set_layout->dynamic_offset_count;
*dynamic_offset_count -= set_layout->dynamic_offset_count;
}
}
if (bind_point == VK_PIPELINE_BIND_POINT_COMPUTE) {
cmd_buffer->state.descriptors_dirty |= VK_SHADER_STAGE_COMPUTE_BIT;
} else {
assert(bind_point == VK_PIPELINE_BIND_POINT_GRAPHICS);
cmd_buffer->state.descriptors_dirty |=
set_layout->shader_stages & VK_SHADER_STAGE_ALL_GRAPHICS;
}
}
void anv_CmdBindDescriptorSets(
@@ -522,35 +561,15 @@ void anv_CmdBindDescriptorSets(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_pipeline_layout, layout, _layout);
struct anv_descriptor_set_layout *set_layout;
assert(firstSet + descriptorSetCount < MAX_SETS);
uint32_t dynamic_slot = 0;
for (uint32_t i = 0; i < descriptorSetCount; i++) {
ANV_FROM_HANDLE(anv_descriptor_set, set, pDescriptorSets[i]);
set_layout = layout->set[firstSet + i].layout;
cmd_buffer->state.descriptors[firstSet + i] = set;
if (set_layout->dynamic_offset_count > 0) {
uint32_t dynamic_offset_start =
layout->set[firstSet + i].dynamic_offset_start;
/* Assert that everything is in range */
assert(dynamic_offset_start + set_layout->dynamic_offset_count <=
ARRAY_SIZE(cmd_buffer->state.dynamic_offsets));
assert(dynamic_slot + set_layout->dynamic_offset_count <=
dynamicOffsetCount);
typed_memcpy(&cmd_buffer->state.dynamic_offsets[dynamic_offset_start],
&pDynamicOffsets[dynamic_slot],
set_layout->dynamic_offset_count);
dynamic_slot += set_layout->dynamic_offset_count;
}
cmd_buffer->state.descriptors_dirty |= set_layout->shader_stages;
anv_cmd_buffer_bind_descriptor_set(cmd_buffer, pipelineBindPoint,
layout, firstSet + i, set,
&dynamicOffsetCount,
&pDynamicOffsets);
}
}
@@ -571,7 +590,7 @@ void anv_CmdBindVertexBuffers(
for (uint32_t i = 0; i < bindingCount; i++) {
vb[firstBinding + i].buffer = anv_buffer_from_handle(pBuffers[i]);
vb[firstBinding + i].offset = pOffsets[i];
cmd_buffer->state.vb_dirty |= 1 << (firstBinding + i);
cmd_buffer->state.gfx.vb_dirty |= 1 << (firstBinding + i);
}
}
@@ -653,14 +672,16 @@ struct anv_state
anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
gl_shader_stage stage)
{
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
/* If we don't have this stage, bail. */
if (!anv_pipeline_has_stage(cmd_buffer->state.pipeline, stage))
if (!anv_pipeline_has_stage(pipeline, stage))
return (struct anv_state) { .offset = 0 };
struct anv_push_constants *data =
cmd_buffer->state.push_constants[stage];
const struct brw_stage_prog_data *prog_data =
cmd_buffer->state.pipeline->shaders[stage]->prog_data;
pipeline->shaders[stage]->prog_data;
/* If we don't actually have any push constants, bail. */
if (data == NULL || prog_data == NULL || prog_data->nr_params == 0)
@@ -686,7 +707,7 @@ anv_cmd_buffer_cs_push_constants(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_push_constants *data =
cmd_buffer->state.push_constants[MESA_SHADER_COMPUTE];
struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
const struct brw_cs_prog_data *cs_prog_data = get_cs_prog_data(pipeline);
const struct brw_stage_prog_data *prog_data = &cs_prog_data->base;
@@ -850,12 +871,21 @@ anv_cmd_buffer_get_depth_stencil_view(const struct anv_cmd_buffer *cmd_buffer)
return iview;
}
static VkResult
anv_cmd_buffer_ensure_push_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
uint32_t set)
static struct anv_push_descriptor_set *
anv_cmd_buffer_get_push_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
uint32_t set)
{
struct anv_cmd_pipeline_state *pipe_state;
if (bind_point == VK_PIPELINE_BIND_POINT_COMPUTE) {
pipe_state = &cmd_buffer->state.compute.base;
} else {
assert(bind_point == VK_PIPELINE_BIND_POINT_GRAPHICS);
pipe_state = &cmd_buffer->state.gfx.base;
}
struct anv_push_descriptor_set **push_set =
&cmd_buffer->state.push_descriptors[set];
&pipe_state->push_descriptors[set];
if (*push_set == NULL) {
*push_set = vk_alloc(&cmd_buffer->pool->alloc,
@@ -863,11 +893,11 @@ anv_cmd_buffer_ensure_push_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (*push_set == NULL) {
anv_batch_set_error(&cmd_buffer->batch, VK_ERROR_OUT_OF_HOST_MEMORY);
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
return NULL;
}
}
return VK_SUCCESS;
return *push_set;
}
void anv_CmdPushDescriptorSetKHR(
@@ -881,17 +911,17 @@ void anv_CmdPushDescriptorSetKHR(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_pipeline_layout, layout, _layout);
assert(pipelineBindPoint == VK_PIPELINE_BIND_POINT_GRAPHICS ||
pipelineBindPoint == VK_PIPELINE_BIND_POINT_COMPUTE);
assert(_set < MAX_SETS);
const struct anv_descriptor_set_layout *set_layout =
layout->set[_set].layout;
if (anv_cmd_buffer_ensure_push_descriptor_set(cmd_buffer, _set) != VK_SUCCESS)
return;
struct anv_push_descriptor_set *push_set =
cmd_buffer->state.push_descriptors[_set];
anv_cmd_buffer_get_push_descriptor_set(cmd_buffer,
pipelineBindPoint, _set);
if (!push_set)
return;
struct anv_descriptor_set *set = &push_set->set;
set->layout = set_layout;
@@ -958,8 +988,8 @@ void anv_CmdPushDescriptorSetKHR(
}
}
cmd_buffer->state.descriptors[_set] = set;
cmd_buffer->state.descriptors_dirty |= set_layout->shader_stages;
anv_cmd_buffer_bind_descriptor_set(cmd_buffer, pipelineBindPoint,
layout, _set, set, NULL, NULL);
}
void anv_CmdPushDescriptorSetWithTemplateKHR(
@@ -979,10 +1009,12 @@ void anv_CmdPushDescriptorSetWithTemplateKHR(
const struct anv_descriptor_set_layout *set_layout =
layout->set[_set].layout;
if (anv_cmd_buffer_ensure_push_descriptor_set(cmd_buffer, _set) != VK_SUCCESS)
return;
struct anv_push_descriptor_set *push_set =
cmd_buffer->state.push_descriptors[_set];
anv_cmd_buffer_get_push_descriptor_set(cmd_buffer,
template->bind_point, _set);
if (!push_set)
return;
struct anv_descriptor_set *set = &push_set->set;
set->layout = set_layout;
@@ -996,6 +1028,6 @@ void anv_CmdPushDescriptorSetWithTemplateKHR(
template,
pData);
cmd_buffer->state.descriptors[_set] = set;
cmd_buffer->state.descriptors_dirty |= set_layout->shader_stages;
anv_cmd_buffer_bind_descriptor_set(cmd_buffer, template->bind_point,
layout, _set, set, NULL, NULL);
}

@@ -893,6 +893,8 @@ VkResult anv_CreateDescriptorUpdateTemplateKHR(
if (template == NULL)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
template->bind_point = pCreateInfo->pipelineBindPoint;
if (pCreateInfo->templateType == VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR)
template->set = pCreateInfo->set;

@@ -83,7 +83,7 @@ EXTENSIONS = [
Extension('VK_KHR_wayland_surface', 6, 'VK_USE_PLATFORM_WAYLAND_KHR'),
Extension('VK_KHR_xcb_surface', 6, 'VK_USE_PLATFORM_XCB_KHR'),
Extension('VK_KHR_xlib_surface', 6, 'VK_USE_PLATFORM_XLIB_KHR'),
Extension('VK_KHX_multiview', 1, True),
Extension('VK_KHX_multiview', 1, False),
Extension('VK_EXT_debug_report', 8, True),
Extension('VK_EXT_external_memory_dma_buf', 1, True),
]

@@ -44,4 +44,4 @@ if __name__ == '__main__':
}
with open(args.out, 'w') as f:
json.dump(json_data, f, indent = 4)
json.dump(json_data, f, indent = 4, sort_keys=True)

@@ -313,10 +313,10 @@ VkResult __vk_errorf(struct anv_instance *instance, const void *object,
#ifdef DEBUG
#define vk_error(error) __vk_errorf(NULL, NULL,\
VK_DEBUG_REPORT_OBJECT_TYPE_UNKNOWN_EXT,\
error, __FILE__, __LINE__, NULL);
error, __FILE__, __LINE__, NULL)
#define vk_errorf(instance, obj, error, format, ...)\
__vk_errorf(instance, obj, REPORT_OBJECT_TYPE(obj), error,\
__FILE__, __LINE__, format, ## __VA_ARGS__);
__FILE__, __LINE__, format, ## __VA_ARGS__)
#else
#define vk_error(error) error
#define vk_errorf(instance, obj, error, format, ...) error
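
The only change to these macros is dropping the trailing semicolons from the DEBUG definitions. The commit does not spell out the motivation, but the usual reason is the classic function-like-macro pitfall sketched below; report_error, vk_error_old and vk_error_new are made-up names for illustration, not anv code.

void report_error(int e);

#define vk_error_old(e)  report_error(e);   /* ';' baked into the macro */
#define vk_error_new(e)  report_error(e)    /* caller supplies the ';' */

static void example(int cond, int err)
{
   if (cond)
      vk_error_new(err);
   else
      vk_error_new(0);
   /* With vk_error_old the same if/else expands to
    *    if (cond) report_error(err);; else report_error(0);;
    * and the stray empty statement detaches the 'else', so the code no
    * longer compiles. */
}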
@@ -1306,6 +1306,8 @@ struct anv_descriptor_template_entry {
};
struct anv_descriptor_update_template {
VkPipelineBindPoint bind_point;
/* The descriptor set this template corresponds to. This value is only
* valid if the template was created with the templateType
* VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR.
@@ -1663,38 +1665,83 @@ struct anv_attachment_state {
bool clear_color_is_zero;
};
/** State tracking for particular pipeline bind point
*
* This struct is the base struct for anv_cmd_graphics_state and
* anv_cmd_compute_state. These are used to track state which is bound to a
* particular type of pipeline. Generic state that applies per-stage such as
* binding table offsets and push constants is tracked generically with a
* per-stage array in anv_cmd_state.
*/
struct anv_cmd_pipeline_state {
struct anv_pipeline *pipeline;
struct anv_descriptor_set *descriptors[MAX_SETS];
uint32_t dynamic_offsets[MAX_DYNAMIC_BUFFERS];
struct anv_push_descriptor_set *push_descriptors[MAX_SETS];
};
/** State tracking for graphics pipeline
*
* This has anv_cmd_pipeline_state as a base struct to track things which get
* bound to a graphics pipeline. Along with general pipeline bind point state
* which is in the anv_cmd_pipeline_state base struct, it also contains other
* state which is graphics-specific.
*/
struct anv_cmd_graphics_state {
struct anv_cmd_pipeline_state base;
anv_cmd_dirty_mask_t dirty;
uint32_t vb_dirty;
struct anv_dynamic_state dynamic;
struct {
struct anv_buffer *index_buffer;
uint32_t index_type; /**< 3DSTATE_INDEX_BUFFER.IndexFormat */
uint32_t index_offset;
} gen7;
};
/** State tracking for compute pipeline
*
* This has anv_cmd_pipeline_state as a base struct to track things which get
* bound to a compute pipeline. Along with general pipeline bind point state
* which is in the anv_cmd_pipeline_state base struct, it also contains other
* state which is compute-specific.
*/
struct anv_cmd_compute_state {
struct anv_cmd_pipeline_state base;
bool pipeline_dirty;
struct anv_address num_workgroups;
};
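/* A minimal sketch of how this split is consumed: helpers pick the right base
 * state for a given bind point and then operate on it generically (compare
 * anv_cmd_buffer_bind_descriptor_set in anv_cmd_buffer.c earlier in this
 * diff). The helper below is illustrative only, not an actual anv entry
 * point, and assumes the struct definitions in this header. */

static struct anv_cmd_pipeline_state *
pick_pipeline_state(struct anv_cmd_state *state, VkPipelineBindPoint bind_point)
{
   /* Compute work has its own tracking struct; everything else is graphics. */
   if (bind_point == VK_PIPELINE_BIND_POINT_COMPUTE)
      return &state->compute.base;

   assert(bind_point == VK_PIPELINE_BIND_POINT_GRAPHICS);
   return &state->gfx.base;
}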
/** State required while building cmd buffer */
struct anv_cmd_state {
/* PIPELINE_SELECT.PipelineSelection */
uint32_t current_pipeline;
const struct gen_l3_config * current_l3_config;
uint32_t vb_dirty;
anv_cmd_dirty_mask_t dirty;
anv_cmd_dirty_mask_t compute_dirty;
struct anv_cmd_graphics_state gfx;
struct anv_cmd_compute_state compute;
enum anv_pipe_bits pending_pipe_bits;
uint32_t num_workgroups_offset;
struct anv_bo *num_workgroups_bo;
VkShaderStageFlags descriptors_dirty;
VkShaderStageFlags push_constants_dirty;
uint32_t scratch_size;
struct anv_pipeline * pipeline;
struct anv_pipeline * compute_pipeline;
struct anv_framebuffer * framebuffer;
struct anv_render_pass * pass;
struct anv_subpass * subpass;
VkRect2D render_area;
uint32_t restart_index;
struct anv_vertex_binding vertex_bindings[MAX_VBS];
struct anv_descriptor_set * descriptors[MAX_SETS];
uint32_t dynamic_offsets[MAX_DYNAMIC_BUFFERS];
VkShaderStageFlags push_constant_stages;
struct anv_push_constants * push_constants[MESA_SHADER_STAGES];
struct anv_state binding_tables[MESA_SHADER_STAGES];
struct anv_state samplers[MESA_SHADER_STAGES];
struct anv_dynamic_state dynamic;
bool need_query_wa;
struct anv_push_descriptor_set * push_descriptors[MAX_SETS];
/**
* Whether or not the gen8 PMA fix is enabled. We ensure that, at the top
@@ -1728,12 +1775,6 @@ struct anv_cmd_state {
* is one of the states in render_pass_states.
*/
struct anv_state null_surface_state;
struct {
struct anv_buffer * index_buffer;
uint32_t index_type; /**< 3DSTATE_INDEX_BUFFER.IndexFormat */
uint32_t index_offset;
} gen7;
};
struct anv_cmd_pool {

@@ -48,8 +48,8 @@ clamp_int64(int64_t x, int64_t min, int64_t max)
void
gen7_cmd_buffer_emit_scissor(struct anv_cmd_buffer *cmd_buffer)
{
uint32_t count = cmd_buffer->state.dynamic.scissor.count;
const VkRect2D *scissors = cmd_buffer->state.dynamic.scissor.scissors;
uint32_t count = cmd_buffer->state.gfx.dynamic.scissor.count;
const VkRect2D *scissors = cmd_buffer->state.gfx.dynamic.scissor.scissors;
struct anv_state scissor_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer, count * 8, 32);
@@ -113,12 +113,12 @@ void genX(CmdBindIndexBuffer)(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
if (GEN_IS_HASWELL)
cmd_buffer->state.restart_index = restart_index_for_type[indexType];
cmd_buffer->state.gen7.index_buffer = buffer;
cmd_buffer->state.gen7.index_type = vk_to_gen_index_type[indexType];
cmd_buffer->state.gen7.index_offset = offset;
cmd_buffer->state.gfx.gen7.index_buffer = buffer;
cmd_buffer->state.gfx.gen7.index_type = vk_to_gen_index_type[indexType];
cmd_buffer->state.gfx.gen7.index_offset = offset;
}
static uint32_t
@@ -154,38 +154,38 @@ get_depth_format(struct anv_cmd_buffer *cmd_buffer)
void
genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
struct anv_dynamic_state *d = &cmd_buffer->state.gfx.dynamic;
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)) {
uint32_t sf_dw[GENX(3DSTATE_SF_length)];
struct GENX(3DSTATE_SF) sf = {
GENX(3DSTATE_SF_header),
.DepthBufferSurfaceFormat = get_depth_format(cmd_buffer),
.LineWidth = cmd_buffer->state.dynamic.line_width,
.GlobalDepthOffsetConstant = cmd_buffer->state.dynamic.depth_bias.bias,
.GlobalDepthOffsetScale = cmd_buffer->state.dynamic.depth_bias.slope,
.GlobalDepthOffsetClamp = cmd_buffer->state.dynamic.depth_bias.clamp
.LineWidth = d->line_width,
.GlobalDepthOffsetConstant = d->depth_bias.bias,
.GlobalDepthOffsetScale = d->depth_bias.slope,
.GlobalDepthOffsetClamp = d->depth_bias.clamp
};
GENX(3DSTATE_SF_pack)(NULL, sf_dw, &sf);
anv_batch_emit_merge(&cmd_buffer->batch, sf_dw, pipeline->gen7.sf);
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer,
GENX(COLOR_CALC_STATE_length) * 4,
64);
struct GENX(COLOR_CALC_STATE) cc = {
.BlendConstantColorRed = cmd_buffer->state.dynamic.blend_constants[0],
.BlendConstantColorGreen = cmd_buffer->state.dynamic.blend_constants[1],
.BlendConstantColorBlue = cmd_buffer->state.dynamic.blend_constants[2],
.BlendConstantColorAlpha = cmd_buffer->state.dynamic.blend_constants[3],
.BlendConstantColorRed = d->blend_constants[0],
.BlendConstantColorGreen = d->blend_constants[1],
.BlendConstantColorBlue = d->blend_constants[2],
.BlendConstantColorAlpha = d->blend_constants[3],
.StencilReferenceValue = d->stencil_reference.front & 0xff,
.BackfaceStencilReferenceValue = d->stencil_reference.back & 0xff,
};
@@ -197,12 +197,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
uint32_t depth_stencil_dw[GENX(DEPTH_STENCIL_STATE_length)];
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
struct GENX(DEPTH_STENCIL_STATE) depth_stencil = {
.StencilTestMask = d->stencil_compare_mask.front & 0xff,
@@ -228,11 +227,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.gen7.index_buffer &&
cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
struct anv_buffer *buffer = cmd_buffer->state.gen7.index_buffer;
uint32_t offset = cmd_buffer->state.gen7.index_offset;
if (cmd_buffer->state.gfx.gen7.index_buffer &&
cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
struct anv_buffer *buffer = cmd_buffer->state.gfx.gen7.index_buffer;
uint32_t offset = cmd_buffer->state.gfx.gen7.index_offset;
#if GEN_IS_HASWELL
anv_batch_emit(&cmd_buffer->batch, GEN75_3DSTATE_VF, vf) {
@@ -245,7 +244,7 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
#if !GEN_IS_HASWELL
ib.CutIndexEnable = pipeline->primitive_restart;
#endif
ib.IndexFormat = cmd_buffer->state.gen7.index_type;
ib.IndexFormat = cmd_buffer->state.gfx.gen7.index_type;
ib.MemoryObjectControlState = GENX(MOCS);
ib.BufferStartingAddress =
@@ -255,7 +254,7 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
cmd_buffer->state.dirty = 0;
cmd_buffer->state.gfx.dirty = 0;
}
void

@@ -36,8 +36,9 @@
void
gen8_cmd_buffer_emit_viewport(struct anv_cmd_buffer *cmd_buffer)
{
uint32_t count = cmd_buffer->state.dynamic.viewport.count;
const VkViewport *viewports = cmd_buffer->state.dynamic.viewport.viewports;
uint32_t count = cmd_buffer->state.gfx.dynamic.viewport.count;
const VkViewport *viewports =
cmd_buffer->state.gfx.dynamic.viewport.viewports;
struct anv_state sf_clip_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer, count * 64, 64);
@@ -79,8 +80,9 @@ void
gen8_cmd_buffer_emit_depth_viewport(struct anv_cmd_buffer *cmd_buffer,
bool depth_clamp_enable)
{
uint32_t count = cmd_buffer->state.dynamic.viewport.count;
const VkViewport *viewports = cmd_buffer->state.dynamic.viewport.viewports;
uint32_t count = cmd_buffer->state.gfx.dynamic.viewport.count;
const VkViewport *viewports =
cmd_buffer->state.gfx.dynamic.viewport.viewports;
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer, count * 8, 32);
@@ -218,7 +220,7 @@ want_depth_pma_fix(struct anv_cmd_buffer *cmd_buffer)
return false;
/* 3DSTATE_PS_EXTRA::PixelShaderValid */
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
if (!anv_pipeline_has_stage(pipeline, MESA_SHADER_FRAGMENT))
return false;
@@ -328,7 +330,7 @@ want_stencil_pma_fix(struct anv_cmd_buffer *cmd_buffer)
assert(ds_iview && ds_iview->image->planes[0].aux_usage == ISL_AUX_USAGE_HIZ);
/* 3DSTATE_PS_EXTRA::PixelShaderValid */
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
if (!anv_pipeline_has_stage(pipeline, MESA_SHADER_FRAGMENT))
return false;
@@ -381,36 +383,36 @@ want_stencil_pma_fix(struct anv_cmd_buffer *cmd_buffer)
void
genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
struct anv_dynamic_state *d = &cmd_buffer->state.gfx.dynamic;
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH)) {
uint32_t sf_dw[GENX(3DSTATE_SF_length)];
struct GENX(3DSTATE_SF) sf = {
GENX(3DSTATE_SF_header),
};
#if GEN_GEN == 8
if (cmd_buffer->device->info.is_cherryview) {
sf.CHVLineWidth = cmd_buffer->state.dynamic.line_width;
sf.CHVLineWidth = d->line_width;
} else {
sf.LineWidth = cmd_buffer->state.dynamic.line_width;
sf.LineWidth = d->line_width;
}
#else
sf.LineWidth = cmd_buffer->state.dynamic.line_width,
sf.LineWidth = d->line_width,
#endif
GENX(3DSTATE_SF_pack)(NULL, sf_dw, &sf);
anv_batch_emit_merge(&cmd_buffer->batch, sf_dw,
cmd_buffer->state.pipeline->gen8.sf);
anv_batch_emit_merge(&cmd_buffer->batch, sf_dw, pipeline->gen8.sf);
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)){
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)){
uint32_t raster_dw[GENX(3DSTATE_RASTER_length)];
struct GENX(3DSTATE_RASTER) raster = {
GENX(3DSTATE_RASTER_header),
.GlobalDepthOffsetConstant = cmd_buffer->state.dynamic.depth_bias.bias,
.GlobalDepthOffsetScale = cmd_buffer->state.dynamic.depth_bias.slope,
.GlobalDepthOffsetClamp = cmd_buffer->state.dynamic.depth_bias.clamp
.GlobalDepthOffsetConstant = d->depth_bias.bias,
.GlobalDepthOffsetScale = d->depth_bias.slope,
.GlobalDepthOffsetClamp = d->depth_bias.clamp
};
GENX(3DSTATE_RASTER_pack)(NULL, raster_dw, &raster);
anv_batch_emit_merge(&cmd_buffer->batch, raster_dw,
@@ -423,18 +425,17 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
* using a big old #if switch here.
*/
#if GEN_GEN == 8
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer,
GENX(COLOR_CALC_STATE_length) * 4,
64);
struct GENX(COLOR_CALC_STATE) cc = {
.BlendConstantColorRed = cmd_buffer->state.dynamic.blend_constants[0],
.BlendConstantColorGreen = cmd_buffer->state.dynamic.blend_constants[1],
.BlendConstantColorBlue = cmd_buffer->state.dynamic.blend_constants[2],
.BlendConstantColorAlpha = cmd_buffer->state.dynamic.blend_constants[3],
.BlendConstantColorRed = d->blend_constants[0],
.BlendConstantColorGreen = d->blend_constants[1],
.BlendConstantColorBlue = d->blend_constants[2],
.BlendConstantColorAlpha = d->blend_constants[3],
.StencilReferenceValue = d->stencil_reference.front & 0xff,
.BackfaceStencilReferenceValue = d->stencil_reference.back & 0xff,
};
@@ -448,12 +449,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
uint32_t wm_depth_stencil_dw[GENX(3DSTATE_WM_DEPTH_STENCIL_length)];
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
struct GENX(3DSTATE_WM_DEPTH_STENCIL wm_depth_stencil) = {
GENX(3DSTATE_WM_DEPTH_STENCIL_header),
@@ -478,16 +478,16 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
want_depth_pma_fix(cmd_buffer));
}
#else
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS) {
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS) {
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer,
GENX(COLOR_CALC_STATE_length) * 4,
64);
struct GENX(COLOR_CALC_STATE) cc = {
.BlendConstantColorRed = cmd_buffer->state.dynamic.blend_constants[0],
.BlendConstantColorGreen = cmd_buffer->state.dynamic.blend_constants[1],
.BlendConstantColorBlue = cmd_buffer->state.dynamic.blend_constants[2],
.BlendConstantColorAlpha = cmd_buffer->state.dynamic.blend_constants[3],
.BlendConstantColorRed = d->blend_constants[0],
.BlendConstantColorGreen = d->blend_constants[1],
.BlendConstantColorBlue = d->blend_constants[2],
.BlendConstantColorAlpha = d->blend_constants[3],
};
GENX(COLOR_CALC_STATE_pack)(NULL, cc_state.map, &cc);
@@ -499,13 +499,12 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
uint32_t dwords[GENX(3DSTATE_WM_DEPTH_STENCIL_length)];
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
struct GENX(3DSTATE_WM_DEPTH_STENCIL) wm_depth_stencil = {
GENX(3DSTATE_WM_DEPTH_STENCIL_header),
@@ -532,15 +531,15 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
#endif
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_VF), vf) {
vf.IndexedDrawCutIndexEnable = pipeline->primitive_restart;
vf.CutIndex = cmd_buffer->state.restart_index;
}
}
cmd_buffer->state.dirty = 0;
cmd_buffer->state.gfx.dirty = 0;
}
void genX(CmdBindIndexBuffer)(
@@ -572,7 +571,7 @@ void genX(CmdBindIndexBuffer)(
ib.BufferSize = buffer->size - offset;
}
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
}
/* Set of stage bits which are pipelined, i.e. they get queued by the

@@ -218,7 +218,7 @@ genX(blorp_exec)(struct blorp_batch *batch,
blorp_exec(batch, params);
cmd_buffer->state.vb_dirty = ~0;
cmd_buffer->state.dirty = ~0;
cmd_buffer->state.gfx.vb_dirty = ~0;
cmd_buffer->state.gfx.dirty = ~0;
cmd_buffer->state.push_constants_dirty = ~0;
}

@@ -969,6 +969,15 @@ genX(BeginCommandBuffer)(
if (cmd_buffer->level == VK_COMMAND_BUFFER_LEVEL_PRIMARY)
cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_VF_CACHE_INVALIDATE_BIT;
/* We send an "Indirect State Pointers Disable" packet at
* EndCommandBuffer, so all push constant packets are ignored during a
* context restore. Documentation says after that command, we need to
* emit push constants again before any rendering operation. So we
* flag them dirty here to make sure they get emitted.
*/
if (GEN_GEN == 10)
cmd_buffer->state.push_constants_dirty |= VK_SHADER_STAGE_ALL_GRAPHICS;
VkResult result = VK_SUCCESS;
if (cmd_buffer->usage_flags &
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT) {
@@ -1002,12 +1011,53 @@ genX(BeginCommandBuffer)(
}
}
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
}
return result;
}
/* From the PRM, Volume 2a:
*
* "Indirect State Pointers Disable
*
* At the completion of the post-sync operation associated with this pipe
* control packet, the indirect state pointers in the hardware are
* considered invalid; the indirect pointers are not saved in the context.
* If any new indirect state commands are executed in the command stream
* while the pipe control is pending, the new indirect state commands are
* preserved.
*
* [DevIVB+]: Using Invalidate State Pointer (ISP) only inhibits context
* restoring of Push Constant (3DSTATE_CONSTANT_*) commands. Push Constant
* commands are only considered as Indirect State Pointers. Once ISP is
* issued in a context, SW must initialize by programming push constant
* commands for all the shaders (at least to zero length) before attempting
* any rendering operation for the same context."
*
* 3DSTATE_CONSTANT_* packets are restored during a context restore,
* even though they point to a BO that has been already unreferenced at
* the end of the previous batch buffer. This has been fine so far since
* we are protected by the scratch page (every address not covered by
* a BO should be pointing to the scratch page). But on CNL, it is
* causing a GPU hang during context restore at the 3DSTATE_CONSTANT_*
* instruction.
*
* The flag "Indirect State Pointers Disable" in PIPE_CONTROL tells the
* hardware to ignore previous 3DSTATE_CONSTANT_* packets during a
* context restore, so the mentioned hang doesn't happen. However,
* software must program push constant commands for all stages prior to
* rendering anything. So we flag them dirty in BeginCommandBuffer.
*/
static void
emit_isp_disable(struct anv_cmd_buffer *cmd_buffer)
{
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.IndirectStatePointersDisable = true;
pc.CommandStreamerStallEnable = true;
}
}
VkResult
genX(EndCommandBuffer)(
VkCommandBuffer commandBuffer)
@@ -1024,6 +1074,9 @@ genX(EndCommandBuffer)(
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
if (GEN_GEN == 10)
emit_isp_disable(cmd_buffer);
anv_cmd_buffer_end_batch_buffer(cmd_buffer);
return VK_SUCCESS;
@@ -1398,7 +1451,8 @@ void genX(CmdPipelineBarrier)(
static void
cmd_buffer_alloc_push_constants(struct anv_cmd_buffer *cmd_buffer)
{
VkShaderStageFlags stages = cmd_buffer->state.pipeline->active_stages;
VkShaderStageFlags stages =
cmd_buffer->state.gfx.base.pipeline->active_stages;
/* In order to avoid thrash, we assume that vertex and fragment stages
* always exist. In the rare case where one is missing *and* the other
@@ -1462,32 +1516,32 @@ cmd_buffer_alloc_push_constants(struct anv_cmd_buffer *cmd_buffer)
}
static const struct anv_descriptor *
anv_descriptor_for_binding(const struct anv_cmd_buffer *cmd_buffer,
anv_descriptor_for_binding(const struct anv_cmd_pipeline_state *pipe_state,
const struct anv_pipeline_binding *binding)
{
assert(binding->set < MAX_SETS);
const struct anv_descriptor_set *set =
cmd_buffer->state.descriptors[binding->set];
pipe_state->descriptors[binding->set];
const uint32_t offset =
set->layout->binding[binding->binding].descriptor_index;
return &set->descriptors[offset + binding->index];
}
static uint32_t
dynamic_offset_for_binding(const struct anv_cmd_buffer *cmd_buffer,
dynamic_offset_for_binding(const struct anv_cmd_pipeline_state *pipe_state,
const struct anv_pipeline *pipeline,
const struct anv_pipeline_binding *binding)
{
assert(binding->set < MAX_SETS);
const struct anv_descriptor_set *set =
cmd_buffer->state.descriptors[binding->set];
pipe_state->descriptors[binding->set];
uint32_t dynamic_offset_idx =
pipeline->layout->set[binding->set].dynamic_offset_start +
set->layout->binding[binding->binding].dynamic_offset_index +
binding->index;
return cmd_buffer->state.dynamic_offsets[dynamic_offset_idx];
return pipe_state->dynamic_offsets[dynamic_offset_idx];
}
static VkResult
@@ -1496,19 +1550,21 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
struct anv_state *bt_state)
{
struct anv_subpass *subpass = cmd_buffer->state.subpass;
struct anv_cmd_pipeline_state *pipe_state;
struct anv_pipeline *pipeline;
uint32_t bias, state_offset;
switch (stage) {
case MESA_SHADER_COMPUTE:
pipeline = cmd_buffer->state.compute_pipeline;
pipe_state = &cmd_buffer->state.compute.base;
bias = 1;
break;
default:
pipeline = cmd_buffer->state.pipeline;
pipe_state = &cmd_buffer->state.gfx.base;
bias = 0;
break;
}
pipeline = pipe_state->pipeline;
if (!anv_pipeline_has_stage(pipeline, stage)) {
*bt_state = (struct anv_state) { 0, };
@@ -1530,9 +1586,9 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
return VK_ERROR_OUT_OF_DEVICE_MEMORY;
if (stage == MESA_SHADER_COMPUTE &&
get_cs_prog_data(cmd_buffer->state.compute_pipeline)->uses_num_work_groups) {
struct anv_bo *bo = cmd_buffer->state.num_workgroups_bo;
uint32_t bo_offset = cmd_buffer->state.num_workgroups_offset;
get_cs_prog_data(pipeline)->uses_num_work_groups) {
struct anv_bo *bo = cmd_buffer->state.compute.num_workgroups.bo;
uint32_t bo_offset = cmd_buffer->state.compute.num_workgroups.offset;
struct anv_state surface_state;
surface_state =
@@ -1593,7 +1649,7 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
}
const struct anv_descriptor *desc =
anv_descriptor_for_binding(cmd_buffer, binding);
anv_descriptor_for_binding(pipe_state, binding);
switch (desc->type) {
case VK_DESCRIPTOR_TYPE_SAMPLER:
@@ -1669,7 +1725,7 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC: {
/* Compute the offset within the buffer */
uint32_t dynamic_offset =
dynamic_offset_for_binding(cmd_buffer, pipeline, binding);
dynamic_offset_for_binding(pipe_state, pipeline, binding);
uint64_t offset = desc->offset + dynamic_offset;
/* Clamp to the buffer size */
offset = MIN2(offset, desc->buffer->size);
@@ -1725,12 +1781,10 @@ emit_samplers(struct anv_cmd_buffer *cmd_buffer,
gl_shader_stage stage,
struct anv_state *state)
{
struct anv_pipeline *pipeline;
if (stage == MESA_SHADER_COMPUTE)
pipeline = cmd_buffer->state.compute_pipeline;
else
pipeline = cmd_buffer->state.pipeline;
struct anv_cmd_pipeline_state *pipe_state =
stage == MESA_SHADER_COMPUTE ? &cmd_buffer->state.compute.base :
&cmd_buffer->state.gfx.base;
struct anv_pipeline *pipeline = pipe_state->pipeline;
if (!anv_pipeline_has_stage(pipeline, stage)) {
*state = (struct anv_state) { 0, };
@@ -1751,10 +1805,8 @@ emit_samplers(struct anv_cmd_buffer *cmd_buffer,
for (uint32_t s = 0; s < map->sampler_count; s++) {
struct anv_pipeline_binding *binding = &map->sampler_to_descriptor[s];
struct anv_descriptor_set *set =
cmd_buffer->state.descriptors[binding->set];
uint32_t offset = set->layout->binding[binding->binding].descriptor_index;
struct anv_descriptor *desc = &set->descriptors[offset + binding->index];
const struct anv_descriptor *desc =
anv_descriptor_for_binding(pipe_state, binding);
if (desc->type != VK_DESCRIPTOR_TYPE_SAMPLER &&
desc->type != VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER)
@@ -1780,8 +1832,10 @@ emit_samplers(struct anv_cmd_buffer *cmd_buffer,
static uint32_t
flush_descriptor_sets(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
VkShaderStageFlags dirty = cmd_buffer->state.descriptors_dirty &
cmd_buffer->state.pipeline->active_stages;
pipeline->active_stages;
VkResult result = VK_SUCCESS;
anv_foreach_stage(s, dirty) {
@@ -1807,7 +1861,7 @@ flush_descriptor_sets(struct anv_cmd_buffer *cmd_buffer)
genX(cmd_buffer_emit_state_base_address)(cmd_buffer);
/* Re-emit all active binding tables */
dirty |= cmd_buffer->state.pipeline->active_stages;
dirty |= pipeline->active_stages;
anv_foreach_stage(s, dirty) {
result = emit_samplers(cmd_buffer, s, &cmd_buffer->state.samplers[s]);
if (result != VK_SUCCESS) {
@@ -1876,7 +1930,8 @@ static void
cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
VkShaderStageFlags dirty_stages)
{
UNUSED const struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
const struct anv_cmd_graphics_state *gfx_state = &cmd_buffer->state.gfx;
const struct anv_pipeline *pipeline = gfx_state->base.pipeline;
static const uint32_t push_constant_opcodes[] = {
[MESA_SHADER_VERTEX] = 21,
@@ -1896,7 +1951,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CONSTANT_VS), c) {
c._3DCommandSubOpcode = push_constant_opcodes[stage];
if (anv_pipeline_has_stage(cmd_buffer->state.pipeline, stage)) {
if (anv_pipeline_has_stage(pipeline, stage)) {
#if GEN_GEN >= 8 || GEN_IS_HASWELL
const struct brw_stage_prog_data *prog_data =
pipeline->shaders[stage]->prog_data;
@@ -1929,7 +1984,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
&bind_map->surface_to_descriptor[surface];
const struct anv_descriptor *desc =
anv_descriptor_for_binding(cmd_buffer, binding);
anv_descriptor_for_binding(&gfx_state->base, binding);
struct anv_address read_addr;
uint32_t read_len;
@@ -1945,7 +2000,8 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
assert(desc->type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC);
uint32_t dynamic_offset =
dynamic_offset_for_binding(cmd_buffer, pipeline, binding);
dynamic_offset_for_binding(&gfx_state->base,
pipeline, binding);
uint32_t buf_offset =
MIN2(desc->offset + dynamic_offset, desc->buffer->size);
uint32_t buf_range =
@@ -2005,10 +2061,10 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
void
genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
uint32_t *p;
uint32_t vb_emit = cmd_buffer->state.vb_dirty & pipeline->vb_used;
uint32_t vb_emit = cmd_buffer->state.gfx.vb_dirty & pipeline->vb_used;
assert((pipeline->active_stages & VK_SHADER_STAGE_COMPUTE_BIT) == 0);
@@ -2059,16 +2115,15 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
cmd_buffer->state.vb_dirty &= ~vb_emit;
cmd_buffer->state.gfx.vb_dirty &= ~vb_emit;
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_PIPELINE) {
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_PIPELINE) {
anv_batch_emit_batch(&cmd_buffer->batch, &pipeline->batch);
/* The exact descriptor layout is pulled from the pipeline, so we need
* to re-emit binding tables on every pipeline change.
*/
cmd_buffer->state.descriptors_dirty |=
cmd_buffer->state.pipeline->active_stages;
cmd_buffer->state.descriptors_dirty |= pipeline->active_stages;
/* If the pipeline changed, we may need to re-allocate push constant
* space in the URB.
@@ -2099,7 +2154,7 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
#endif
/* Render targets live in the same binding table as fragment descriptors */
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_RENDER_TARGETS)
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_RENDER_TARGETS)
cmd_buffer->state.descriptors_dirty |= VK_SHADER_STAGE_FRAGMENT_BIT;
/* We emit the binding tables and sampler tables first, then emit push
@@ -2125,16 +2180,16 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
if (dirty)
cmd_buffer_emit_descriptor_pointers(cmd_buffer, dirty);
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_DYNAMIC_VIEWPORT)
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_DYNAMIC_VIEWPORT)
gen8_cmd_buffer_emit_viewport(cmd_buffer);
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_DYNAMIC_VIEWPORT |
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_DYNAMIC_VIEWPORT |
ANV_CMD_DIRTY_PIPELINE)) {
gen8_cmd_buffer_emit_depth_viewport(cmd_buffer,
pipeline->depth_clamp_enable);
}
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_DYNAMIC_SCISSOR)
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_DYNAMIC_SCISSOR)
gen7_cmd_buffer_emit_scissor(cmd_buffer);
genX(cmd_buffer_flush_dynamic_state)(cmd_buffer);
@@ -2213,7 +2268,7 @@ void genX(CmdDraw)(
uint32_t firstInstance)
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2251,7 +2306,7 @@ void genX(CmdDrawIndexed)(
uint32_t firstInstance)
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2403,7 +2458,7 @@ void genX(CmdDrawIndirect)(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2441,7 +2496,7 @@ void genX(CmdDrawIndexedIndirect)(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2474,7 +2529,7 @@ void genX(CmdDrawIndexedIndirect)(
static VkResult
flush_compute_descriptor_set(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
struct anv_state surfaces = { 0, }, samplers = { 0, };
VkResult result;
@@ -2530,7 +2585,7 @@ flush_compute_descriptor_set(struct anv_cmd_buffer *cmd_buffer)
void
genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
MAYBE_UNUSED VkResult result;
assert(pipeline->active_stages == VK_SHADER_STAGE_COMPUTE_BIT);
@@ -2539,7 +2594,7 @@ genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
genX(flush_pipeline_select_gpgpu)(cmd_buffer);
if (cmd_buffer->state.compute_dirty & ANV_CMD_DIRTY_PIPELINE) {
if (cmd_buffer->state.compute.pipeline_dirty) {
/* From the Sky Lake PRM Vol 2a, MEDIA_VFE_STATE:
*
* "A stalling PIPE_CONTROL is required before MEDIA_VFE_STATE unless
@@ -2555,7 +2610,7 @@ genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
}
if ((cmd_buffer->state.descriptors_dirty & VK_SHADER_STAGE_COMPUTE_BIT) ||
(cmd_buffer->state.compute_dirty & ANV_CMD_DIRTY_PIPELINE)) {
cmd_buffer->state.compute.pipeline_dirty) {
/* FIXME: figure out descriptors for gen7 */
result = flush_compute_descriptor_set(cmd_buffer);
if (result != VK_SUCCESS)
@@ -2576,7 +2631,7 @@ genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
cmd_buffer->state.compute_dirty = 0;
cmd_buffer->state.compute.pipeline_dirty = false;
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
}
@@ -2607,7 +2662,7 @@ void genX(CmdDispatch)(
uint32_t z)
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
const struct brw_cs_prog_data *prog_data = get_cs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2621,9 +2676,10 @@ void genX(CmdDispatch)(
sizes[1] = y;
sizes[2] = z;
anv_state_flush(cmd_buffer->device, state);
cmd_buffer->state.num_workgroups_offset = state.offset;
cmd_buffer->state.num_workgroups_bo =
&cmd_buffer->device->dynamic_state_pool.block_pool.bo;
cmd_buffer->state.compute.num_workgroups = (struct anv_address) {
.bo = &cmd_buffer->device->dynamic_state_pool.block_pool.bo,
.offset = state.offset,
};
}
genX(cmd_buffer_flush_compute_state)(cmd_buffer);
@@ -2654,7 +2710,7 @@ void genX(CmdDispatchIndirect)(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
const struct brw_cs_prog_data *prog_data = get_cs_prog_data(pipeline);
struct anv_bo *bo = buffer->bo;
uint32_t bo_offset = buffer->offset + offset;
@@ -2670,8 +2726,10 @@ void genX(CmdDispatchIndirect)(
#endif
if (prog_data->uses_num_work_groups) {
cmd_buffer->state.num_workgroups_offset = bo_offset;
cmd_buffer->state.num_workgroups_bo = bo;
cmd_buffer->state.compute.num_workgroups = (struct anv_address) {
.bo = bo,
.offset = bo_offset,
};
}
genX(cmd_buffer_flush_compute_state)(cmd_buffer);
@@ -3138,7 +3196,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
{
cmd_buffer->state.subpass = subpass;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
/* Our implementation of VK_KHR_multiview uses instancing to draw the
* different views. If the client asks for instancing, we need to use the
@@ -3148,7 +3206,18 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
* of each subpass.
*/
if (GEN_GEN == 7)
cmd_buffer->state.vb_dirty |= ~0;
cmd_buffer->state.gfx.vb_dirty |= ~0;
/* It is possible to start a render pass with an old pipeline. Because the
* render pass and subpass index are both baked into the pipeline, this is
* highly unlikely. In order to do so, it requires that you have a render
* pass with a single subpass and that you use that render pass twice
* back-to-back and use the same pipeline at the start of the second render
* pass as at the end of the first. In order to avoid unpredictable issues
* with this edge case, we just dirty the pipeline at the start of every
* subpass.
*/
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_PIPELINE;
/* Perform transitions to the subpass layout before any writes have
* occurred.
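Taken together, the hunks in this file replace the flat cmd_buffer->state.pipeline / state.compute_pipeline fields with per-bind-point state, so helpers such as anv_descriptor_for_binding() and dynamic_offset_for_binding() can take the relevant bind point directly, and the dispatch size collapses into a single anv_address. A rough, self-contained sketch of the layout the renamed fields imply, using stand-in types and illustrative array sizes (the real definitions live in anv_private.h, not here):

    #include <stdbool.h>
    #include <stdint.h>

    struct anv_bo;
    struct anv_pipeline;
    struct anv_descriptor_set;

    struct anv_address {
       struct anv_bo *bo;
       uint64_t       offset;
    };

    /* State that exists once per bind point (graphics and compute alike). */
    struct anv_cmd_pipeline_state {
       struct anv_pipeline       *pipeline;
       struct anv_descriptor_set *descriptors[8];      /* MAX_SETS, illustrative */
       uint32_t                   dynamic_offsets[32]; /* illustrative size */
    };

    struct anv_cmd_graphics_state {
       struct anv_cmd_pipeline_state base;
       uint32_t dirty;     /* ANV_CMD_DIRTY_* bits, e.g. RENDER_TARGETS */
       uint32_t vb_dirty;  /* per-vertex-buffer dirty bits */
    };

    struct anv_cmd_compute_state {
       struct anv_cmd_pipeline_state base;
       bool               pipeline_dirty;  /* replaces ANV_CMD_DIRTY_PIPELINE */
       struct anv_address num_workgroups;  /* replaces the old bo + offset pair */
    };

Callers pick the bind point once (&state.gfx.base or &state.compute.base) and pass it down to the descriptor helpers, which is what the emit_binding_table() and emit_samplers() hunks above do.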

@@ -272,5 +272,5 @@ genX(cmd_buffer_so_memcpy)(struct anv_cmd_buffer *cmd_buffer,
prim.BaseVertexLocation = 0;
}
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_PIPELINE;
}

@@ -1081,7 +1081,13 @@ emit_3dstate_streamout(struct anv_pipeline *pipeline,
static uint32_t
get_sampler_count(const struct anv_shader_bin *bin)
{
return DIV_ROUND_UP(bin->bind_map.sampler_count, 4);
uint32_t count_by_4 = DIV_ROUND_UP(bin->bind_map.sampler_count, 4);
/* We can potentially have way more than 32 samplers and that's ok.
* However, the 3DSTATE_XS packets only have 3 bits to specify how
* many to pre-fetch and all values above 4 are marked reserved.
*/
return MIN2(count_by_4, 4);
}
static uint32_t
@@ -1345,10 +1351,10 @@ has_color_buffer_write_enabled(const struct anv_pipeline *pipeline,
if (binding->set != ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS)
continue;
const VkPipelineColorBlendAttachmentState *a =
&blend->pAttachments[binding->index];
if (binding->index == UINT32_MAX)
continue;
if (binding->index != UINT32_MAX && a->colorWriteMask != 0)
if (blend->pAttachments[binding->index].colorWriteMask != 0)
return true;
}
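The get_sampler_count() change earlier in this file clamps the prefetch hint rather than letting it overflow the hardware field. A small standalone example of the arithmetic, assuming DIV_ROUND_UP/MIN2 behave like Mesa's usual macros:

    #include <stdio.h>
    #include <stdint.h>

    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
    #define MIN2(a, b)         ((a) < (b) ? (a) : (b))

    /* 3DSTATE_XS expresses the sampler prefetch count in groups of four,
     * in a 3-bit field where values above 4 are reserved. */
    static uint32_t
    sampler_prefetch_field(uint32_t sampler_count)
    {
       return MIN2(DIV_ROUND_UP(sampler_count, 4), 4);
    }

    int main(void)
    {
       printf("%u\n", sampler_prefetch_field(3));  /* 1 */
       printf("%u\n", sampler_prefetch_field(16)); /* 4 */
       printf("%u\n", sampler_prefetch_field(40)); /* 4; previously 10, a reserved value */
       return 0;
    }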


@@ -409,20 +409,6 @@ void genX(CmdBeginQuery)(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_query_pool, pool, queryPool);
/* Workaround: When meta uses the pipeline with the VS disabled, it seems
* that the pipelining of the depth write breaks. What we see is that
* samples from the render pass clear leaks into the first query
* immediately after the clear. Doing a pipecontrol with a post-sync
* operation and DepthStallEnable seems to work around the issue.
*/
if (cmd_buffer->state.need_query_wa) {
cmd_buffer->state.need_query_wa = false;
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.DepthCacheFlushEnable = true;
pc.DepthStallEnable = true;
}
}
switch (pool->type) {
case VK_QUERY_TYPE_OCCLUSION:
emit_ps_depth_count(cmd_buffer, &pool->bo, query * pool->stride + 8);

@@ -56,6 +56,7 @@ header = """/* GLXEXT is the define used in the xserver when the GLX extension i
#endif
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "main/glheader.h"
@@ -144,6 +145,19 @@ _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
return disp;
}
void
_glapi_table_patch(struct _glapi_table *table, const char *name, void *wrapper)
{
for (int func_index = 0; func_index < GLAPI_TABLE_COUNT; ++func_index) {
if (!strcmp(_glapi_table_func_names[func_index], name)) {
((void **)table)[func_index] = wrapper;
return;
}
}
fprintf(stderr, "could not patch %s in dispatch table\\n", name);
}
"""

@@ -161,6 +161,9 @@ _glapi_get_proc_name(unsigned int offset);
#if defined(GLX_USE_APPLEGL) || defined(GLX_USE_WINDOWSGL)
_GLAPI_EXPORT struct _glapi_table *
_glapi_create_table_from_handle(void *handle, const char *symbol_prefix);
_GLAPI_EXPORT void
_glapi_table_patch(struct _glapi_table *, const char *name, void *wrapper);
#endif

@@ -57,7 +57,7 @@ mesa_dri_drivers_la_LDFLAGS = \
-module \
-no-undefined \
-avoid-version \
-Wl,--build-id=sha1 \
$(LD_BUILD_ID) \
$(BSYMBOLIC) \
$(GC_SECTIONS) \
$(LD_NO_UNDEFINED)

@@ -105,7 +105,6 @@ BUILT_SOURCES = $(i965_oa_GENERATED_FILES)
CLEANFILES = $(BUILT_SOURCES)
EXTRA_DIST = \
meson.build \
brw_oa_hsw.xml \
brw_oa_bdw.xml \
brw_oa_chv.xml \
@@ -118,7 +117,8 @@ EXTRA_DIST = \
brw_oa_glk.xml \
brw_oa_cflgt2.xml \
brw_oa_cflgt3.xml \
brw_oa.py
brw_oa.py \
meson.build
# Note: we avoid using a multi target rule here and outputting both the
# .c and .h files in one go so we don't hit problems with parallel

@@ -320,7 +320,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
enum isl_format dst_isl_format =
brw_blorp_to_isl_format(brw, dst_format, true);
enum isl_aux_usage dst_aux_usage =
intel_miptree_render_aux_usage(brw, dst_mt, dst_isl_format, false);
intel_miptree_render_aux_usage(brw, dst_mt, dst_isl_format,
false, false);
const bool dst_clear_supported = dst_aux_usage != ISL_AUX_USAGE_NONE;
intel_miptree_prepare_access(brw, dst_mt, dst_level, 1, dst_layer, 1,
dst_aux_usage, dst_clear_supported);
@@ -1267,9 +1268,10 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
irb->mt, irb->mt_level, irb->mt_layer, num_layers);
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, irb->mt, isl_format, false);
intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
false, false);
intel_miptree_prepare_render(brw, irb->mt, level, irb->mt_layer,
num_layers, isl_format, false);
num_layers, aux_usage);
struct isl_surf isl_tmp[2];
struct blorp_surf surf;
@@ -1288,7 +1290,7 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
blorp_batch_finish(&batch);
intel_miptree_finish_render(brw, irb->mt, level, irb->mt_layer,
num_layers, isl_format, false);
num_layers, aux_usage);
}
return;

@@ -177,7 +177,7 @@ brw_dispatch_compute_common(struct gl_context *ctx)
brw_validate_textures(brw);
brw_predraw_resolve_inputs(brw, false);
brw_predraw_resolve_inputs(brw, false, NULL);
/* Flush the batch if the batch/state buffers are nearly full. We can
* grow them if needed, but this is not free, so we'd like to avoid it.

@@ -73,6 +73,7 @@
#include "tnl/t_pipeline.h"
#include "util/ralloc.h"
#include "util/debug.h"
#include "util/disk_cache.h"
#include "isl/isl.h"
/***************************************
@@ -1129,6 +1130,8 @@ intelDestroyContext(__DRIcontext * driContextPriv)
driDestroyOptionCache(&brw->optionCache);
disk_cache_destroy(brw->ctx.Cache);
/* free the Mesa context */
_mesa_free_context_data(&brw->ctx);
@@ -1282,6 +1285,21 @@ intel_resolve_for_dri2_flush(struct brw_context *brw,
intel_miptree_prepare_external(brw, rb->mt);
} else {
intel_renderbuffer_downsample(brw, rb);
/* Call prepare_external on the single-sample miptree to do any
* needed resolves prior to handing it off to the window system.
* This is needed in the case that rb->singlesample_mt is Y-tiled
* with CCS_E enabled but without I915_FORMAT_MOD_Y_TILED_CCS_E. In
* this case, the MSAA resolve above will write compressed data into
* rb->singlesample_mt.
*
* TODO: Some day, if we decide to care about the tiny performance
* hit we're taking by doing the MSAA resolve and then a CCS resolve,
* we could detect this case and just allocate the single-sampled
* miptree without aux. However, that would be a lot of plumbing and
* this is a rather exotic case so it's not really worth it.
*/
intel_miptree_prepare_external(brw, rb->singlesample_mt);
}
}
}
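The added prepare_external call closes the gap the new comment describes: after intel_renderbuffer_downsample() the single-sampled tree may hold CCS-compressed data the window system cannot consume. A compilable sketch of the resulting flow, with stub types and functions; the enclosing single-sample test is outside the visible hunk, so the condition here is illustrative:

    #include <stdbool.h>

    struct brw_context;
    struct intel_mipmap_tree;
    struct intel_renderbuffer {
       struct intel_mipmap_tree *mt;               /* possibly multisampled */
       struct intel_mipmap_tree *singlesample_mt;  /* resolve target for MSAA */
       unsigned num_samples;
    };

    /* Stubs standing in for the real driver entry points. */
    static void prepare_external_stub(struct brw_context *brw,
                                      struct intel_mipmap_tree *mt) { (void) brw; (void) mt; }
    static void downsample_stub(struct brw_context *brw,
                                struct intel_renderbuffer *rb) { (void) brw; (void) rb; }

    static void
    resolve_for_flush(struct brw_context *brw, struct intel_renderbuffer *rb)
    {
       if (rb->num_samples <= 1) {
          /* Already single-sampled: just resolve CCS before handing it off. */
          prepare_external_stub(brw, rb->mt);
       } else {
          /* Resolve the samples first... */
          downsample_stub(brw, rb);
          /* ...then resolve any CCS the MSAA resolve wrote into the
           * single-sampled tree, since it carries no CCS modifier. */
          prepare_external_stub(brw, rb->singlesample_mt);
       }
    }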

@@ -1290,15 +1290,11 @@ struct brw_context
struct brw_fast_clear_state *fast_clear_state;
/* Array of flags telling if auxiliary buffer is disabled for corresponding
* renderbuffer. If draw_aux_buffer_disabled[i] is set then use of
* auxiliary buffer for gl_framebuffer::_ColorDrawBuffers[i] is
* disabled.
* This is needed in case the same underlying buffer is also configured
* to be sampled but with a format that the sampling engine can't treat
* compressed or fast cleared.
/* Array of aux usages to use for drawing. Aux usage for render targets is
* a bit more complex than simply calling a single function so we need some
* way of passing it from brw_draw.c to surface state setup.
*/
bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];
enum isl_aux_usage draw_aux_usage[MAX_DRAW_BUFFERS];
__DRIcontext *driContext;
struct intel_screen *screen;
@@ -1324,7 +1320,8 @@ void intel_update_renderbuffers(__DRIcontext *context,
__DRIdrawable *drawable);
void intel_prepare_render(struct brw_context *brw);
void brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering);
void brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
bool *draw_aux_buffer_disabled);
void intel_resolve_for_dri2_flush(struct brw_context *brw,
__DRIdrawable *drawable);

@@ -185,6 +185,7 @@ read_and_upload(struct brw_context *brw, struct disk_cache *cache,
}
disk_cache_remove(cache, binary_sha1);
ralloc_free(prog_data);
free(buffer);
return false;
}
@@ -236,6 +237,7 @@ read_and_upload(struct brw_context *brw, struct disk_cache *cache,
prog->program_written_to_cache = true;
ralloc_free(prog_data);
free(buffer);
return true;

@@ -341,6 +341,7 @@ brw_merge_inputs(struct brw_context *brw,
*/
static bool
intel_disable_rb_aux_buffer(struct brw_context *brw,
bool *draw_aux_buffer_disabled,
struct intel_mipmap_tree *tex_mt,
unsigned min_level, unsigned num_levels,
const char *usage)
@@ -360,7 +361,7 @@ intel_disable_rb_aux_buffer(struct brw_context *brw,
if (irb && irb->mt->bo == tex_mt->bo &&
irb->mt_level >= min_level &&
irb->mt_level < min_level + num_levels) {
found = brw->draw_aux_buffer_disabled[i] = true;
found = draw_aux_buffer_disabled[i] = true;
}
}
@@ -393,14 +394,12 @@ mark_textures_used_for_txf(BITSET_WORD *used_for_txf,
* enabled depth texture, and flush the render cache for any dirty textures.
*/
void
brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
bool *draw_aux_buffer_disabled)
{
struct gl_context *ctx = &brw->ctx;
struct intel_texture_object *tex_obj;
memset(brw->draw_aux_buffer_disabled, 0,
sizeof(brw->draw_aux_buffer_disabled));
BITSET_DECLARE(used_for_txf, MAX_COMBINED_TEXTURE_IMAGE_UNITS);
memset(used_for_txf, 0, sizeof(used_for_txf));
if (rendering) {
@@ -441,7 +440,8 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
}
const bool disable_aux = rendering &&
intel_disable_rb_aux_buffer(brw, tex_obj->mt, min_level, num_levels,
intel_disable_rb_aux_buffer(brw, draw_aux_buffer_disabled,
tex_obj->mt, min_level, num_levels,
"for sampling");
intel_miptree_prepare_texture(brw, tex_obj->mt, view_format,
@@ -482,8 +482,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
tex_obj = intel_texture_object(u->TexObj);
if (tex_obj && tex_obj->mt) {
intel_disable_rb_aux_buffer(brw, tex_obj->mt, 0, ~0,
"as a shader image");
if (rendering) {
intel_disable_rb_aux_buffer(brw, draw_aux_buffer_disabled,
tex_obj->mt, 0, ~0,
"as a shader image");
}
intel_miptree_prepare_image(brw, tex_obj->mt);
@@ -495,7 +498,8 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
}
static void
brw_predraw_resolve_framebuffer(struct brw_context *brw)
brw_predraw_resolve_framebuffer(struct brw_context *brw,
bool *draw_aux_buffer_disabled)
{
struct gl_context *ctx = &brw->ctx;
struct intel_renderbuffer *depth_irb;
@@ -547,11 +551,16 @@ brw_predraw_resolve_framebuffer(struct brw_context *brw)
bool blend_enabled = ctx->Color.BlendEnabled & (1 << i);
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
blend_enabled);
blend_enabled,
draw_aux_buffer_disabled[i]);
if (brw->draw_aux_usage[i] != aux_usage) {
brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
brw->draw_aux_usage[i] = aux_usage;
}
intel_miptree_prepare_render(brw, irb->mt, irb->mt_level,
irb->mt_layer, irb->layer_count,
isl_format, blend_enabled);
aux_usage);
brw_cache_flush_for_render(brw, irb->mt->bo,
isl_format, aux_usage);
@@ -620,16 +629,13 @@ brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
mesa_format mesa_format =
_mesa_get_render_format(ctx, intel_rb_format(irb));
enum isl_format isl_format = brw_isl_format_for_mesa_format(mesa_format);
bool blend_enabled = ctx->Color.BlendEnabled & (1 << i);
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
blend_enabled);
enum isl_aux_usage aux_usage = brw->draw_aux_usage[i];
brw_render_cache_add_bo(brw, irb->mt->bo, isl_format, aux_usage);
intel_miptree_finish_render(brw, irb->mt, irb->mt_level,
irb->mt_layer, irb->layer_count,
isl_format, blend_enabled);
aux_usage);
}
}
@@ -732,8 +738,9 @@ brw_prepare_drawing(struct gl_context *ctx,
* and finalizing textures but before setting up any hardware state for
* this draw call.
*/
brw_predraw_resolve_inputs(brw, true);
brw_predraw_resolve_framebuffer(brw);
bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS] = { };
brw_predraw_resolve_inputs(brw, true, draw_aux_buffer_disabled);
brw_predraw_resolve_framebuffer(brw, draw_aux_buffer_disabled);
/* Bind all inputs, derive varying and size information:
*/
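Across brw_context.h, brw_draw.c and the surface-state code, the per-draw aux decision is now computed once in the pre-draw framebuffer resolve, cached in brw->draw_aux_usage[], and reused by surface-state setup and the post-draw bookkeeping, with BRW_NEW_AUX_STATE flagged when it changes. A compilable toy of that caching pattern; the types, the BRW_NEW_AUX_STATE bit value, and compute_render_aux_usage() are stand-ins for the real intel_miptree_render_aux_usage() and context fields:

    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_DRAW_BUFFERS 8
    #define BRW_NEW_AUX_STATE (1u << 0)   /* illustrative bit */

    enum isl_aux_usage { ISL_AUX_USAGE_NONE, ISL_AUX_USAGE_CCS_D, ISL_AUX_USAGE_CCS_E };

    struct toy_brw_context {
       uint64_t           new_driver_state;
       enum isl_aux_usage draw_aux_usage[MAX_DRAW_BUFFERS];
    };

    /* Stand-in: honours the per-RT "aux disabled because the same BO is
     * also bound as a texture" flag; the blend-based choice is invented. */
    static enum isl_aux_usage
    compute_render_aux_usage(bool blend_enabled, bool draw_aux_disabled)
    {
       if (draw_aux_disabled)
          return ISL_AUX_USAGE_NONE;
       return blend_enabled ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_CCS_E;
    }

    /* Pre-draw: decide once per render target, remember it, and flag state
     * re-emission only when the answer actually changed. */
    static void
    predraw_resolve_framebuffer(struct toy_brw_context *brw,
                                const bool *draw_aux_buffer_disabled,
                                const bool *blend_enabled, unsigned num_rts)
    {
       for (unsigned i = 0; i < num_rts; i++) {
          enum isl_aux_usage aux_usage =
             compute_render_aux_usage(blend_enabled[i], draw_aux_buffer_disabled[i]);
          if (brw->draw_aux_usage[i] != aux_usage) {
             brw->new_driver_state |= BRW_NEW_AUX_STATE;
             brw->draw_aux_usage[i] = aux_usage;
          }
          /* prepare_render / cache-flush calls would use aux_usage here. */
       }
    }

    /* Post-draw and surface-state setup then simply read the cached value:
     *    enum isl_aux_usage aux = brw->draw_aux_usage[unit];              */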

@@ -317,6 +317,53 @@ gen7_emit_vs_workaround_flush(struct brw_context *brw)
brw->workaround_bo, 0, 0);
}
/**
* From the PRM, Volume 2a:
*
* "Indirect State Pointers Disable
*
* At the completion of the post-sync operation associated with this pipe
* control packet, the indirect state pointers in the hardware are
* considered invalid; the indirect pointers are not saved in the context.
* If any new indirect state commands are executed in the command stream
* while the pipe control is pending, the new indirect state commands are
* preserved.
*
* [DevIVB+]: Using Invalidate State Pointer (ISP) only inhibits context
* restoring of Push Constant (3DSTATE_CONSTANT_*) commands. Push Constant
* commands are only considered as Indirect State Pointers. Once ISP is
* issued in a context, SW must initialize by programming push constant
* commands for all the shaders (at least to zero length) before attempting
* any rendering operation for the same context."
*
* 3DSTATE_CONSTANT_* packets are restored during a context restore,
* even though they point to a BO that has been already unreferenced at
* the end of the previous batch buffer. This has been fine so far since
* we are protected by the scratch page (every address not covered by
* a BO should be pointing to the scratch page). But on CNL, it is
* causing a GPU hang during context restore at the 3DSTATE_CONSTANT_*
* instruction.
*
* The flag "Indirect State Pointers Disable" in PIPE_CONTROL tells the
* hardware to ignore previous 3DSTATE_CONSTANT_* packets during a
* context restore, so the mentioned hang doesn't happen. However,
* software must program push constant commands for all stages prior to
* rendering anything, so we flag them as dirty.
*/
void
gen10_emit_isp_disable(struct brw_context *brw)
{
brw_emit_pipe_control(brw,
PIPE_CONTROL_ISP_DIS |
PIPE_CONTROL_CS_STALL,
NULL, 0, 0);
brw->vs.base.push_constants_dirty = true;
brw->tcs.base.push_constants_dirty = true;
brw->tes.base.push_constants_dirty = true;
brw->gs.base.push_constants_dirty = true;
brw->wm.base.push_constants_dirty = true;
}
/**
* Emit a PIPE_CONTROL command for gen7 with the CS Stall bit set.

@@ -85,5 +85,6 @@ void brw_emit_post_sync_nonzero_flush(struct brw_context *brw);
void brw_emit_depth_stall_flushes(struct brw_context *brw);
void gen7_emit_vs_workaround_flush(struct brw_context *brw);
void gen7_emit_cs_stall_flush(struct brw_context *brw);
void gen10_emit_isp_disable(struct brw_context *brw);
#endif

@@ -460,6 +460,7 @@ brw_program_cache_check_size(struct brw_context *brw)
perf_debug("Exceeded state cache size limit. Clearing the set "
"of compiled programs, which will trigger recompiles\n");
brw_clear_cache(brw, &brw->cache);
brw_cache_new_bo(&brw->cache, brw->cache.bo->size);
}
}

@@ -229,11 +229,6 @@ gen6_update_renderbuffer_surface(struct brw_context *brw,
}
enum isl_format isl_format = brw->mesa_to_isl_render_format[rb_format];
enum isl_aux_usage aux_usage =
brw->draw_aux_buffer_disabled[unit] ? ISL_AUX_USAGE_NONE :
intel_miptree_render_aux_usage(brw, mt, isl_format,
ctx->Color.BlendEnabled & (1 << unit));
struct isl_view view = {
.format = isl_format,
.base_level = irb->mt_level - irb->mt->first_level,
@@ -245,7 +240,8 @@ gen6_update_renderbuffer_surface(struct brw_context *brw,
};
uint32_t offset;
brw_emit_surface_state(brw, mt, mt->target, view, aux_usage,
brw_emit_surface_state(brw, mt, mt->target, view,
brw->draw_aux_usage[unit],
&offset, surf_index,
RELOC_WRITE);
return offset;
@@ -441,25 +437,7 @@ swizzle_to_scs(GLenum swizzle, bool need_green_to_blue)
return (need_green_to_blue && scs == HSW_SCS_GREEN) ? HSW_SCS_BLUE : scs;
}
static bool
brw_aux_surface_disabled(const struct brw_context *brw,
const struct intel_mipmap_tree *mt)
{
const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
const struct intel_renderbuffer *irb =
intel_renderbuffer(fb->_ColorDrawBuffers[i]);
if (irb && irb->mt == mt)
return brw->draw_aux_buffer_disabled[i];
}
return false;
}
static void
brw_update_texture_surface(struct gl_context *ctx,
static void brw_update_texture_surface(struct gl_context *ctx,
unsigned unit,
uint32_t *surf_offset,
bool for_gather,
@@ -588,9 +566,6 @@ brw_update_texture_surface(struct gl_context *ctx,
enum isl_aux_usage aux_usage =
intel_miptree_texture_aux_usage(brw, mt, format);
if (brw_aux_surface_disabled(brw, mt))
aux_usage = ISL_AUX_USAGE_NONE;
brw_emit_surface_state(brw, mt, mt->target, view, aux_usage,
surf_offset, surf_index,
0);
@@ -1069,7 +1044,7 @@ update_renderbuffer_read_surfaces(struct brw_context *brw)
enum isl_aux_usage aux_usage =
intel_miptree_texture_aux_usage(brw, irb->mt, format);
if (brw->draw_aux_buffer_disabled[i])
if (brw->draw_aux_usage[i] == ISL_AUX_USAGE_NONE)
aux_usage = ISL_AUX_USAGE_NONE;
brw_emit_surface_state(brw, irb->mt, target, view, aux_usage,

@@ -364,11 +364,15 @@ is_passthru_format(uint32_t format)
}
UNUSED static int
uploads_needed(uint32_t format)
uploads_needed(uint32_t format,
bool is_dual_slot)
{
if (!is_passthru_format(format))
return 1;
if (is_dual_slot)
return 2;
switch (format) {
case ISL_FORMAT_R64_PASSTHRU:
case ISL_FORMAT_R64G64_PASSTHRU:
@@ -397,11 +401,19 @@ downsize_format_if_needed(uint32_t format,
if (!is_passthru_format(format))
return format;
/* Seeing ISL_FORMAT_R64_PASSTHRU or ISL_FORMAT_R64G64_PASSTHRU with
* upload == 1 means that we have been forced to do 2 uploads for a size
* <= 2. This happens on gen < 8 with dvec3 or dvec4 vertex shader input
* variables. In those cases, we return ISL_FORMAT_R32_FLOAT as a way of
* flagging that this second forced upload should be filled with zeroes.
*/
switch (format) {
case ISL_FORMAT_R64_PASSTHRU:
return ISL_FORMAT_R32G32_FLOAT;
return !upload ? ISL_FORMAT_R32G32_FLOAT
: ISL_FORMAT_R32_FLOAT;
case ISL_FORMAT_R64G64_PASSTHRU:
return ISL_FORMAT_R32G32B32A32_FLOAT;
return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
: ISL_FORMAT_R32_FLOAT;
case ISL_FORMAT_R64G64B64_PASSTHRU:
return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
: ISL_FORMAT_R32G32_FLOAT;
@@ -420,6 +432,15 @@ static int
upload_format_size(uint32_t upload_format)
{
switch (upload_format) {
case ISL_FORMAT_R32_FLOAT:
/* downsize_format_if_needed() has returned this one in order to flag that
* we are performing a second upload which we want to have filled with
* zeroes. This happens with gen < 8, a size <= 2, and dvec3 or dvec4
* vertex shader input variables.
*/
return 0;
case ISL_FORMAT_R32G32_FLOAT:
return 2;
case ISL_FORMAT_R32G32B32A32_FLOAT:
@@ -517,7 +538,7 @@ genX(emit_vertices)(struct brw_context *brw)
struct brw_vertex_element *input = brw->vb.enabled[i];
uint32_t format = brw_get_vertex_surface_type(brw, input->glarray);
if (uploads_needed(format) > 1)
if (uploads_needed(format, input->is_dual_slot) > 1)
nr_elements++;
}
#endif
@@ -613,7 +634,8 @@ genX(emit_vertices)(struct brw_context *brw)
uint32_t comp1 = VFCOMP_STORE_SRC;
uint32_t comp2 = VFCOMP_STORE_SRC;
uint32_t comp3 = VFCOMP_STORE_SRC;
const unsigned num_uploads = GEN_GEN < 8 ? uploads_needed(format) : 1;
const unsigned num_uploads = GEN_GEN < 8 ?
uploads_needed(format, input->is_dual_slot) : 1;
#if GEN_GEN >= 8
/* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):
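The uploads_needed() / downsize_format_if_needed() / upload_format_size() changes above work together: on gen < 8 a 64-bit attribute that occupies two input slots gets two VERTEX_ELEMENT uploads, and when the source only fills the first slot the second upload is downgraded to a zero-sized R32_FLOAT so it is padded with zeroes. A compilable worked example of the resulting table; formats are reduced to a local enum, non-passthru formats (which always need a single upload) are not modelled, and the R64G64B64A64 case is assumed to keep using R32G32B32A32 for both uploads:

    #include <stdbool.h>
    #include <stdio.h>

    enum fmt {
       R64_PASSTHRU, R64G64_PASSTHRU, R64G64B64_PASSTHRU, R64G64B64A64_PASSTHRU,
       R32_FLOAT, R32G32_FLOAT, R32G32B32A32_FLOAT
    };

    static int uploads_needed(enum fmt f, bool is_dual_slot)
    {
       if (is_dual_slot)
          return 2;
       switch (f) {
       case R64_PASSTHRU:
       case R64G64_PASSTHRU: return 1;
       default:              return 2;   /* dvec3/dvec4 data */
       }
    }

    static enum fmt downsize(enum fmt f, int upload)
    {
       switch (f) {
       case R64_PASSTHRU:       return upload == 0 ? R32G32_FLOAT       : R32_FLOAT;
       case R64G64_PASSTHRU:    return upload == 0 ? R32G32B32A32_FLOAT : R32_FLOAT;
       case R64G64B64_PASSTHRU: return upload == 0 ? R32G32B32A32_FLOAT : R32G32_FLOAT;
       default:                 return R32G32B32A32_FLOAT;
       }
    }

    static int upload_size(enum fmt f)
    {
       switch (f) {
       case R32_FLOAT:          return 0;  /* forced second upload: pad with zeroes */
       case R32G32_FLOAT:       return 2;
       case R32G32B32A32_FLOAT: return 4;
       default:                 return -1;
       }
    }

    int main(void)
    {
       /* A dvec4 shader input fed with only two doubles (R64G64 data) is
        * dual-slot: two uploads, the second contributes no real components. */
       enum fmt f = R64G64_PASSTHRU;
       for (int u = 0; u < uploads_needed(f, true); u++)
          printf("upload %d: %d components\n", u, upload_size(downsize(f, u)));
       return 0;
    }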

@@ -764,6 +764,10 @@ brw_finish_batch(struct brw_context *brw)
brw_emit_pipe_control_flush(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH |
PIPE_CONTROL_CS_STALL);
}
/* Do not restore push constant packets during context restore. */
if (devinfo->gen == 10)
gen10_emit_isp_disable(brw);
}
/* Emit MI_BATCH_BUFFER_END to finish our batch. Note that execbuf2

@@ -2682,10 +2682,14 @@ enum isl_aux_usage
intel_miptree_render_aux_usage(struct brw_context *brw,
struct intel_mipmap_tree *mt,
enum isl_format render_format,
bool blend_enabled)
bool blend_enabled,
bool draw_aux_disabled)
{
struct gen_device_info *devinfo = &brw->screen->devinfo;
if (draw_aux_disabled)
return ISL_AUX_USAGE_NONE;
switch (mt->aux_usage) {
case ISL_AUX_USAGE_MCS:
assert(mt->mcs_buf);
@@ -2724,11 +2728,8 @@ void
intel_miptree_prepare_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled)
enum isl_aux_usage aux_usage)
{
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, mt, render_format, blend_enabled);
intel_miptree_prepare_access(brw, mt, level, 1, start_layer, layer_count,
aux_usage, aux_usage != ISL_AUX_USAGE_NONE);
}
@@ -2737,13 +2738,10 @@ void
intel_miptree_finish_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled)
enum isl_aux_usage aux_usage)
{
assert(_mesa_is_format_color_format(mt->format));
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, mt, render_format, blend_enabled);
intel_miptree_finish_write(brw, mt, level, start_layer, layer_count,
aux_usage);
}
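With this refactor the blend/format decision and the per-draw disable all happen inside intel_miptree_render_aux_usage(), and prepare_render()/finish_render() just receive the already-chosen enum isl_aux_usage, so the call sites can no longer disagree. A before/after illustration of the call pattern, paraphrased from the brw_blorp.c and brw_draw.c hunks rather than copied verbatim:

    /* Before: each callee re-derived the aux usage from (format, blend):
     *    intel_miptree_prepare_render(brw, mt, level, layer, n, format, blend);
     *    ... draw ...
     *    intel_miptree_finish_render(brw, mt, level, layer, n, format, blend);
     *
     * After: the caller decides once, including the draw_aux_disabled case,
     * and both callees simply honour it:
     *    enum isl_aux_usage aux_usage =
     *       intel_miptree_render_aux_usage(brw, mt, format, blend,
     *                                      draw_aux_buffer_disabled[i]);
     *    intel_miptree_prepare_render(brw, mt, level, layer, n, aux_usage);
     *    ... draw ...
     *    intel_miptree_finish_render(brw, mt, level, layer, n, aux_usage);
     */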

@@ -652,19 +652,18 @@ enum isl_aux_usage
intel_miptree_render_aux_usage(struct brw_context *brw,
struct intel_mipmap_tree *mt,
enum isl_format render_format,
bool blend_enabled);
bool blend_enabled,
bool draw_aux_disabled);
void
intel_miptree_prepare_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled);
enum isl_aux_usage aux_usage);
void
intel_miptree_finish_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled);
enum isl_aux_usage aux_usage);
void
intel_miptree_prepare_depth(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,

@@ -1776,8 +1776,8 @@ intel_init_bufmgr(struct intel_screen *screen)
return false;
}
if (!intel_get_boolean(screen, I915_PARAM_HAS_WAIT_TIMEOUT)) {
fprintf(stderr, "[%s: %u] Kernel 3.6 required.\n", __func__, __LINE__);
if (!intel_get_boolean(screen, I915_PARAM_HAS_EXEC_NO_RELOC)) {
fprintf(stderr, "[%s: %u] Kernel 3.9 required.\n", __func__, __LINE__);
return false;
}

@@ -67,12 +67,17 @@ endif
# This needs to be installed if any dri drivers (including gallium dri drivers)
# are built.
if with_dri
dri_req_private = []
if dep_libdrm.found()
dri_req_private = ['libdrm >= 2.4.75'] # FIXME: don't hardcode this
endif
pkg.generate(
name : 'dri',
filebase : 'dri',
description : 'Direct Rendering Infrastructure',
version : meson.project_version(),
variables : ['dridriverdir=${prefix}/' + dri_drivers_path],
requires_private : ['libdrm >= 2.4.75'], # FIXME: don't hardcode this
requires_private : dri_req_private,
)
endif

@@ -330,12 +330,6 @@ r100CreateContext( gl_api api,
rmesa->radeon.do_usleeps = (fthrottle_mode == DRI_CONF_FTHROTTLE_USLEEPS);
#if DO_DEBUG
RADEON_DEBUG = parse_debug_string( getenv( "RADEON_DEBUG" ),
debug_control );
#endif
tcl_mode = driQueryOptioni(&rmesa->radeon.optionCache, "tcl_mode");
if (driQueryOptionb(&rmesa->radeon.optionCache, "no_rast")) {
fprintf(stderr, "disabling 3D acceleration\n");

@@ -1568,15 +1568,9 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format,
if (format == GL_RGBA && type == GL_UNSIGNED_SHORT_4_4_4_4 && !swapBytes)
return GL_TRUE;
if (format == GL_RGBA && type == GL_UNSIGNED_SHORT_4_4_4_4_REV && swapBytes)
return GL_TRUE;
if (format == GL_ABGR_EXT && type == GL_UNSIGNED_SHORT_4_4_4_4_REV && !swapBytes)
return GL_TRUE;
if (format == GL_ABGR_EXT && type == GL_UNSIGNED_SHORT_4_4_4_4 && swapBytes)
return GL_TRUE;
return GL_FALSE;
case MESA_FORMAT_R4G4B4A4_UNORM:

@@ -721,7 +721,7 @@ libmesa_gallium = static_library(
cpp_args : [cpp_vis_args, cpp_msvc_compat_args],
include_directories : [inc_common, include_directories('main')],
link_with : [libglsl, libmesa_sse41],
dependencies : idep_nir_headers,
dependencies : [idep_nir_headers, dep_vdpau],
build_by_default : false,
)

@@ -757,8 +757,8 @@ st_init_driver_functions(struct pipe_screen *screen,
functions->UpdateState = st_invalidate_state;
functions->QueryMemoryInfo = st_query_memory_info;
functions->SetBackgroundContext = st_set_background_context;
functions->GetDriverUuid = st_get_device_uuid;
functions->GetDeviceUuid = st_get_driver_uuid;
functions->GetDriverUuid = st_get_driver_uuid;
functions->GetDeviceUuid = st_get_device_uuid;
/* GL_ARB_get_program_binary */
functions->GetProgramBinaryDriverSHA1 = st_get_program_binary_driver_sha1;

@@ -142,10 +142,11 @@ read_stream_out_from_cache(struct blob_reader *blob_reader,
static void
read_tgsi_from_cache(struct blob_reader *blob_reader,
const struct tgsi_token **tokens)
const struct tgsi_token **tokens,
unsigned *num_tokens)
{
uint32_t num_tokens = blob_read_uint32(blob_reader);
unsigned tokens_size = num_tokens * sizeof(struct tgsi_token);
*num_tokens = blob_read_uint32(blob_reader);
unsigned tokens_size = *num_tokens * sizeof(struct tgsi_token);
*tokens = (const struct tgsi_token*) MALLOC(tokens_size);
blob_copy_bytes(blob_reader, (uint8_t *) *tokens, tokens_size);
}
@@ -175,7 +176,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
sizeof(stvp->result_to_output));
read_stream_out_from_cache(&blob_reader, &stvp->tgsi);
read_tgsi_from_cache(&blob_reader, &stvp->tgsi.tokens);
read_tgsi_from_cache(&blob_reader, &stvp->tgsi.tokens,
&stvp->num_tgsi_tokens);
if (st->vp == stvp)
st->dirty |= ST_NEW_VERTEX_PROGRAM(st, stvp);
@@ -189,7 +191,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
&sttcp->variants, &sttcp->tgsi);
read_stream_out_from_cache(&blob_reader, &sttcp->tgsi);
read_tgsi_from_cache(&blob_reader, &sttcp->tgsi.tokens);
read_tgsi_from_cache(&blob_reader, &sttcp->tgsi.tokens,
&sttcp->num_tgsi_tokens);
if (st->tcp == sttcp)
st->dirty |= sttcp->affected_states;
@@ -203,7 +206,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
&sttep->variants, &sttep->tgsi);
read_stream_out_from_cache(&blob_reader, &sttep->tgsi);
read_tgsi_from_cache(&blob_reader, &sttep->tgsi.tokens);
read_tgsi_from_cache(&blob_reader, &sttep->tgsi.tokens,
&sttep->num_tgsi_tokens);
if (st->tep == sttep)
st->dirty |= sttep->affected_states;
@@ -217,7 +221,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
&stgp->tgsi);
read_stream_out_from_cache(&blob_reader, &stgp->tgsi);
read_tgsi_from_cache(&blob_reader, &stgp->tgsi.tokens);
read_tgsi_from_cache(&blob_reader, &stgp->tgsi.tokens,
&stgp->num_tgsi_tokens);
if (st->gp == stgp)
st->dirty |= stgp->affected_states;
@@ -229,7 +234,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
st_release_fp_variants(st, stfp);
read_tgsi_from_cache(&blob_reader, &stfp->tgsi.tokens);
read_tgsi_from_cache(&blob_reader, &stfp->tgsi.tokens,
&stfp->num_tgsi_tokens);
if (st->fp == stfp)
st->dirty |= stfp->affected_states;
@@ -242,7 +248,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
st_release_cp_variants(st, stcp);
read_tgsi_from_cache(&blob_reader,
(const struct tgsi_token**) &stcp->tgsi.prog);
(const struct tgsi_token**) &stcp->tgsi.prog,
&stcp->num_tgsi_tokens);
stcp->tgsi.req_local_mem = stcp->Base.info.cs.shared_size;
stcp->tgsi.req_private_mem = 0;
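The reader change above stops discarding the token count: it is read from the blob, used to size the copy, and now also handed back so the program object remembers how many TGSI tokens it owns. A small self-contained toy of that read pattern; the real blob_reader API lives in src/compiler/blob.h, and the toy_* names here only mimic the shape used by read_tgsi_from_cache():

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct toy_blob_reader { const uint8_t *data; size_t offset; };

    static uint32_t toy_read_u32(struct toy_blob_reader *b)
    {
       uint32_t v;
       memcpy(&v, b->data + b->offset, sizeof(v));
       b->offset += sizeof(v);
       return v;
    }

    static void toy_copy_bytes(struct toy_blob_reader *b, void *dst, size_t n)
    {
       memcpy(dst, b->data + b->offset, n);
       b->offset += n;
    }

    /* Like read_tgsi_from_cache() after the change: the element count sizes
     * the allocation and is also returned through *count_out. */
    static uint32_t *toy_read_counted_u32s(struct toy_blob_reader *b, unsigned *count_out)
    {
       *count_out = toy_read_u32(b);
       uint32_t *buf = malloc(*count_out * sizeof(uint32_t));
       toy_copy_bytes(b, buf, *count_out * sizeof(uint32_t));
       return buf;
    }

    int main(void)
    {
       const uint32_t serialized[] = { 3, 10, 20, 30 };   /* count, then payload */
       struct toy_blob_reader r = { (const uint8_t *) serialized, 0 };
       unsigned count;
       uint32_t *payload = toy_read_counted_u32s(&r, &count);
       printf("%u elements, first = %u\n", count, payload[0]);   /* 3, 10 */
       free(payload);
       return 0;
    }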

@@ -74,7 +74,8 @@ struct vbo_save_vertex_list {
GLuint current_size;
GLuint buffer_offset; /**< in bytes */
GLuint vertex_count;
GLuint start_vertex; /**< first vertex used by any primitive */
GLuint vertex_count; /**< number of vertices in this list */
GLuint wrap_count; /* number of copied vertices at start */
GLboolean dangling_attr_ref; /* current attr implicitly referenced
outside the list */

@@ -558,6 +558,9 @@ compile_vertex_list(struct gl_context *ctx)
for (unsigned i = 0; i < save->prim_count; i++) {
save->prims[i].start += start_offset;
}
node->start_vertex = start_offset;
} else {
node->start_vertex = 0;
}
/* Reset our structures for the next run of vertices:

@@ -325,13 +325,14 @@ vbo_save_playback_vertex_list(struct gl_context *ctx, void *data)
_mesa_update_state(ctx);
if (node->vertex_count > 0) {
GLuint min_index = node->start_vertex;
GLuint max_index = min_index + node->vertex_count - 1;
vbo_context(ctx)->draw_prims(ctx,
node->prims,
node->prim_count,
NULL,
GL_TRUE,
0, /* Node is a VBO, so this is ok */
node->vertex_count - 1,
min_index, max_index,
NULL, 0, NULL);
}
}
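The new start_vertex field lets playback pass honest index bounds to draw_prims(): the display list's vertices live at some offset inside a shared VBO, so the minimum referenced vertex is no longer 0. A tiny worked example of the bounds computed in the hunk above; names are shortened and draw_prims itself is not modelled:

    #include <stdio.h>

    struct toy_vertex_list {
       unsigned start_vertex;   /* first vertex used by any primitive in the list */
       unsigned vertex_count;   /* number of vertices in the list */
    };

    int main(void)
    {
       /* Say this list's vertices were packed into the shared VBO after 100
        * vertices from earlier lists. */
       struct toy_vertex_list node = { .start_vertex = 100, .vertex_count = 36 };

       unsigned min_index = node.start_vertex;                  /* 100 */
       unsigned max_index = min_index + node.vertex_count - 1;  /* 135 */

       /* Previously the bounds were reported as 0 .. vertex_count - 1 even
        * though the primitives' start indices had been offset into the
        * shared VBO, so the reported range did not match what was drawn. */
       printf("index range: [%u, %u]\n", min_index, max_index);
       return 0;
    }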

@@ -58,7 +58,18 @@ build_id_find_nhdr_callback(struct dl_phdr_info *info, size_t size, void *data_)
{
struct callback_data *data = data_;
if ((void *)info->dlpi_addr != data->dli_fbase)
/* Calculate address where shared object is mapped into the process space.
* (Using the base address and the virtual address of the first LOAD segment)
*/
void *map_start = NULL;
for (unsigned i = 0; i < info->dlpi_phnum; i++) {
if (info->dlpi_phdr[i].p_type == PT_LOAD) {
map_start = (void *)(info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
break;
}
}
if (map_start != data->dli_fbase)
return 0;
for (unsigned i = 0; i < info->dlpi_phnum; i++) {
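The callback change above accounts for objects whose first PT_LOAD segment has a non-zero virtual address: dladdr()'s dli_fbase is where the object is actually mapped, while dlpi_addr is only the load bias, so the two match only once the first load segment's p_vaddr is added (for a non-PIE executable, for instance, the bias is 0 but the first PT_LOAD sits at its fixed link-time address, which is presumably what broke the old comparison). A small standalone program using the same calculation; on glibc, dl_iterate_phdr lives in libc and needs no extra libraries:

    #define _GNU_SOURCE
    #include <link.h>
    #include <stdio.h>

    /* For each loaded object, compute where it is mapped: the load bias
     * (dlpi_addr) plus the virtual address of its first PT_LOAD segment. */
    static int
    print_map_start(struct dl_phdr_info *info, size_t size, void *data)
    {
       (void) size; (void) data;
       for (unsigned i = 0; i < info->dlpi_phnum; i++) {
          if (info->dlpi_phdr[i].p_type == PT_LOAD) {
             void *map_start =
                (void *) (info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
             printf("%-40s bias=%#lx first PT_LOAD vaddr=%#lx map_start=%p\n",
                    info->dlpi_name[0] ? info->dlpi_name : "[main executable]",
                    (unsigned long) info->dlpi_addr,
                    (unsigned long) info->dlpi_phdr[i].p_vaddr, map_start);
             break;
          }
       }
       return 0;   /* keep iterating */
    }

    int main(void)
    {
       dl_iterate_phdr(print_map_start, NULL);
       return 0;
    }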

@@ -112,8 +112,12 @@ libxmlconfig = static_library(
files_xmlconfig,
include_directories : inc_common,
dependencies : [dep_expat, dep_m],
c_args : [c_msvc_compat_args, c_vis_args,
'-DSYSCONFDIR="@0@"'.format(get_option('sysconfdir'))],
c_args : [
c_msvc_compat_args, c_vis_args,
'-DSYSCONFDIR="@0@"'.format(
join_paths(get_option('prefix'), get_option('sysconfdir'))
),
],
build_by_default : false,
)