Compare commits

426 Commits
21.0 ... 19.1

Author SHA1 Message Date
Juan A. Suarez Romero
cc88eeb6ff docs: add release notes for 19.1.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-21 19:10:28 +02:00
Juan A. Suarez Romero
5c6d266c59 docs: add release notes for 19.1.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-21 13:55:11 +02:00
Juan A. Suarez Romero
6cffdfd192 Update version to 19.1.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-21 11:33:41 +00:00
Alan Coopersmith
dfcde49122 intel/common: include unistd.h for ioctl() prototype on Solaris
Fixes build errors of:
In file included from ../src/intel/vulkan/anv_private.h:48,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
../src/intel/common/gen_gem.h: In function ‘gen_ioctl’:
../src/intel/common/gen_gem.h:68:15: error: implicit declaration of function ‘ioctl’ [-Werror=implicit-function-declaration]
   68 |         ret = ioctl(fd, request, arg);
      |               ^~~~~
In file included from ../include/c11/threads_posix.h:35,
                 from ../include/c11/threads.h:66,
                 from ../src/mesa/main/mtypes.h:39,
                 from ../src/intel/compiler/brw_compiler.h:30,
                 from ../src/intel/vulkan/anv_private.h:51,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
/usr/include/unistd.h: At top level:
/usr/include/unistd.h:471:12: error: conflicting types for ‘ioctl’
  471 | extern int ioctl(int, int, ...);
      |            ^~~~~
/usr/include/unistd.h:471:1: note: a parameter list with an ellipsis can’t match an empty parameter name list declaration
  471 | extern int ioctl(int, int, ...);
      | ^~~~~~
In file included from ../src/intel/vulkan/anv_private.h:48,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
../src/intel/common/gen_gem.h:68:15: note: previous implicit declaration of ‘ioctl’ was here
   68 |         ret = ioctl(fd, request, arg);
      |               ^~~~~

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 6804b8e1ff)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/intel/common/gen_gem.h
2019-10-16 17:36:16 +02:00
Alan Coopersmith
9c100e31a2 meson: recognize "sunos" as the system name for Solaris
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit d8a9420f6f)
[Juan A. Suarez: resolve trivial conflict]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	meson.build
2019-10-16 15:32:51 +00:00
Matt Turner
2fd001f21e util: Drop preprocessor guards for glibc-2.12
glibc-2.12 was released in 2010. No one is building new Mesa against 9
year old glibc, and removing these checks allows the code to work on
other C libraries like musl.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 9c411e020d)
2019-10-16 15:26:21 +00:00
Alan Coopersmith
13120904e4 util: Workaround lack of flock on Solaris
v2: Replace autoconf check for flock() with meson check

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit b3028a9fb8)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	meson.build
2019-10-16 14:52:59 +00:00
Alan Coopersmith
740b0e9dc7 util: Make Solaris implementation of p_atomic_add work with gcc
gcc is very particular about where you place the (void) cast.
The previous placement made it error out with:

In file included from disk_cache.c:40:0:
../../src/util/u_atomic.h:203:29: error: void value not ignored as it ought to be
 #define p_atomic_add(v, i) ((void)         \
                              ^
disk_cache.c:658:4: note: in expansion of macro ‘p_atomic_add’
    p_atomic_add(cache->size, size);
    ^

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit a56c3e3a47)
2019-10-16 14:49:34 +00:00
Alan Coopersmith
9eaa6998cc c99_compat.h: Don't try to use 'restrict' in C++ code
Fixes build failures on Solaris in C++ files using gcc:

../src/util/u_math.h:628:41: error: expected ‘,’ or ‘...’ before ‘dest’
  628 | util_memcpy_cpu_to_le32(void * restrict dest, const void * restrict src, size_t n)
      |                                         ^~~~
../src/util/u_math.h: In function ‘void* util_memcpy_cpu_to_le32(void*)’:
../src/util/u_math.h:641:18: error: ‘dest’ was not declared in this scope
  641 |    return memcpy(dest, src, n);
      |                  ^~~~
../src/util/u_math.h:641:24: error: ‘src’ was not declared in this scope
  641 |    return memcpy(dest, src, n);
      |                        ^~~
../src/util/u_math.h:641:29: error: ‘n’ was not declared in this scope; did you mean ‘yn’?
  641 |    return memcpy(dest, src, n);
      |                             ^
      |                             yn

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit ddde652e70)
2019-10-16 14:46:47 +00:00
Juan A. Suarez Romero
e56b3afd2d cherry-ignore: Revert "radv: disable viewport clamping even if FS doesn't write Z"
Revert: this commit was explicitly requested to be removed from the
branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-07 14:30:05 +00:00
Lionel Landwerlin
d169d0df0e intel/isl: Set null surface format to R32_UINT
It appears we never had a test in piglit or deqp sampling from a null
surface...

It turns out this triggers a hang on IVB only. Updating the null
surface format to R32_UINT fixes the hang on ivb and doesn't affect
other platforms, so set it by default for all platforms.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1872
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c445d6f66e)
2019-10-07 16:27:08 +02:00
Lionel Landwerlin
3d763e801c intel: fix subslice computation from topology data
We're missing the offset of the slice in the subslice mask...

This worked for most platforms that don't have first slice fused off
because we would reread the same mask from slice0 again and again...
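The indexing mistake described above can be sketched as follows. This is a minimal illustration, not Mesa's code: the flat byte layout, the function names, and the parameters `subslice_offset` and `subslice_stride` are assumptions made for the example.

```python
def subslice_mask(data, subslice_offset, subslice_stride, slice_index):
    """Return the subslice mask bytes belonging to one slice.

    Hypothetical layout: every slice owns `subslice_stride` bytes,
    stored back to back starting at `subslice_offset`.
    """
    start = subslice_offset + slice_index * subslice_stride
    return data[start:start + subslice_stride]


def subslice_mask_without_slice_offset(data, subslice_offset,
                                       subslice_stride, slice_index):
    """Buggy variant: ignores slice_index, so it always reads slice 0."""
    return data[subslice_offset:subslice_offset + subslice_stride]


# Hypothetical topology blob: one slice-mask byte, then one subslice-mask
# byte per slice. Slice 0 is fused off (0x00); slice 1 has three subslices.
topology = bytes([0b10, 0x00, 0b0111])
```

With slice 0 fused off, the buggy variant reports no subslices for slice 1, while the corrected indexing returns the real mask. On parts where slice 0 is present and all slices share the same mask, both variants agree, which matches the commit's remark that the bug went unnoticed on most platforms.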

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c1900f5b0f ("intel: devinfo: add helper functions to fill fusing masks values")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1869
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit d36763b2a4)
2019-10-07 16:27:07 +02:00
Prodea Alexandru-Liviu
9b75c1eaef scons/MSYS2-MinGW-W64: Fix build options defaults
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>

When building in a MSYS2 Mingw-w64 environment Mesa3D sets wrong default build options which inevitably lead to build failure.

(cherry picked from commit 6309c31fd8)
2019-10-07 16:27:07 +02:00
Dylan Baker
7e3d942403 meson: Only error building gallium video without libdrm when the platform is drm
Fixes: 3b265f61f5
       ("meson: gallium media state trackers require libdrm with x11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1878
Tested-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit 1481d05409)
2019-10-07 16:27:07 +02:00
Andres Gomez
142e51da08 egl: Remove the 565 pbuffer-only EGL config under X11.
The CTS finally has agreed to drop the requirement for a
565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the
code to satisfy this requirement using a pbuffer-only visual with
whatever other buffers the driver happens to have given us.

This reverts commit 82607f8a90,
commit 6ad31c4ff3 and
commit dacb11a585.

v2:
  - Reference the VK-GL-CTS issue (Eric E.).

v3:
  - Don't revert
    fc21394bc4 ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers")
    (Kenneth).

References: VK-GL-CTS issue 1601.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 02c265be9d)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/egl/drivers/dri2/platform_x11.c
2019-10-07 16:27:07 +02:00
Juan A. Suarez Romero
844c594837 cherry-ignore: radv: Fix condition for skipping the continue CS.
Fixes: this commit depends on commit e1dc3ab753 in order to compile,
which did not land in the branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-07 16:27:07 +02:00
Lionel Landwerlin
9fc609585d mesa: don't forget to clear _Layer field on texture unit
On the Android Antutu benchmark we ran into an assert in ISL where the
(base layer + num layers) > total layers. It turns out the core of
mesa forgot to clear the _Layer variable, potentially leaving an
inconsistent value.

v2: Pull setting u->_Layer out of the conditional blocks (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 2208d79dde)
2019-10-02 09:41:27 -04:00
Ken Mays
ab1ae12790 haiku: fix Mesa build
1. The hgl.c file is a read-only file versus read-write.
Ref: src/gallium/state_trackers/hgl/hgl.c

2. I've included the Haiku-specific patches I used to get a successful
build of Mesa 19.1.7 on Haiku using the meson/ninja build procedure.
Shows "[764/764] linking target ... libswpipe.so" at build completion.

v2:
Remove autotools files (Eric)

v3:
Update the patch

Reported-by: Ken Mays <kmays2000@gmail.com>
Tested-by: Ken Mays <kmays2000@gmail.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Alexander von Gluck IV <kallisti5@unixzen.com>
(cherry picked from commit 4943c89d6d)
2019-10-02 09:41:27 -04:00
Kenneth Graunke
9bc34d54db iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.
We can't just check for the BO base address, we need to check for the
full address including any offset we may have applied.  When updating
the address, we need to include the offset again.
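A minimal sketch of the comparison described above, assuming a hypothetical `VertexBufferBinding` record that stores the full GPU address (BO base plus offset). The names are illustrative and do not come from the iris driver.

```python
from dataclasses import dataclass


@dataclass
class VertexBufferBinding:
    address: int  # full GPU address: BO base address plus offset
    offset: int   # offset of the binding inside the BO


def rebind(bindings, old_base, new_base):
    """Point bindings that referenced the old BO at the new BO.

    The match uses the full address (base + offset); comparing against
    the bare base address would miss any binding with a non-zero offset.
    Returns the number of bindings that were updated.
    """
    updated = 0
    for binding in bindings:
        if binding.address == old_base + binding.offset:
            binding.address = new_base + binding.offset
            updated += 1
    return updated
```

A binding at offset 64 into a BO based at `0x1000` has address `0x1040`; after `rebind(bindings, 0x1000, 0x2000)` it should read `0x2040`, not `0x2000`.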

Fixes: 5ad0c88dbe ("iris: Replace buffer backing storage and rebind to update addresses.")
(cherry picked from commit 309924c3c9)
2019-10-02 09:41:27 -04:00
Dylan Baker
f7338bfe1f meson: gallium media state trackers require libdrm with x11
v2: - update copyright year in all changed files
    - rebase on master

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 3b265f61f5)
2019-10-02 09:41:27 -04:00
Kenneth Graunke
2d584d7386 iris: Disable CCS_E for 32-bit floating point textures.
A while back, Michael Larabel noticed that Paraview's Wavelet Volume
case runs significantly slower on iris than i965.  It turns out this
is because we enable CCS_E for 32-bit floating point formats, while
i965 disables it, with an oblique comment saying that we benchmarked
it (on what exactly?) and determined that it was a loss.

Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed
large framerate drops when enabling CCS_E for either format.  However,
several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit
floating point formats, with no apparent ill effects.

So, disable compression for 32-bit float formats for now, but leave it
enabled for 16-bit float formats as they seem to be working fine.

Improves performance in Paraview's Wavelet Volume test by 62% on a
Skylake GT4e.

Fixes: 3cfc6a207b ("iris: Fill out res->aux.possible_usages")
(cherry picked from commit a0a93763fb)
2019-10-02 09:41:27 -04:00
pal1000
2739dd9621 scons: Fix MSYS2 Mingw-w64 build.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

This patch is based on 28e3f85e09/mingw-w64-mesa/link-ole32.patch but with tweaks to avoid MSVC build break when applied.

v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation;

v3: Fix obviously wrong compiler flags for swr driver;

v4: Update original patch URL because it has been relocated;

v5: Don't bother patching autotools stuff as it's not used by MSYS2 Mingw-w64 build and its days are numbered anyway;

v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place.
(cherry picked from commit ffb0d3a25c)
2019-10-02 09:41:27 -04:00
pal1000
e53ca66c4a scons/windows: Support build with LLVM 9.
As X86AsmPrinter component is gone, LLVMX86AsmPrinter got replaced
with LLVMRemarks, LLVMBitstreamReader and LLVMDebugInfoDWARF.

Tests done with llvm-config on both LLVM 8 and 9 indicate that
mcjit, bitwriter and x86asmprinter fully fit inside engine component.

On other platforms and with meson build mcdisassembler was used to replace
X86AsmPrinter but mcdisassembler also fully fits inside engine component
for LLVM>=8 according to same tests.

v2: Avoid duplicating code related to Mingw pthreads.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>

On 19.1 this patch does not apply cleanly without 88eb2a1f

(cherry picked from commit bcb4dfb14b)
2019-10-02 09:41:27 -04:00
Michel Zou
7c0ce1b35e scons: For MinGW use -posix flag.
Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 88eb2a1f7e)
2019-10-02 09:41:27 -04:00
Michel Zou
09ba783aea scons: add py3 support
SCons 3.1 has moved to python 3, requiring this fix
to continue supporting scons builds.

Closes: #944
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 3f92d17894)
2019-10-02 09:41:27 -04:00
Andrii Simiklit
70ef5d63f7 glsl: disallow incompatible matrices multiplication
glsl 4.4 spec section '5.9 expressions':
"The operator is multiply (*), where both operands are matrices or one operand is a vector and the
 other a matrix. A right vector operand is treated as a column vector and a left vector operand as a
 row vector. In all these cases, it is required that the number of columns of the left operand is equal
 to the number of rows of the right operand. Then, the multiply (*) operation does a linear
 algebraic multiply, yielding an object that has the same number of rows as the left operand and the
 same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations”
 explains in more detail how vectors and matrices are operated on."

This fix disallows a multiplication of incompatible matrices like:
mat4x3(..) * mat4x3(..)
mat4x2(..) * mat4x2(..)
mat3x2(..) * mat3x2(..)
....
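
The rule quoted from the spec can be sketched as a small shape check. GLSL's `matCxR` type has C columns and R rows; the helper name and the tuple representation below are assumptions made for illustration, not the compiler's data structures.

```python
def mat_mul_shape(left, right):
    """Shape of left * right for GLSL matrices, or None if incompatible.

    Each operand is a (columns, rows) tuple, mirroring the matCxR naming.
    The product is defined only when the left operand's column count
    equals the right operand's row count; the result has the rows of the
    left operand and the columns of the right operand.
    """
    left_cols, left_rows = left
    right_cols, right_rows = right
    if left_cols != right_rows:
        return None
    return (right_cols, left_rows)


MAT4X3 = (4, 3)  # 4 columns, 3 rows
MAT2X4 = (2, 4)  # 2 columns, 4 rows
```

`mat_mul_shape(MAT4X3, MAT4X3)` yields `None`, matching the first rejected case listed above, while `mat_mul_shape(MAT4X3, MAT2X4)` yields `(2, 3)`, that is, a `mat2x3`.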

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
(cherry picked from commit b32bb888c7)
2019-10-02 09:41:27 -04:00
Jason Ekstrand
f041840367 intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates
Without this, we were DCEing flag writes because we didn't think their
results were used because we didn't understand that an ANY32 predicate
actually read all the flags.

Fixes: df1aec763e "i965/fs: Define methods to calculate the flag..."
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6c858b9a91)
2019-10-02 09:41:27 -04:00
Dylan Baker
4a50b8add1 meson: Link xvmc with libxv
Prior to xvmc 1.0.12 libxvmc incorrectly required libxv, but that was
fixed. This results in compilation failures for the gallium xvmc tracker
and tools. This patch fixes that by explicitly linking to libxv.

Fixes: 22a817af8a
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1844
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit e456a053c3)
2019-10-02 09:41:27 -04:00
Dylan Baker
b30a0afc0c meson: Try finding libxvmcw via pkg-config before using find_library
This fixes cross compiling issues, because pkg-config is less likely to
get the wrong libs.

v2: - Fix typo in comment

Fixes: 22a817af8a
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/939
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 8c5c21d7e3)
2019-10-02 09:41:27 -04:00
Andreas Gottschling
8118131f37 drisw: Fix shared memory leak on drawable resize
XDestroyImage will mark the segment as to-be-destroyed, but it will
persist until we detach it, and we weren't doing so.

Cc: mesa-stable@lists.freedesktop.org
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/121
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit c5a2ccec5e)
2019-10-02 09:41:27 -04:00
Michel Dänzer
950d167026 radeonsi: fix VAAPI segfault due to various bugs
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111236
(cherry picked from commit 67d930d64b)
2019-10-02 09:41:27 -04:00
Dylan Baker
e37019723f meson: fix logic for generating .pc files with old glvnd
We want to generate PC files for non-glvnd builds and for builds with
old glvnd, but the current logic doesn't do that: it builds them
unconditionally, and for GLES it builds the shared libraries, which is
also not what we want. This does not generate .pc files for gles1 or
gles2, which we weren't doing before either, making this not a
regression but a return to the status quo.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838
Fixes: 93df862b6a
       ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit fafd20f67d)
2019-10-02 09:41:27 -04:00
Lionel Landwerlin
450b808eea intel: use proper label for Comet Lake skus
Fixes: 82f6a746e8 ("intel: Add support for Comet Lake")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 813f3460e7)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	include/pci_ids/i965_pci_ids.h
2019-10-02 09:41:27 -04:00
Lionel Landwerlin
3b927c447f anv: gem-stubs: return a valid fd for anv_gem_userptr()
Fixes invalid close(-1) in the unit tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit da2d67fc3b)
2019-10-02 09:41:27 -04:00
Tapani Pälli
52dc974cd1 util: fix os_create_anonymous_file on android
Commit fixes current crashes with Vulkan applications on Android.

Fixes: c0376a1234 "util: add anon_file.h for all memfd/temp file usage"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit ce8fd042a5)
2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero
26ab4e1614 cherry-ignore: util: added missing headers in anon-file
Fixes: The commit was reverted later.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-02 09:41:27 -04:00
Eric Engestrom
e5e81d6530 util/anon_file: const string param
Fixes: c0376a1234 ("util: add anon_file.h for all memfd/temp file usage")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Eric Anholt <eric@anholt.net>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
(cherry picked from commit 525a917c6c)
2019-10-02 09:41:27 -04:00
Eric Engestrom
b13396622c util/anon_file: add missing #include
Fixes: c0376a1234 ("util: add anon_file.h for all memfd/temp file usage")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Eric Anholt <eric@anholt.net>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
(cherry picked from commit 60af7f5a81)
2019-10-02 09:41:27 -04:00
Greg V
bb22ac12d6 util: add anon_file.h for all memfd/temp file usage
Move the Weston os_create_anonymous_file code from egl/wayland into util,
add support for Linux memfd and FreeBSD SHM_ANON,
use that code in anv/aubinator instead of explicit memfd calls for portability.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit c0376a1234)
2019-10-02 09:41:27 -04:00
Danylo Piliaiev
2963e9fa3d st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader
RET as a last instruction could be safely ignored.
Remove it to prevent crashes/warnings in case underlying driver
doesn't implement arbitrary returns.

A better way would be to remove the RET after the whole shader
is parsed which will handle a possible case when the last RET is
followed by a comment.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
(cherry picked from commit 2d8f77db83)
2019-10-02 09:41:27 -04:00
Eric Engestrom
a74657d4aa meson: re-add incorrect pkg-config files with GLVND for backward compatibility
This is a bit counter-intuitive, but the issue is that GLVND is broken
in versions <= 1.1.1, so we need to keep wrongly providing these files
to cover up their mistake, otherwise the rest of the world ends up
broken.

Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 93df862b6a)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/egl/meson.build
2019-10-02 09:41:27 -04:00
Erik Faye-Lund
3a0b77e3f7 glsl: correct bitcast-helpers
Without this, we'll incorrectly round off huge values to the nearest
representable double instead of keeping them at the exact value as
we're supposed to.
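
The difference between converting a value and reinterpreting its bits can be sketched as below. This is a general illustration of the rounding hazard, not the GLSL compiler's helpers; the function names are made up for the example.

```python
import struct


def bitcast_u64_to_double(bits):
    """Reinterpret a 64-bit unsigned pattern as a double, bit for bit."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def bitcast_double_to_u64(value):
    """Reinterpret a double as its 64-bit unsigned pattern, bit for bit."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# A 64-bit value with more significant bits than a double's mantissa holds.
huge = (1 << 63) + 1

# Value conversion rounds to the nearest representable double.
rounded = int(float(huge))

# A bit-level round trip keeps every bit.
round_tripped = bitcast_double_to_u64(bitcast_u64_to_double(huge))
```

Here `rounded` differs from `huge` because a double carries only 53 bits of mantissa, whereas `round_tripped` equals `huge` exactly.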

Found by inspecting compiler-warnings.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 85faf5082f ("glsl: Add 64-bit integer support for constant expressions")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 88f909eb37)
2019-10-02 09:41:27 -04:00
Rhys Perry
ef35babd33 nir/opt_remove_phis: handle phis with no sources
This can happen with loops with unreachable exits which are later
optimized away.
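
A minimal sketch of the edge case, assuming a simplified model in which a phi is replaced by its single distinct source and a phi with no usable sources becomes an undefined value. The `UNDEF` sentinel and `simplify_phi` are hypothetical names, and the actual NIR pass may handle the case differently.

```python
UNDEF = object()  # stand-in for an undefined SSA value (assumption)


def simplify_phi(phi, sources):
    """Return the value that can replace `phi`, or None to keep the phi.

    `sources` lists the SSA value names feeding the phi. Self-references
    are ignored. If no source remains, which also covers a phi with no
    sources at all, the phi is treated as undefined.
    """
    unique = None
    for src in sources:
        if src == phi:
            continue  # a phi that feeds itself adds no information
        if unique is None:
            unique = src
        elif src != unique:
            return None  # two distinct sources: the phi is still needed
    if unique is None:
        return UNDEF
    return unique
```

Without the final `unique is None` branch, an empty source list would fall through with nothing to replace the phi with, which is the kind of situation an assertion would catch.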

Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 12372d60ff)
2019-10-02 09:41:27 -04:00
Marek Olšák
954ace9e3e gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH
because vl doesn't call flush_resource and I wasn't able to find
all places where flush_resource needs to be called.

This fixes corrupted / unflushed surfaces with fullscreen videos on Raven.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f52afdf672)
2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero
4f48aaf50a cherry-ignore: nir/opt_large_constants: Handle store writemasks
Fixes: This commit does not apply cleanly on 19.1 branch, as it depends
on other commits not present in the branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-02 09:41:27 -04:00
Stephen Barber
d34ae3876a nouveau: add idep_nir_headers as dep for libnouveau
Fixes a compilation error when building libnouveau:

In file included from ../src/gallium/drivers/nouveau/nv50/nv50_program.c:25:
../src/compiler/nir/nir.h:1115:10: fatal error: nir_intrinsics.h: No such file or directory
 #include "nir_intrinsics.h"
           ^~~~~~~~~~~~~~~~~~
           compilation terminated.

Fixes: f014ae3c7c ("nouveau: add support for nir")
Signed-off-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 8c3ace6991)
2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero
895f0a2ca2 bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars
The script only handles commits with "Fixes: <sha1>" where <sha1> is
at least 8 chars long. But <sha1> can be shorter, like 7 chars.

This commit relaxes the restriction to handle a <sha1> of 4 or more chars.
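
The relaxed matching can be sketched with a regular expression. The pattern below is a hypothetical stand-in written for this example; the expression used by `bin/get-pick-list.sh` itself is not shown in the commit message and may differ.

```python
import re

# Hypothetical pattern: a "Fixes:" trailer followed by 4 to 40 hex digits.
FIXES_RE = re.compile(r"^\s*Fixes:\s*([0-9a-fA-F]{4,40})\b", re.MULTILINE)


def fixes_shas(message):
    """Return every abbreviated sha1 named in a Fixes: trailer."""
    return FIXES_RE.findall(message)
```

A 7-character reference such as `Fixes: 533fead ("...")` is picked up, while a 3-character one is not. One trade-off of a short lower bound is that an English word made only of hex letters, for example `Fixes: dead code`, would also match, so a real script would likely want to confirm that the candidate resolves to a commit.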

Fixes: 533fead423 ("bin/get-pick-list.sh: tweak the commit sha matching pattern")

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit b3c25e6f99)
2019-10-02 09:41:27 -04:00
Bas Nieuwenhuizen
9ee9251ef8 radv: Add workaround for hang in The Surge 2.
Released today and hangs on RADV. We don't have the root cause yet,
but this should unblock people playing the game.

No drirc because the radv debugflags are not usable from drirc and
I want this backported.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 780182f0a0)
2019-10-02 09:41:27 -04:00
Kenneth Graunke
b977c7444c intel: Increase Gen11 compute shader scratch IDs to 64.
From the MEDIA_VFE_STATE docs:

   "Starting with this configuration, the Maximum Number of Threads must
    be set to (#EU * 8) for GPGPU dispatches.

    Although there are only 7 threads per EU in the configuration, the
    FFTID is calculated as if there are 8 threads per EU, which in turn
    requires a larger amount of Scratch Space to be allocated by the
    driver."

It's pretty clear that we need to increase this for scratch address
calculations, because the FFTID has a certain bit-pattern.  The quote
above seems to indicate that we should increase the actual thread count
programmed in MEDIA_VFE_STATE as well, but we think the intention is to
only bump the scratch space.
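
The sizing argument can be sketched with simple arithmetic. The figure of 8 EUs per subslice is an assumption chosen because it reproduces the 64 in the commit title; the constant and function names are illustrative and not taken from the driver.

```python
HW_THREADS_PER_EU = 7     # threads that exist per EU, per the quoted docs
FFTID_THREADS_PER_EU = 8  # thread count the FFTID calculation assumes


def scratch_ids(eu_count):
    """Number of scratch slots needed so every possible FFTID is covered."""
    return eu_count * FFTID_THREADS_PER_EU


def scratch_bytes(eu_count, per_thread_bytes):
    """Total scratch allocation for a given per-thread scratch size."""
    return scratch_ids(eu_count) * per_thread_bytes
```

With 8 EUs, `scratch_ids(8)` gives 64, whereas sizing by the real thread count would give only 56 and leave the highest FFTIDs pointing past the end of the allocation.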

Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.

Fixes: 5ac804bd9a ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b9e93db208)
2019-10-02 09:41:27 -04:00
Jason Ekstrand
c94e5e3ee1 nir/repair_ssa: Replace the unreachable check with the phi builder
In a3268599f3, I attempted to fix nir_repair_ssa for unreachable
blocks.  However, that commit missed the possibility that the use is in
a block which, itself, is unreachable.  In this case, we can end up in
an infinite loop trying to replace a def with itself.  Even though a
no-op replacement is a fine operation, it keeps extending the end of the
uses list as we're walking it.  Instead of explicitly checking for the
group of conditions, just check if the phi builder gives us a different
def.  That's guaranteed to be 100% reliable and, while it lacks symmetry
with the is_valid checks, should be more reliable.

Fixes: a3268599 "nir/repair_ssa: Repair dominance for unreachable..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d63162cff0)
2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero
c140c260c1 cherry-ignore: Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"
revert: The following commit was requested to be removed from stable
branch by original author.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero
34ee0eb6cc Revert "Revert "intel/fs: Move the scalar-region conversion to the generator.""
This reverts commit 667920050a.

This commit was breaking Xorg rendering in all Icelake devices.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/795
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-02 09:41:27 -04:00
Hal Gentz
db6974fa2a gallium/osmesa: Fix the inability to set no context as current.
Currently there is no way to make no context current w/gallium + osmesa.
The non-gallium version of osmesa does this if the context and buffer
passed to `OSMesaMakeCurrent` are both null. This small change makes it
so that this is also the case with the gallium version.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 57c894334e)
2019-10-02 09:41:27 -04:00
Andres Gomez
9e754647ba docs/features: Update VK_KHR_display_swapchain status
It was set as done by mistake.

Fixes: bc15d74529 ("docs/features: Mark some Vulkan extensions as done")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit bcd9224728)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	docs/features.txt
2019-10-02 09:41:27 -04:00
Adam Jackson
0b8c7cf51c docs: Update bug report URLs for the gitlab migration
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 5b5c5bf833)
2019-10-02 09:41:27 -04:00
Arcady Goldmints-Orlov
7e6dd1ce7a anv: fix descriptor limits on gen8
Later generations support bindless for samplers, images, and buffers and
thus per-stage descriptors are not limited by the binding table size.
However, gen8 doesn't support bindless images and thus needs to report a
lower per-stage limit so that all combinations of descriptors that fit
within the advertised limits are reported as supported by
vkGetDescriptorSetLayoutSupport.

Fixes test dEQP-VK.api.maintenance3_check.descriptor_set
Fixes: 79fb0d27f3 ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5ec5fecc26)
2019-10-02 09:41:27 -04:00
Tapani Pälli
1ee0add2a5 egl: check for NULL value like eglGetSyncAttribKHR does
Commit d1e1563bb6 added a NULL check for eglGetSyncAttribKHR
but eglGetSyncAttrib does not do this. This patch adds the same check
to eglGetSyncAttrib.

Fixes crashes in (when exposing EGL 1.5):
   dEQP-EGL.functional.fence_sync.invalid.get_invalid_value

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 99cbec0a5f)
2019-10-02 09:41:27 -04:00
Paulo Zanoni
3b0e591228 intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32
The current code can create functions with a width of 32, which is not
supported by our hardware. Add some code to simplify how we express
what we want and prevent such cases.

For some unknown reason, all the tests I could run seem to work even
with these unsupported MOVs.

Fixes: b0858c1cc6 "intel/fs: Add a couple of simple helper opcodes"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 8e614c7a29)
2019-10-02 09:41:27 -04:00
Eric Engestrom
5936c8f4f4 gl: drop incorrect pkg-config file for glvnd
Akin to 1a25980c46 ("egl: drop incorrect pkg-config file for
glvnd") and b01524fff0 ("meson: don't build libGLES*.so with
GLVND"), removes a pkg-config file that shouldn't have been there in
the first place, but was needed because of that GLVND bug.

Now that the glvnd bug has been fixed, it became apparent that removing
this gl.pc pkg-config file had been forgotten, so let's do just that :)

Suggested-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit a1de3011f3)
2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero
8748747007 cherry-ignore: add explicit 19.3 only nominations
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-02 09:41:24 -04:00
Andres Gomez
ac31da3529 docs: Add the maximum implemented Vulkan API version in 19.1 rel notes
Currently, Vulkan 1.1.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit d2db43fcad)
2019-10-02 09:40:28 -04:00
Bas Nieuwenhuizen
97af29d6da tu: Set up glsl types.
Addresses this assert:

deqp-vk: ../mesa-freedreno-9999/src/compiler/glsl_types.cpp:1244: static const glsl_type *glsl_type::get_interface_instance(const glsl_struct_field *, unsigned int, enum glsl_interface_packing, bool, const char *): Assertion `glsl_type_users > 0' failed.

running dEQP-VK.api.smoke.triangle .

Fixes: 624789e370 "compiler/glsl: handle case where we have multiple users for types"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7999e10cab)
2019-10-02 09:40:28 -04:00
Haihao Xiang
81a4483465 i965: support AYUV/XYUV for external import only
Fixes: 89785e2d56 ("i965: add support for sampling from AYUV")
Fixes: 7cab8d3661 ("i965: Add support for sampling from XYUV images")
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 8a9b81ab9d)
2019-10-02 09:40:28 -04:00
Samuel Iglesias Gonsálvez
00651091fa intel/nir: do not apply the fsin and fcos trig workarounds for consts
If we have fsin or fcos trigonometric operations with constant values
as inputs, we will multiply the result by 0.99997 in
brw_nir_apply_trig_workarounds, making the result wrong.

By adjusting the rules so they do not apply to constant values, we let
a later constant-folding pass deal with them.
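
The effect can be modelled with a small sketch. This is an illustration only, assuming a hypothetical helper `sketch_trig_result` that receives the mathematically expected sin/cos value; the real pass works on NIR instructions, not floats. Only the 0.99997 factor comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Scale factor quoted in the commit message. */
#define SKETCH_TRIG_SCALE 0.99997f

/*
 * Hypothetical model of the rule: `exact` is the expected sin/cos value,
 * `input_is_const` says whether the operand is a compile-time constant.
 */
static float
sketch_trig_result(float exact, bool input_is_const)
{
   assert(exact >= -1.0f && exact <= 1.0f);

   if (input_is_const)
      return exact;                    /* left unscaled for constant folding */

   return exact * SKETCH_TRIG_SCALE;   /* workaround path for runtime values */
}
```

With the rule restricted to non-constants, a constant input whose expected result is 1.0 stays 1.0 instead of becoming 0.99997.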

v2:
- Do not early constant fold but only apply the trig workaround for
  non constants (Caio).
- Add fixes tag to commit log (Caio).

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 3c474f8513)
2019-10-02 09:40:28 -04:00
Tapani Pälli
d11d2c6def iris: close screen fd on iris_destroy_screen
Otherwise it never gets closed. This fixes errors seen with deqp-egl,
where we end up opening 1024 files.

Fixes: 2dce0e94 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 631255387f)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/drivers/iris/iris_screen.c
2019-10-02 09:40:28 -04:00
Rhys Perry
44c38ecd27 radv: always emit a position export in gs copy shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8d0337299 ('radv: add multiple streams support for the GS copy shader')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ffabcbba60)
2019-10-02 09:40:28 -04:00
Juan A. Suarez Romero
3859e211db cherry-ignore: add explicit 19.2 only nominations
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-02 09:40:26 -04:00
Kenneth Graunke
4d45999a24 iris: Initialize ice->state.prim_mode to an invalid value
It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we
fail to notice an initial primitive of points being new, and fail at
updating the "primitive is points or lines" field.
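
A minimal sketch of why a zero-initialized "last primitive" hides the first draw, assuming hypothetical `sketch_*` names; only the fact that points map to 0 is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* POINTS == 0 mirrors the commit message; the other values are made up. */
enum {
   SKETCH_PRIM_POINTS = 0,
   SKETCH_PRIM_LINES = 1,
   SKETCH_PRIM_TRIANGLES = 4,
   SKETCH_PRIM_INVALID = -1
};

struct sketch_state {
   int prim_mode;            /* last primitive mode seen on the CPU side */
   bool points_or_lines;     /* derived field that must track prim_mode */
   unsigned updates;         /* how often the derived field was recomputed */
};

static struct sketch_state *
sketch_state_create(bool use_sentinel)
{
   struct sketch_state *s = calloc(1, sizeof(*s)); /* prim_mode == 0 == POINTS */
   if (s != NULL && use_sentinel)
      s->prim_mode = SKETCH_PRIM_INVALID;          /* the fix */
   return s;
}

static void
sketch_draw(struct sketch_state *s, int mode)
{
   assert(s != NULL);
   if (s->prim_mode != mode) {
      s->prim_mode = mode;
      s->points_or_lines =
         (mode == SKETCH_PRIM_POINTS || mode == SKETCH_PRIM_LINES);
      s->updates++;
   }
}
```

Without the sentinel, a first draw of points compares equal to the calloc'd zero and the derived field is never updated.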

We do not need to reset this on device loss because we're tracking
the last primitive mode sent to us on the CPU via draw_vbo, not the
last primitive mode sent to the GPU.

Fixes several tests:
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner

Fixes: dcfca0af7c ("iris: Set XY Clipping correctly.")
(cherry picked from commit c9fb704f72)
2019-09-19 07:55:32 +00:00
Juan A. Suarez Romero
b9d7244035 docs: add sha256 checksums for 19.1.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-17 12:53:06 +02:00
Juan A. Suarez Romero
f632aac938 docs: add release notes for 19.1.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-17 12:31:43 +02:00
Juan A. Suarez Romero
952502893a Update version to 19.1.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-17 10:28:45 +00:00
Samuel Pitoiset
2959de1f92 radv: fix allocating number of user sgprs if streamout is used
streamout_buffers is assigned after that function, so the previous
fix was completely wrong. This probably fixes something when streamout
buffers and push constants are used/inlined in the same shader.

Fixes: 378e2d2414 ("radv: fix computing number of user SGPRs for streamout buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 8137df3a46)
[Juan A. Suarez: fix the structure usage]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-13 08:18:29 +00:00
Danylo Piliaiev
30689e7da8 tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE
Translating TGSI_INTERPOLATE_COLOR as INTERP_MODE_SMOOTH made
it impossible for drivers to have flatshaded color inputs.

Translate it to INTERP_MODE_NONE which drivers interpret as
smooth or flat depending on flatshading state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111467

Fixes: 770faf54 ("tgsi_to_nir: Improve interpolation modes.")

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 175c32e9bd)
2019-09-13 08:06:17 +00:00
Kenneth Graunke
9556c5b1a2 gallium: Fix util_format_get_depth_only
This is a pipe format, not a boolean.

Fixes: 5849e0612c ("gallium/auxiliary: Add util_format_get_depth_only() helper.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit c6d40b5182)
2019-09-12 08:25:22 +00:00
Caio Marcelo de Oliveira Filho
83dadcfdc6 glsl/nir: Avoid overflow when setting max_uniform_location
Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned
max_uniform_location.  Those unmapped uniforms don't have to be
accounted at this point.
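
The overflow can be reproduced with a short sketch. `sketch_max_uniform_location` and its boolean switch are hypothetical, written only to contrast the two behaviours; the real linker code is structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_UNMAPPED_UNIFORM_LOC (-1)

/*
 * Track the largest location in an unsigned variable. When unmapped
 * entries are not skipped, -1 converts to UINT_MAX and wins the maximum.
 */
static unsigned
sketch_max_uniform_location(const int *locs, size_t count, bool skip_unmapped)
{
   unsigned max_loc = 0;

   assert(locs != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (skip_unmapped && locs[i] == SKETCH_UNMAPPED_UNIFORM_LOC)
         continue;                       /* the fix: ignore unmapped uniforms */

      unsigned loc = (unsigned)locs[i];  /* -1 becomes UINT_MAX here */
      if (loc > max_loc)
         max_loc = loc;
   }

   return max_loc;
}
```

For the locations `{3, -1, 7}`, skipping unmapped entries yields 7, while the unguarded version yields the largest unsigned value.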

Fixes: 7a9e5cdfbb ("nir/linker: Add gl_nir_link_uniforms()")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 4f33f96c45)
2019-09-12 08:22:46 +00:00
Mauro Rossi
1f7629a760 android: anv: libmesa_vulkan_common: add libmesa_util static dependency
Change needed to fix the following build error:

In file included from external/mesa/src/intel/vulkan/anv_device.c:43:
external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 4dcb1ff ("anv: add support for driconf")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit ae5ac26dfa)
2019-09-10 08:46:03 +00:00
Erik Faye-Lund
5b85ecce0b util: fix SSE-version needed for double opcodes
This code generates CVTSD2SI, which requires SSE2. So let's fix the
required SSE-version.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5de29ae (util: try to use SSE instructions with MSVC and 32-bit gcc)
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2ade1c5cf7)
2019-09-10 08:43:12 +00:00
Mauro Rossi
c34a479cc3 android: amd/common: fix missing include path
Fixes the following build error in Android:

In file included from external/mesa/src/amd/common/ac_llvm_helper.cpp:34:
In file included from external/mesa/src/amd/common/ac_llvm_build.h:30:
In file included from external/mesa/src/compiler/nir/nir.h:40:
In file included from external/mesa/src/compiler/nir_types.h:36:
external/mesa/src/compiler/glsl_types.h:37:10: fatal error: 'main/config.h' file not found
         ^~~~~~~~~~~~~~~
1 error generated.

Fixes: bd4c661 ("ac,ac/nir: use a better sync scope for shared atomics")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit bbbbea243a)
2019-09-10 08:42:05 +00:00
Mauro Rossi
51fc954c90 android: radv: fix necessary dependencies
Fixes build errors due to libmesa_util and libexpat dependencies:

In file included from external/mesa/src/amd/vulkan/radv_device.c:52:
external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
external/mesa/src/util/xmlconfig.c:670: error: undefined reference to 'XML_ParserCreate'
...
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 3c2e826 ("radv: Add support for driconf.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 51e24af8fd)
2019-09-10 08:40:02 +00:00
Juan A. Suarez Romero
b4fd0bae5c cherry-ignore: add explicit 19.2 only nominations
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-10 08:05:27 +00:00
Jason Ekstrand
05beca4dcf nir/dead_cf: Repair SSA if the pass makes progress
The dead_cf pass calls into the CF manipulation helpers which attempt to
keep NIR's SSA form sane.  However, when the only break is removed from
a loop, dominance gets messed up anyway because the CF SSA clean-up code
only looks at phis and doesn't consider the case of code becoming
unreachable.  One solution to this would be to put the loop into LCSSA
form before we modify any of its contents.  Another (and the approach
taken by this pass) is to just run the repair_ssa pass afterwards
because the CF manipulation helpers are smart enough to keep all the
use/def stuff sane; they just don't always preserve dominance
properties.

While we're here, we clean up some bogus indentation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit c832820ce9)
2019-09-09 11:24:07 +00:00
Jason Ekstrand
d77aa3cc1c nir/repair_ssa: Insert deref casts when needed
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 1005272a2b)
2019-09-09 11:22:10 +00:00
Jason Ekstrand
ff8122d5a2 nir/repair_ssa: Repair dominance for unreachable blocks
NIR currently assumes that unreachable blocks are trivially dominated by
everything.  However, when considering well-formed SSA, there is no path
from any block to an unreachable block.  Therefore, we can break any
use-def chains where the use is in an unreachable block.  This removes
any dependencies on code created by uses in unreachable blocks and lets
DCE do a better job of cleaning it up.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit a3268599f3)
2019-09-09 11:19:38 +00:00
Jason Ekstrand
18005d8fd3 nir: Add a block_is_unreachable helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit f81a2623d8)
2019-09-09 11:15:17 +00:00
Jason Ekstrand
01d452de58 nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 517142252f)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/compiler/nir/nir_from_ssa.c
2019-09-09 13:10:13 +02:00
Eric Engestrom
431f5a8a78 radv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5eb7d48b58)
2019-09-09 11:04:32 +00:00
Eric Engestrom
89d1ca343f amd: move adaptive sync to performance section, as it is defined in xmlpool
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 4ad99ee961)
2019-09-09 11:00:05 +00:00
Eric Engestrom
a150bb7e03 anv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 037b5b567f)
2019-09-09 10:54:52 +00:00
Eric Engestrom
4d5bcb4c33 wsi: add minImageCount override
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a72cdd00ab)
2019-09-09 10:51:27 +00:00
Eric Engestrom
2977a3e0e1 anv: add support for driconf
No option is supported yet, this is just the boilerplate.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 4dcb1fff19)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/intel/vulkan/meson.build
2019-09-09 12:31:46 +02:00
Jason Ekstrand
82edaa5a41 anv: Bump maxComputeWorkgroupSize
Fixes: 9a129510f5 "anv: Bump maxComputeWorkgroupInvocations"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 3b1a7e5333)
2019-09-09 10:22:53 +00:00
Jason Ekstrand
667920050a Revert "intel/fs: Move the scalar-region conversion to the generator."
This reverts commit c0504569ea.  Now that
we're doing interpolation lowering in NIR, we can continue to stride the
FS input registers directly in the brw_fs_nir code like we did before.
This fixes SIMD32 fragment shaders which broke because lower_simd_width
depended on the 0 stride to split PLN instructions correctly.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit d15fe8ca82)
2019-09-06 10:35:49 +00:00
Sergii Romantsov
48f78dfce2 intel/dri: finish proper glthread
KWin was able to get a NULL context in the call to
intelUnbindContext, but _mesa_glthread_finish
is not resilient to such a case.
The case can be reproduced with these steps:
	1. Create both glx and egl contexts
	2. Make glx as current
	3. Make egl as current
	4. Reset glx context
	5. Make egl as current

The solution properly finishes the glthread context
(the context is taken from the DRI context requested
for unbinding, not from the saved current context).

Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271
Fixes: dca36d5516 (i965: Implement threaded GL support)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1dce75c183)
2019-09-06 10:34:06 +00:00
Connor Abbott
d74ccd46fa radv: Call nir_propagate_invariant()
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3f5b541fc8)
2019-09-06 10:32:27 +00:00
Hal Gentz
35d435235a glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX.
When run in optirun, applications that linked to `libGLX.so` and then
proceeded to query Mesa for extension strings caused a SEGV in Mesa.

`glXQueryExtensionsString` was calling a chain of functions that
eventually led to `__glXQueryServerString`. This function would call
`xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`.
The latter for some unknown reason returned `NULL`. Passing this `NULL`
to `xcb_glx_query_server_string_string_length` would cause a SEGV as the
function tried to dereference it.

The reason behind the function returning `NULL` is yet to be determined,
however, simply checking that the ptr is not `NULL` resolves this. A
similar check has been added to `__glXGetString` for completeness' sake,
although it is not immediately necessary.

In addition to that, we stumbled into a similar problem in
`AllocAndFetchScreenConfigs` which tries to access the configs to free
them if `__glXQueryServerString` fails. This, of course, SEGVs, because the
configs have not yet been allocated. Simply continuing past the configs
if their config ptrs are `NULL` resolves this. We also switch to `calloc`
to make sure that the config ptrs are `NULL` by default, and not some
uninitialized value.
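
The two defensive patterns described above can be sketched as follows. This is an illustration under assumptions, not the GLX client code: every `sketch_*` type and function is hypothetical and merely stands in for the XCB reply and the per-screen config arrays.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a server reply that may be NULL. */
struct sketch_reply {
   size_t len;
   const char *data;
};

/* Copy the reply string, bailing out early when there is no reply. */
static char *
sketch_copy_reply_string(const struct sketch_reply *reply)
{
   if (reply == NULL)
      return NULL;                      /* the added NULL check */

   char *buf = calloc(reply->len + 1, 1);
   if (buf != NULL)
      memcpy(buf, reply->data, reply->len);
   return buf;
}

struct sketch_screen {
   char *server_string;
};

/* calloc guarantees that every slot starts out as NULL. */
static struct sketch_screen **
sketch_alloc_screens(size_t n)
{
   return calloc(n, sizeof(struct sketch_screen *));
}

/* Free only the slots that were actually populated; returns that count. */
static size_t
sketch_free_screens(struct sketch_screen **screens, size_t n)
{
   size_t freed = 0;

   assert(screens != NULL);

   for (size_t i = 0; i < n; i++) {
      if (screens[i] == NULL)
         continue;                      /* skip slots never allocated */
      free(screens[i]->server_string);
      free(screens[i]);
      freed++;
   }
   free(screens);
   return freed;
}
```

Because the array comes from `calloc`, the cleanup loop can run safely even when setup failed before any screen was filled in.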

Cc: mesa-stable@lists.freedesktop.org
Fixes: 24b8a8cfe8 "glx: implement __glXGetString, hide __glXGetStringFromServer"
Fixes: cb3610e37c "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it "
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
(cherry picked from commit 1591d1fee5)
2019-09-06 10:30:46 +00:00
Eric Engestrom
29159cbf21 nir: fix memleak in error path
Fixes: 2cf59861a8 ("nir: Add partial redundancy elimination for compares")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7659c6197f)
2019-09-05 16:01:12 +00:00
Eric Engestrom
6a5c36715a anv: fix format string in error message
Fixes: 9775894f10 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7abf65aedc)
2019-09-05 15:59:45 +00:00
Eric Engestrom
3bd87314e2 util/os_file: fix double-close()
Fixes: 955c63d364 ("util/os_file: resize buffer to what was actually needed")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 1667360f7d)
2019-09-05 15:57:55 +00:00
Eric Engestrom
5fcb149a46 egl: fix deadlock in malloc error path
Fixes: cb0980e69a ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 43d470404c)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
[Juan A. Suarez: resolve trivial conflicts]

Conflicts:
	src/egl/main/egldriver.c
2019-09-05 16:56:04 +01:00
Eric Engestrom
524373ba99 ttn: fix 64-bit shift on 32-bit 1
Fixes: 4d0b2c7aaa ("ttn: Update shader->info as we generate code.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 3afe9d798a)
2019-09-05 15:49:07 +00:00
Lionel Landwerlin
4115781efa vulkan/overlay: bounce image back to present layout
Once we write the overlay to an image to be presented, we must not
forget to put it back into present layout.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 320b0f66c2)
2019-09-03 11:37:15 +00:00
Erik Faye-Lund
d16ab58f50 gallium/auxiliary/indices: consistently apply start only to input
The majority of these only apply the start argument to the input, but a
few of them also apply it to the output array. util_primconvert, the only
user of this argument that passes a non-zero start argument, does not
expect it to be applied to the output; if it is, it will write
outside of allocated memory, leading to VRAM corruption.

The reason this doesn't seem to have been noticed before is that no
driver currently uses util_primconvert to convert a primitive type to
itself, which is the case where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index buffers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 28f3f8d413 ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <robdclark@chromium.org>
(cherry picked from commit 52af1427c6)
2019-09-03 11:33:57 +00:00
Juan A. Suarez Romero
4ec2325dd0 docs: add sha256 checksums for 19.1.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-03 13:04:25 +02:00
Juan A. Suarez Romero
85c8f88a49 docs: add release notes for 19.1.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-03 12:02:19 +02:00
Juan A. Suarez Romero
d45f8ff429 Update version to 19.1.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-03 09:56:49 +00:00
Pierre-Eric Pelloux-Prayer
52aea45dbc glsl: replace 'x + (-x)' with constant 0
This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:

   value -= value

with value being undefined.
In this case LLVM replaces the operation with an assignment to NaN.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 47cc660d9c)
2019-08-30 07:39:55 +00:00
Ian Romanick
938adab8ea intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware
See the previous commit for the explanation of the Fixes tag.

Hurts 21 shaders in shader-db.  All of the hurt shaders are in Unreal
Engine 4 tech demos.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
(cherry picked from commit b418269d7d)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
[Juan A. Suarez: resolve trivial conflicts]

Conflicts:
	src/intel/compiler/brw_compiler.c
2019-08-29 12:04:34 +02:00
Ian Romanick
759afcacd9 nir/algebraic: Don't optimize open-coded bitfield reverse when lowering is enabled
This caused a problem on Sandybridge where an open-coded
bitfieldReverse() function could be optimized to a
nir_op_bitfield_reverse that would generate an unsupported BFREV
instruction in the backend.  This was encountered in some Unreal4 tech
demos in shader-db.  The bug was not previously noticed because we don't
actually try to run those demos on Sandybridge.

The fixes tag is a bit of a lie.  The actual bug was introduced about
26,000 commits earlier in 371c4b3c48 ("nir: Recognize open-coded
bitfield_reverse.").  Without the NIR lowering pass, the flag needed to
avoid the optimization does not exist.  Hopefully nobody will care to
fix this on an earlier Mesa release.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
(cherry picked from commit d3fd1c761a)
2019-08-29 09:51:14 +00:00
Kenneth Graunke
48a671e269 intel/compiler: Fix src0/desc setter ordering
src0 vstride and type overlap with bits of the extended descriptor.
brw_set_desc() also sets the extended descriptor to 0.  So by setting
the descriptor, then setting src0, we were accidentally setting a bunch
of extended descriptor bits.

When using this infrastructure for framebuffer writes (in a future
patch), this ended up setting the extended descriptor bit 20, which is
"Null Render Target" on Icelake, causing nothing to be written to the
framebuffer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c8c9c48684)
2019-08-29 09:30:42 +00:00
Kenneth Graunke
6138702dec mesa: Fix _mesa_float_to_unorm() on 32-bit systems.
This fixes the following CTS test on 32-bit systems:
GTF-GL46.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_init

It does glGetTexImage of a 16-bit SNORM image, requesting 32-bit UNORM
data.  In get_tex_rgba_uncompressed, we round trip through float to
handle image transfer ops for clamping.  _mesa_format_convert does:

   _mesa_float_to_unorm(0.571428597f, 32)

which translated to:

   _mesa_lroundevenf(0.571428597f * 0xffffffffu)

which produced different results on 64-bit and 32-bit systems:

   64-bit: result = 0x92492500
   32-bit: result = 0x80000000

This is because the size of "long" varies between the two systems, and
0x92492500 is too large to fit in a signed 32-bit integer.  To fix this,
we switch to the new _mesa_i64roundevenf function which always does the
64-bit operation.
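
The size problem can be checked with a short sketch. It is a simplification under stated assumptions: the hypothetical `sketch_float_to_unorm32` truncates instead of rounding to even, and performs no clamping, so it is not the Mesa helper. It only shows that the value from the example needs more than a signed 32-bit integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scale a float in [0, 1] to a 32-bit UNORM value, held in 64 bits. */
static int64_t
sketch_float_to_unorm32(float x)
{
   assert(x >= 0.0f && x <= 1.0f);

   /* 0xffffffff is not representable as a float; it rounds to 2^32. */
   const float scale = (float)0xffffffffu;

   return (int64_t)(x * scale);
}

/* Would the value fit in a "long" on a system where long is 32 bits? */
static bool
sketch_fits_in_32bit_long(int64_t v)
{
   return v >= INT32_MIN && v <= INT32_MAX;
}
```

For the input quoted above, the 64-bit result is 0x92492500, which exceeds `INT32_MAX` and therefore cannot be returned through a 32-bit `long`.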

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104395
Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit e18cd5452a)
2019-08-28 08:27:34 +00:00
Kenneth Graunke
68bd0c7b9d util: Add a _mesa_i64roundevenf() helper.
This always returns an int64_t, translating to _mesa_lroundevenf on
systems where long is 64-bit, and llrintf where "long long" is needed.

Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b59914e179)
2019-08-28 08:22:58 +00:00
Marek Olšák
915a272b5a radeonsi: fix scratch buffer WAVESIZE setting leading to corruption
Cc: 19.2 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 360cf3c4b0)
2019-08-28 08:19:30 +00:00
Paulo Zanoni
e4df7ffc23 intel/fs: grab fail_msg from v32 instead of v16 when v32->run_cs fails
Looks like a copy/paste error. This patch prevents a segfault when
running the following on BDW:

    INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \
        dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4

For the curious, the message we're getting is:

    CS compile failed: Failure to register allocate.  Reduce number
    of live scalar values to avoid this.

Fixes: 864737ce6c ("i965/fs: Build 32-wide compute shader when needed.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 848d5e444a)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
[Juan A. Suarez: resolve trivial conflicts]

Conflicts:
	src/intel/compiler/brw_fs.cpp
2019-08-27 10:58:48 +02:00
Jonas Ådahl
955c54cea0 wayland/egl: Ensure correct buffer size when allocating
Whenever a buffer is allocated, e.g. by the first draw call or EGL call after a
buffer swap, make sure the size is up to date. Prior to this commit, we
failed to do so when querying the buffer age, or swapping buffers
without any prior EGL call or draw call.

Signed-off-by: Jonas Ådahl <jadahl@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 903ad59407)
2019-08-26 18:39:59 +02:00
Andres Rodriguez
c1959aa26d radv: additional query fixes
Make sure we read the updated data from the gpu in cases where WAIT_BIT
is not set.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a410823b3e)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_query.c
2019-08-26 13:30:15 +02:00
Kenneth Graunke
fc8e419619 iris: Fix large timeout handling in rel2abs()
...by copying the implementation of anv_get_absolute_timeout().

Appears to fix a CTS test with 32-bit builds:
GTF-GL46.gtf32.GL3Tests.sync.sync_functionality_clientwaitsync_flush

Fixes: f459c56be6 ("iris: Add fence support using drm_syncobj")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 7ee7b0ecbc)
2019-08-26 09:58:42 +00:00
Tapani Pälli
1c9c540b2a egl: reset blob cache set/get functions on terminate
Fixes errors seen with eglSetBlobCacheFuncsANDROID on Android when
running dEQP that terminates and reinitializes a display.

Fixes: 6f5b57093b "egl: add support for EGL_ANDROID_blob_cache"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 3e03a3fc53)
2019-08-26 09:58:42 +00:00
Juan A. Suarez Romero
5369eedf37 cherry-ignore: iris: Avoid unnecessary resolves on transfer maps
Fixes: The following commit depends on commits 77a1070d36 and
df4c2ec5e1 in order to compile, which did not land in the branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-26 09:58:42 +00:00
Kenneth Graunke
4d3dc92628 iris: Drop copy format hacks from copy region based transfer path.
This doesn't work for compressed formats, as the source texture and
temporary texture would have different block sizes.  (Forcing the driver
to always take the GPU path would expose the bug.)  Instead, just use
the source format for the temporary, and let blorp_copy deal with
overrides.

The one case where we can't do this is ASTC, because isl won't let us
create a linear ASTC surface.  Fall back to the CPU paths there for now.

Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 136629a1e3)
2019-08-26 09:58:42 +00:00
Juan A. Suarez Romero
4f4a38289b cherry-ignore: iris: Update fast clear colors on Gen9 with direct immediate writes.
Fixes: This commit does not apply cleanly on 19.1 branch, as it depends
on other commits not present in the branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-26 09:58:42 +00:00
Kenneth Graunke
8ad62264d1 iris: Fix broken aux.possible/sampler_usages bitmask handling
For renderable surfaces, we allocate SURFACE_STATEs for each bit in
res->aux.possible_usages.  Sampler views use res->aux.sampler_usages.

When pinning buffers, we call surf_state_offset_for_aux() to calculate
the offset to the desired surface state.  surf_state_offset_for_aux()
took an aux_modes parameter, which should be one of those two fields.
However...it was not using that parameter.  It always used the broader
res->aux.possible_usages field directly.

One of the callers, update_clear_value(), was passing incorrect masks
for this parameter.  It iterated through the bits in order, using
u_bit_scan(), which destructively modifies the mask.  So each time we
called it, the count of bits before our selected mode was 0, which would
cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE,
rather than updating each in turn.  This was hidden by the earlier bug
where surf_state_offset_for_aux() ignored the parameter.

Fixes: 7339660e80 ("iris: Add aux.sampler_usages.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 117a0368b0)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/drivers/iris/iris_state.c
2019-08-26 09:58:42 +00:00
Juan A. Suarez Romero
c5a3f783d2 cherry-ignore: iris: Replace devinfo->gen with GEN_GEN
Fixes: This commit does not apply cleanly on 19.1 branch, as it depends
on other commits not present in the branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-26 09:58:42 +00:00
Juan A. Suarez Romero
fb69feb0b5 cherry-ignore: add explicit 19.2 only nominations
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-26 09:58:42 +00:00
Danylo Piliaiev
61fb6bca53 nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll
Without loop_prepare_for_unroll loops are losing phis.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411
Fixes: 5db98195 "nir: add loop unroll support for wrapper loops"
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 84b3ef6a96)
2019-08-23 11:55:04 +00:00
Ilia Mirkin
ac0f71a4af gallium/vl: use compute preference for all multimedia, not just blit
The compute paths in vl are a bit AMD-specific. For example, on nouveau
they try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 958390a9bf)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/auxiliary/util/u_screen.c
	src/gallium/docs/source/screen.rst
	src/gallium/drivers/radeonsi/si_get.c
	src/gallium/include/pipe/p_defines.h
2019-08-23 13:48:50 +02:00
Daniel Schürmann
41e8b0d027 nir/lcssa: handle deref instructions properly
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 414148cdc1 "nir: Support deref instructions in loop_analyze"
(cherry picked from commit 204846ad06)
2019-08-23 11:42:10 +00:00
Juan A. Suarez Romero
ae2a676cd1 docs: add sha256 checksums for 19.1.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-23 12:38:02 +02:00
Juan A. Suarez Romero
a384fe0ceb docs: add release notes for 19.1.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-23 12:24:21 +02:00
Juan A. Suarez Romero
6c37279d09 Update version to 19.1.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-23 10:20:54 +00:00
Marek Olšák
9862fc4941 radeonsi: fix an assertion failure: assert(!res->b.is_shared)
This only appears to happen on Raven2.

Possible way to reproduce:

resource_get_handle(WINSYS_HANDLE_TYPE_KMS) --> sets is_shared = true
resource_get_handle(WINSYS_HANDLE_TYPE_DMABUF) --> fail

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8d0d753bd0)
2019-08-20 09:30:34 +00:00
Greg V
9c9b92c69a intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 134e750e16 ("i965: extract performance query metrics")
(cherry picked from commit ac1561088d)
2019-08-10 09:31:43 +00:00
Greg V
a8105085e9 anv: remove unused Linux-specific include
Fixes: 4201cc2dd3 ("anv: Implement VK_KHX_external_semaphore_fd")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 2be3f16600)
2019-08-10 09:30:21 +00:00
Danylo Piliaiev
3627595e3d i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D
There is an object-level preemption workaround which requires this.
However, even without object-level preemption, we seem to have issues
with geometry flickering when 3D and compute are combined in the same
batch and this appears to fix it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b8842bc312)
2019-08-10 09:29:01 +00:00
Bas Nieuwenhuizen
c4ab0e18bb radv: Avoid VEGA/RAVEN scissor bug in binning.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 23a9d20997)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_pipeline.c
2019-08-10 11:27:14 +02:00
Bas Nieuwenhuizen
908d85ffce radv: Avoid binning RAVEN hangs.
Mirroring radeonsi.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4a3f987afd)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_pipeline.c
2019-08-10 11:21:23 +02:00
Erik Faye-Lund
a9cbcf09be gallium/dump: add missing query-type to short-list
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 3f6b3d9db7 ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit da9e2958ec)
2019-08-08 10:32:20 +00:00
Erik Faye-Lund
2f7b1159bd gallium/dump: add missing query-type to short-list
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: a677799e51 ("gallium: add PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE
                     and corresponding cap")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 70a93922db)
2019-08-08 10:30:50 +00:00
Eric Engestrom
d38952ef0d util: fix mem leak of program path
Fixes: 759b940389 ("util: Get program name based on path when possible")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 5b10ddf358)
2019-08-08 10:28:03 +00:00
Matt Turner
945a217e94 meson: Test for program_invocation_name
program_invocation_name and program_invocation_short_name are both GNU
extensions. I don't believe one can exist without the other, so only
check for program_invocation_name.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit c9b86cf526)
2019-08-08 10:10:21 +00:00
Marek Olšák
f837d0a6a3 radeonsi: disable SDMA image copies on dGPUs to fix corruption in games
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 6b3ee86989)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/drivers/radeonsi/cik_sdma.c
2019-08-08 12:04:18 +02:00
Bas Nieuwenhuizen
f0aa11b054 ac/nir: Use correct cast for readfirstlane and ptrs.
Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2af00b1fdd)
2019-08-08 10:01:00 +00:00
Bas Nieuwenhuizen
3a7d0d760f radv: Do non-uniform lowering before bool lowering.
Since it can introduce comparisons.

Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2301b2e029)
2019-08-08 09:59:15 +00:00
Jason Ekstrand
84e3025387 anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D
There is an object-level preemption workaround which requires this.
However, even without object-level preemption, we seem to have issues
with geometry flickering when 3D and compute are combined in the same
batch and this appears to fix it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109630
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111267
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bc612536eb)
2019-08-08 09:56:23 +00:00
Juan A. Suarez Romero
f70c6dda43 cherry-ignore: panfrost: Make ctx->job useful
Fixes: This commit does not apply cleanly on 19.1 branch, as it depends
on other commits not present in the branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-08 09:53:12 +00:00
Sergii Romantsov
c9d9ad2e9f i965/clear: clear_value better precision
A test case with a depth clear of 0.5 and format
MESA_FORMAT_Z24_UNORM_X8_UINT fails due to an inconsistent
clear value of 0.4999997.
Improve the precision of the clear value to avoid this.

CC: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 0ae9ce0f29 (i965/clear: Quantize the depth clear value based on the format)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111113
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a86eccfb78)
2019-08-07 17:23:42 +00:00
Juan A. Suarez Romero
7fcb69a33c docs: add sha256 checksums for 19.1.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-07 18:49:02 +02:00
Juan A. Suarez Romero
b84ffa028d docs: add release notes for 19.1.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-07 18:38:23 +02:00
Juan A. Suarez Romero
53cc3e8f7e Update version to 19.1.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-07 16:33:26 +00:00
Tapani Pälli
83815a97d5 mesa: add glsl_type ref to one_time_init and decref to atexit
This fixes problems spotted within vk-gl-cts. Problem is that the builtin
functions refer to types and we should not release types before builtins
are released.

Fixes: 624789e370 ("compiler/glsl: handle case where we have multiple users for types")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-06 16:31:31 +03:00
Francisco Jerez
59cb919ff2 intel/ir: Fix CFG corruption in opt_predicated_break().
Specifically the optimization of a conditional BREAK + WHILE sequence
into a conditional WHILE seems pretty broken.  The list of successors
of "earlier_block" (where the conditional BREAK was found) is emptied
and then re-created with the same edges for no apparent reason.  On
top of that the list of predecessors of the block immediately after
the WHILE loop is emptied, but only one of the original edges will be
added back, which means that potentially several blocks that still
have it on their list of successors won't be on its list of
predecessors anymore, causing all sorts of hilarity due to the
inconsistency in the control flow graph.

The solution is to remove the code that's removing valid edges from
the CFG.  cfg_t::remove_block() will already clean up after itself.
The assert in bblock_t::combine_with() also needs to be removed since
we will be merging a block with multiple children into the first one
of them.
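
The inconsistency described above can be stated as an invariant: every
successor edge A -> B must be mirrored by B listing A as a predecessor.  A
minimal sketch of such a check, using a hypothetical fixed-size struct block
rather than the real bblock_t and cfg_t:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EDGES 4

/* Hypothetical basic block, for this sketch only. */
struct block {
   struct block *succs[MAX_EDGES];
   struct block *preds[MAX_EDGES];
   size_t num_succs, num_preds;
};

static bool has_pred(const struct block *b, const struct block *p)
{
   for (size_t i = 0; i < b->num_preds; i++)
      if (b->preds[i] == p)
         return true;
   return false;
}

/* Records the edge from -> to on both sides so the lists stay in sync. */
static void add_edge(struct block *from, struct block *to)
{
   assert(from->num_succs < MAX_EDGES && to->num_preds < MAX_EDGES);
   from->succs[from->num_succs++] = to;
   to->preds[to->num_preds++] = from;
}

/* True if every successor edge has a matching predecessor entry. */
static bool cfg_edges_consistent(struct block *const *blocks, size_t n)
{
   for (size_t i = 0; i < n; i++)
      for (size_t j = 0; j < blocks[i]->num_succs; j++)
         if (!has_pred(blocks[i]->succs[j], blocks[i]))
            return false;
   return true;
}
```

Emptying a block's predecessor list and adding back only one of its original
edges, as the old code did, makes this check fail.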

Found the issue on a hardware enabling branch originally, but
apparently somebody reproduced the same problem independently on
master in the meantime.

Fixes: d13bcdb3a9 ("i965/fs: Extend predicated break pass to predicate WHILE.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009
Cc: jiradet.jd@gmail.com
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 54fbc625ea)
2019-08-02 07:00:31 +00:00
Eric Engestrom
8f3935b1ac nir: remove explicit nir_intrinsic_index_flag values
These were left after a rebase and happen to make
NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it
was noticed.

Fixes: 6f20643b47 ("nir: Allow qualifiers on copy_deref and image instructions")
Cc: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 5d7bcac4e7)
2019-08-01 07:59:12 +00:00
Emil Velikov
b4f52b1567 egl/drm: ensure the backing gbm is set before using it
Currently, if we error out before gbm_dri is set (say due to a different
name of the backing GBM implementation, or otherwise) the tear down will
trigger a NULL ptr deref and crash out.

Move the gbm_dri initialization as early as possible.

v2: Drop check in dri2_teardown_drm (Eric)

Reported-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 72b97ad9b2)
2019-08-01 07:57:55 +00:00
Jason Ekstrand
a42361cdb2 intel/fs: Implement quad_swap_horizontal with a swizzle on gen7
This fixes dEQP-VK.subgroups.quad.compute.subgroupquadswaphorizontal_*
on all gen7 platforms.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8fd2f2c276)
2019-07-31 08:12:46 +00:00
Jason Ekstrand
f522c7ca9e intel/fs: Use ALIGN16 instructions for all derivatives on gen <= 7
The issue here was discovered by a set of Vulkan CTS tests:

    dEQP-VK.glsl.derivate.*.dynamic_*

These tests use ballot ops to construct a branch condition that takes
the same path for each 2x2 quad but may not be uniform across the whole
subgroup.  They then test that derivatives work and give the correct
value even when executed inside such a branch.  Because the derivative
isn't executed in uniform control-flow and the values coming into the
derivative aren't smooth (or worse, linear), they nicely catch bugs that
aren't uncovered by simpler derivative tests.

Unfortunately, these tests require Vulkan and the equivalent GL test
would require the GL_ARB_shader_ballot extension which requires int64.
Because the requirements for these tests are so high, it's not easy to
test on older hardware and the bug is only proven to exist on gen7;
gen4-6 are a conjecture.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 499d760c6e)
2019-07-31 08:06:48 +00:00
Eric Engestrom
ac7f03caed scons+meson: suppress spammy build warning on MacOS
Originally introduced in c7f3657450 ("darwin: Suppress type
conversion warnings for GLhandleARB") to fix Bugzilla #66346 [1], this
workaround was never ported to Scons or Meson.

[1] https://bugs.freedesktop.org/66346

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit bf8b5de6b9)
2019-07-31 08:02:16 +00:00
Bas Nieuwenhuizen
b1d66aa9ee radv: Fix descriptor set allocation failure.
Set all the handles to VK_NULL_HANDLE:

"If the creation of any of those descriptor sets fails, then the implementation
must destroy all successfully created descriptor set objects from this command,
set all entries of the pDescriptorSets array to VK_NULL_HANDLE and return the
error."

(Vulkan 1.1.117 Spec, section 13.2)
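
A minimal sketch of the rule quoted above.  alloc_set() and alloc_sets() are
hypothetical stand-ins built on malloc/free, not radv functions, and the
fail_at parameter exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator: fails when index == fail_at. */
static void *alloc_set(int index, int fail_at)
{
   return index == fail_at ? NULL : malloc(16);
}

/* Returns 0 on success and -1 on failure.  On failure, every set created
 * by this call is destroyed and every entry of 'sets' is set to NULL. */
static int alloc_sets(void **sets, int count, int fail_at)
{
   int i;

   assert(sets != NULL);
   for (i = 0; i < count; i++) {
      sets[i] = alloc_set(i, fail_at);
      if (!sets[i])
         break;
   }
   if (i == count)
      return 0;

   /* Destroy what was created, then clear the whole output array,
    * including entries that were never reached. */
   for (int j = 0; j < i; j++)
      free(sets[j]);
   for (int j = 0; j < count; j++)
      sets[j] = NULL;
   return -1;
}
```

Clearing the whole array, not just the entries allocated so far, means the
caller never sees stale handles left over from before the call.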

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2b53c49d2f)
2019-07-31 08:00:27 +00:00
Lionel Landwerlin
d06ccdf9dd spirv: don't discard access set by vtn_pointer_dereference
We can have an access flag already set here, so just augment the
existing ones.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0fb61dfdeb ("spirv: propagate access qualifiers through ssa & pointer")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 7deb5ec0e8)
2019-07-31 07:58:43 +00:00
Andres Rodriguez
ad72ce1ad7 radv: fix queries with WAIT_BIT returning VK_NOT_READY
When vkGetQueryPoolResults() is called with VK_QUERY_RESULT_WAIT_BIT
set, the driver is supposed to wait for the query to become available
before returning.

Currently, radv returns once the query is indeed ready, but it returns
VK_NOT_READY. It also fails to populate the results.

The problem is a missing volatile in the secondary check for query
availability. This patch removes the secondary check altogether since it
is redundant with the preceding loop.

This bug was found with an unreleased version of SteamVR.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 2b71b4e793)
2019-07-31 07:52:52 +00:00
Andrii Simiklit
23eebaf2ec meson: add a warning for meson < 0.46.0
This could help make somebody aware of the meson issue:
https://github.com/mesonbuild/meson/pull/3274
As a result of that issue, NDEBUG won't be defined even if b_ndebug is
true and buildtype is release.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109791
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-07-30 11:44:55 +03:00
Eric Anholt
3ec136d583 freedreno: Fix data races with allocating/freeing struct ir3.
There is a single ir3_compiler in the screen, and each context may be
compiling ir3 shaders, which call ir3_create.  ralloc doesn't do any
locking on its own, so eventually you can end up racing to break
ralloc's linked lists.

We really don't want struct ir3 to live as long as the compiler (maybe
struct ir3_shader's lifetime, if anything), so you'd better be freeing
it anyway.

Fixes: 8fe2076243 ("freedreno/ir3: convert over to ralloc")
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 6e3b220ad3)
2019-07-30 08:33:26 +00:00
Bas Nieuwenhuizen
8fbadb152c radv: Take variable descriptor counts into account for buffer entries.
Fixes: b5e04e9217 "radv: Support allocating variable size descriptor sets."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit aac492901a)
2019-07-30 08:32:07 +00:00
Jason Ekstrand
b1df082b00 anv: Don't claim support for 24 and 48-bit formats on IVB
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 99d04a5bd6)
2019-07-30 08:31:00 +00:00
Jason Ekstrand
66ee5bd082 isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW
On Haswell, the format works but it doesn't properly do an sRGB decode.
It appears to act identically to R8G8B8_UNORM.  Only Vulkan uses this
format so this only affects Vulkan on HSW.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 7c1b39cf18)
2019-07-30 08:29:37 +00:00
Rhys Perry
7364cb04c5 ac/nir: fix txf_ms with an offset
Seems to fix some hair artifacts in Max Payne 3:
https://github.com/daniel-schuermann/mesa/issues/76

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f4e499ec79 ('radv: add initial non-conformant radv vulkan driver')
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit a9f58af454)
2019-07-30 08:27:47 +00:00
Lionel Landwerlin
83d17d5730 spirv: propagate access qualifiers through ssa & pointer
Not only variables can be flagged as NonUniformEXT but also
expressions. We're currently ignoring it in an expression such as :

   imageLoad(data[nonuniformEXT(rIndex)], 0)

The associated SPIRV :

   OpDecorate %69 NonUniformEXT
   ...
   %69 = OpLoad %61 %68

This change propagates access qualifiers through ssa & pointers so
that when it hits OpLoad/OpStore style instructions, qualifiers are
not forgotten.

Fixes failures in the following tests:

   dEQP-VK.descriptor_indexing.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8ed583fe52 ("spirv: Handle the NonUniformEXT decoration")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 0fb61dfdeb)
2019-07-30 08:25:46 +00:00
Lionel Landwerlin
0801a8b906 spirv: wrap push ssa/pointer values
This refactor allows for common code to apply decoration on all
ssa/pointer values. In particular this will allow propagating access
qualifiers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 86b53770e1)
[Lionel Landwerlin: patch adapted for 19.1 branch]
2019-07-30 08:23:19 +00:00
Connor Abbott
e1fdca7492 nir: Allow qualifiers on copy_deref and image instructions
In the next commit, we'll properly handle access qualifiers on struct
members by propagating them to load/store instructions, but these
instructions had no way to specify the qualifier.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 6f20643b47)
2019-07-30 08:18:49 +00:00
Caio Marcelo de Oliveira Filho
57fc7a23e1 anv: Remove special allocation for anv_push_constants
The key reason for that mechanism is gone: all the extra optional data
that could be in the anv_push_constants was moved elsewhere.  At this
point, just put anv_push_constants directly in anv_cmd_state (part of
anv_cmd_buffer).

v2: Remove a NULL check we don't need anymore in
    anv_cmd_buffer_push_constants().  (Lionel)
    Fix size we consider for valid push params.  (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit f7d53fffa2)
2019-07-30 08:18:49 +00:00
Ilia Mirkin
630a2e4d97 nv50/ir: handle insn not being there for definition of CVT arg
This can happen if it's e.g. a uniform or a function argument.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3e468ff2fe)
2019-07-29 11:08:14 +00:00
Ilia Mirkin
5f640b4692 nvc0: allow a non-user buffer to be bound at position 0
Previously the code only handled it for positions 1 and up (as would be
for UBO's in GL). It's not a lot of trouble to handle this, and vl or
vdpau want this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9f8ed5aa67)
2019-07-29 11:01:18 +00:00
Ilia Mirkin
645462fe85 nv50,nvc0: update sampler/view bind functions to accept NULL array
Apparently vl (or vdpau) wants to pass that in now. Handle it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c52b057e00)
2019-07-29 10:56:22 +00:00
Ilia Mirkin
e671e68238 gallium/vl: fix compute tgsi shaders to not process undefined components
This caused nouveau's function handling logic to think that the MAIN
function was due to receive external parameters, and cascaded some
failures after that. Instead avoid having the undefined components in
the first place.

Fixes: f6ac0b5d71 (gallium/auxiliary/vl: Add compute shader to support video compositor render)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit face27fdc5)
2019-07-29 10:53:54 +00:00
Boyuan Zhang
b521c3c0c8 radeon/vcn: enable rate control for hevc encoding
Set cu_qp_delta_enable_flag on when rate control is enabled, and set it
off when rate control is disabled (e.g. constant qp).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: fix typo and add bugzilla info

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
(cherry picked from commit b0626c1f30)
2019-07-29 10:51:48 +00:00
Boyuan Zhang
5c7cffe1d4 radeon/uvd: enable rate control for hevc encoding
Set cu_qp_delta_enable_flag on when rate control is enabled, and set it
off when rate control is disabled (e.g. constant qp).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: fix typo and add bugzilla info

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
(cherry picked from commit 5115c25bb8)
2019-07-29 10:50:33 +00:00
Boyuan Zhang
e2568bc6e4 radeon/vcn: fix poc for hevc encode
MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.

Also use poc value passed from st instead of calculation
in slice header encoding.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: Fix typo

V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb
should be power of 2 according to spec.
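
The two constraints above (at least 16, and a power of 2) can be sketched as
follows.  MAX2 is defined locally to mirror the macro named in this message,
clamp_max_poc_lsb() is a hypothetical helper rather than the driver code, and
the upper bound of 1 << 16 is an assumption made to keep the loop bounded.

```c
#include <assert.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Returns the smallest power of two that is >= 16 and >= requested. */
static unsigned clamp_max_poc_lsb(unsigned requested)
{
   unsigned v, pow2 = 16;

   assert(requested <= (1u << 16));   /* assumed bound for this sketch */
   v = MAX2(requested, 16u);
   while (pow2 < v)
      pow2 <<= 1;
   return pow2;
}
```

Values that already satisfy both constraints pass through unchanged, smaller
values become 16, and anything else is rounded up to the next power of 2.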

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
(cherry picked from commit 9aaf3aaf5d)
2019-07-29 10:47:57 +00:00
Boyuan Zhang
7470b25b2b radeon/uvd: fix poc for hevc encode
MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.

Also use poc value passed from st instead of calculation
in slice header encoding.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: Fix typo

V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb
should be power of 2 according to spec.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
(cherry picked from commit 77cf700fa3)
2019-07-29 10:46:12 +00:00
Lionel Landwerlin
c45c624dce nir: add access to image_deref intrinsics
SPIRV added the ability to flag variable accesses and expressions as
not dynamically uniform, and because spirv_to_nir generates deref
instructions, we'll need to have that access there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 8c330728f3)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/compiler/nir/nir.c
2019-07-29 10:23:45 +02:00
Mark Menzynski
2098b48fa0 nvc0/ir: Fix assert accessing null pointer
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111007
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111167

Signed-off-by: Mark Menzynski <mmenzyns@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.klausmann@freenet.de>
(cherry picked from commit 7493fbf032)
2019-07-26 15:10:50 +00:00
Jason Ekstrand
eb24e60cdc anv: Disable transform feedback on gen7
It's totally implementable, it's just that the plumbing is a bit
different and we never hooked it up.  Don't advertise a broken feature.

Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"
(cherry picked from commit 295e5a17da)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/intel/vulkan/anv_extensions.py
2019-07-26 09:35:42 +02:00
Bas Nieuwenhuizen
204a36f270 radv: Set correct metadata size for GFX9+.
Without correct size, radeonsi assumes the metadata is incorrect,
which can and will cause issues.

Since the metadata is really incorrect without the size, let us
fix that.

Fixes: e43cc3e3af "radv/gfx9: handle GFX9 opaque metadata"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 7e1fe81f56)
2019-07-26 07:32:34 +00:00
Arcady Goldmints-Orlov
2329b87ff4 anv: report HOST_ALLOCATION as supported for images
Report VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT as
supported for images. It was being shown supported for buffers, but not
images.

Fixes: 69cc6272fb ("anv: Implement VK_EXT_external_memory_host")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 832cedfdee)
2019-07-26 07:29:18 +00:00
Daniel Schürmann
742f348d32 spirv: Fix order of barriers in SpvOpControlBarrier
Semantically, the memory barrier has to come first to wait
for the completion of pending memory requests.
Afterwards, the workgroups can be synchronized.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit e352b4d650)
2019-07-25 09:41:06 +00:00
Nicolas Dufresne
327a6b3a64 egl: Also query modifiers when exporting DMABuf
This fixes eglExportDMABUFImageQueryMESA() so it will report the
modifiers of the underlying image. Without this information,
re-importing will likely be broken as it is rare these days that no
modifiers are used.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Fixes: 8f7338f284 ("egl: add initial EGL_MESA_image_dma_buf_export v2.4")
(cherry picked from commit 08f1cefecd)
2019-07-25 09:02:00 +00:00
Yevhenii Kolesnikov
4bb56fdd46 main: Fix memleaks in mesa_use_program
Add freeing of SubroutineIndexes to the _mesa_free_shader_state.

Fixes: 4566aaaa5b ("mesa/subroutines: start adding per-context
subroutine index support (v1.1)")
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 882fe09a74)
2019-07-25 08:34:56 +00:00
Andrii Simiklit
61117d653e intel/compiler: don't use a keyword struct for a class fs_reg
warning: struct 'fs_reg' was previously declared as a class
Fixes: e64be391 ("intel/compiler: generalize the combine constants pass")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
(cherry picked from commit fa2fc68de1)
2019-07-25 08:14:29 +00:00
Eric Engestrom
97cfb89b73 gallium+mesa: fix tgsi_semantic array type
Fixes: ed23335a31 ("gallium: use enums in p_shader_tokens.h (v2)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e7e31b18d6)
2019-07-24 10:38:00 +00:00
Eric Engestrom
d4a64ad09b util: fix no-op macro (bad number of arguments)
Fixes: b8e077daee ("util: no-op __builtin_types_compatible_p() for non-GCC compilers")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit f986741a91)
2019-07-24 10:29:18 +00:00
Dylan Baker
e9a284e8d0 meson: allow building all glx without any drivers
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111016
Fixes: a47c525f32
       ("meson: build glx")
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 7cf50af6f5)
2019-07-24 10:09:37 +00:00
Lionel Landwerlin
dccd75b60c anv: fix use of comma operator
This doesn't fix any bug at the moment because the next statement is
'true' which happens to be APIMODE_D3D, but if that changes it could.

The Fixes tag is as far back as I could go, but the error predates it
(2016 is probably far enough).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8db6f2e6eb ("anv/pipeline: Roll genX_pipeline_util.h into genX_pipeline.c")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 772a5f9814)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/intel/vulkan/genX_pipeline.c
2019-07-24 12:06:57 +02:00
Eric Engestrom
aff5714c65 nir: don't return void
Fixes: 14531d676b ("nir: make nir_const_value scalar")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 3acc4278ad)
2019-07-24 10:02:48 +00:00
Dave Airlie
9305d9b142 st/nir: fix arb fragment stage conversion
The comment even justifies the wrongness wrongly.

We should be translating to pipe values properly here or else
fragment maps to tess ctrl.

Fixes: 3d7611e9a6 ("st/nir: use NIR for asm programs")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 365f24705f)
2019-07-23 11:50:52 +00:00
Kenneth Graunke
2570ee28f5 egl: Only expose 565 pbuffer configs if X can export them as DRI3 images
Glamor in xorg-server 1.20 cannot expose 16bpp pixmaps when running in
the usual 24bpp mode.  This meant our 565 pbuffer configs would
ultimately fail to create a backing pixmap, leading to crashes.

To hack around this, make a 16bpp pixmap and try and export it.
If it works, expose the configs.  Otherwise, just skip them.

This also disables them on DRI2.  These configs were only added to pass
conformance requirements, and I doubt anybody cares about testing out
565 pbuffer visuals on DRI2-only drivers.

v2: Don't leak the fds (caught by Eric Anholt)
v3: Don't free(fds), it's not malloc'd

Fixes: dacb11a585 ("egl: Add a 565 pbuffer-only EGL config under X11.")
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 82607f8a90)
2019-07-23 11:49:37 +00:00
Kenneth Graunke
f8c0b90f99 egl: Make the 565 pbuffer-only config single buffered.
In commit dacb11a585, Eric found the first
matching 565 pbuffer config, and stopped.  Our double-buffered configs
come first in the list, so we added that, making a pbuffer-only config
that claimed to be double buffered.  This doesn't make sense, since
pixmaps/pbuffers are fundamentally not double buffered.

When using that config, every call to eglCreatePbufferSurface would fail
with EGL_BAD_MATCH.  The call chain looks like this:

   - eglCreatePbufferSurface
   - dri3_create_pbuffer_surface
   - dri3_create_surface
   - dri2_get_dri_config

which eventually does:

   const bool double_buffer = surface_type == EGL_WINDOW_BIT;

and then fails to find a matching config, because it ends up looking
for a single-buffered config - and there aren't any.

To fix this, make the 565 pbuffer config single-buffered.  This fixes
at least 51 dEQP-EGL.* tests.
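
A minimal sketch of the failing lookup, assuming a simplified config table (the demo_* types and function are hypothetical and are not the dri2_get_dri_config implementation): the requested buffering is derived from the surface type, so a pbuffer request can never match a config that is only listed as double buffered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical surface-type bits, standing in for the EGL ones. */
enum demo_surface_type { DEMO_WINDOW_BIT = 1, DEMO_PBUFFER_BIT = 2 };

struct demo_config {
    int id;
    bool double_buffer;
};

/* Return the first config whose buffering matches what the surface
 * type implies, or NULL if there is none. */
static const struct demo_config *
demo_find_config(const struct demo_config *configs, size_t count,
                 enum demo_surface_type surface_type)
{
    const bool double_buffer = surface_type == DEMO_WINDOW_BIT;

    assert(configs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (configs[i].double_buffer == double_buffer)
            return &configs[i];
    }
    return NULL;
}
```

If the table holds only double-buffered entries, a DEMO_PBUFFER_BIT lookup returns NULL, which corresponds to the EGL_BAD_MATCH described above; adding a single-buffered entry makes the lookup succeed.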

Fixes: dacb11a585 ("egl: Add a 565 pbuffer-only EGL config under X11.")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 6ad31c4ff3)
2019-07-23 11:47:55 +00:00
Kenneth Graunke
43f62d2003 egl: Quiet warning about front buffer rendering for pixmaps/pbuffers
pbuffer configs cause a million of these warnings to trigger, but
when using pixmaps or pbuffers, there is only one surface, so this
warning doesn't make much sense.  Retain it for window surfaces for now.

Fixes: dacb11a585 ("egl: Add a 565 pbuffer-only EGL config under X11.")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit fc21394bc4)
2019-07-23 11:46:43 +00:00
Kenneth Graunke
be12174820 mesa: Fix ReadBuffers with pbuffers
pbuffers are internally single-buffered.  Marek fixed DrawBuffers to
handle this case, but we need to fix ReadBuffers too.  Otherwise,
pretty much every conformance test fails because glReadPixels breaks.

v2: Refactor the switch into a helper (suggested by Eric Anholt)

Fixes: 35294f2eca ("mesa: fix pbuffers because internally they are front buffers")
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 78164a3a6c)
2019-07-23 11:45:31 +00:00
Jason Ekstrand
3cd11985c0 intel/fs: Stop stack allocating large arrays
Normally, we haven't worried too much about stack sizes as Linux tends
to be fairly friendly towards large stacks.  However, when running DXVK
apps under wine, we're suddenly subject to Windows' more stringent stack
limitations and can run out of space more easily.  In particular, some
of the shaders in Elite Dangerous: Horizons have quite a few registers
and the arrays in split_virtual_grfs are large enough to blow a 1 MiB
stack leading to crashes during shader compilation.
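
The general technique can be sketched as follows, assuming a hypothetical remap table (demo_alloc_remap is illustrative and is not the split_virtual_grfs code): the array whose size depends on the register count is placed on the heap, so its size no longer counts against the thread's stack limit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: build an identity remap table for `count`
 * registers on the heap instead of in a stack array. The caller owns
 * the result and must free() it. Returns NULL on allocation failure. */
static int *demo_alloc_remap(size_t count)
{
    assert(count > 0);

    int *remap = malloc(count * sizeof(*remap));
    if (remap == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        remap[i] = (int)i;

    return remap;
}
```

A table of 2^20 ints (4 MiB) would overflow a 1 MiB stack if declared as a local array, but it is an ordinary heap allocation here.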

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108662
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit fa63fad333)
2019-07-23 11:43:54 +00:00
Nataraj Deshpande
87efbe488e egl/android: Update color_buffers querying for buffer age
color_buffers[] is currently hard coded to 3 for android which fails
in droid_window_dequeue_buffer when ANativeWindow creates color_buffers
>3 while querying buffer age during dEQP partial_update tests on chromeOS.

The patch removes static color_buffers[], queries for MIN_UNDEQUEUED_BUFFERS,
sets native window buffer count and allocates the correct number of
color_buffers as per android.

Fixes dEQP-EGL.functional.partial_update* tests on Chromebooks when
EGL_KHR_partial_update is enabled.

v2: update comment instead of removing (Eric Engestrom)
v3: change static array to dynamic allocated color_buffers
    querying MIN_UNDEQUEUED_BUFFERS (Chia-I Wu olv@chromium.org)

Fixes: 2acc69da8c "EGL/Android: Add EGL_EXT_buffer_age extension"
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
(cherry picked from commit 0661c357c6)
2019-07-23 11:42:30 +00:00
Samuel Pitoiset
e1800b20f4 radv: fix crash in vkCmdClearAttachments with unused attachment
depth_stencil_attachment and/or ds_resolve attachment can be NULL.

This fixes crashes with
dEQP-VK.renderpass.suballocation.unused_clear_attachments.*

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit b5116d3cb7)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_meta_clear.c
2019-07-23 13:40:12 +02:00
Juan A. Suarez Romero
33e57d0ace docs: add sha256 checksums for 19.1.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-07-23 11:18:10 +00:00
Juan A. Suarez Romero
09a1b2bdba docs: add release notes for 19.1.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-07-23 11:07:52 +00:00
Juan A. Suarez Romero
58e93aef96 Update version to 19.1.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-07-23 11:04:20 +00:00
Dave Airlie
f17ff71f49 radv: fix crash in shader tracing.
Enabling tracing and then hitting a VM fault can lead to a segfault
before we print out the traces: if a meta shader is executing,
we don't have the NIR for it.

Just pass the stage and give back a default.

Fixes: 9b9ccee4d6 ("radv: take LDS into account for compute shader occupancy stats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2ac2b98780)
2019-07-19 08:40:05 +00:00
Samuel Pitoiset
d86b14ecbb radv: fix VGT_GS_MODE if VS uses the primitive ID
Found by inspection.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 63d670e350)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_pipeline.c
2019-07-19 10:38:00 +02:00
Samuel Iglesias Gonsálvez
900bcab48b anv: fix alphaToCoverage when there is no color attachment
There are tests in CTS for alpha to coverage without a color attachment
that are failing. This happens because we remove the shader color
outputs when we don't have a valid color attachment for them, but when
alpha to coverage is enabled we still want to preserve the output
at location 0 since we need the alpha component. In that case we will
also need to create a null render target for RT 0.

v2:
  - We already create a null rt when we don't have any, so reuse that
    for this case (Jason)
  - Simplify the code a bit (Iago)

v3:
  - Take alpha to coverage from the key and don't tie this to depth-only
    rendering only, we want the same behavior if we have multiple render
    targets but the one at location 0 is not used. (Jason).
  - Rewrite commit message (Iago)

v4:
  - Make sure we take into account the array length of the shader outputs,
    which we were not handling correctly either and make sure we also
    create null render targets for any invalid array entries too.

v5:
  - Simplify removal of unused outputs by using rt_used[] so we don't have
    to special case alpha to coverage there too.

Fixes the following CTS tests:
dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.*

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bc66cebc0d)
2019-07-18 16:30:25 +00:00
Lionel Landwerlin
0b1ee72bbc anv: fix format mapping for depth/stencil formats
anv_format is supposed to have a pointer back to the associated
VkFormat, we were missing this for depth/stencil formats.

This doesn't fix anything afaict, but will be needed for future
changes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 465de47bad ("anv: associate vulkan formats with aspects")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3adc32df92)
2019-07-18 08:36:51 +00:00
Lepton Wu
3dea2e2ffc virgl: Set meta data for textures from handle.
The set of meta data was removed by commit 8083464. It broke lots of
dEQP tests when running with pbuffer surface type.

Fixes: 8083464013 ("virgl: remove dead code")
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
(cherry picked from commit 6109df58e4)
2019-07-18 08:35:40 +00:00
Bas Nieuwenhuizen
1527d02acb radv: Only save the descriptor set if we have one.
After reset, if valid does not contain the relevant bit the descriptor
can be != NULL but still not be valid.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f1a8967344)
2019-07-18 08:32:06 +00:00
Lionel Landwerlin
d578b42e34 anv: report timestampComputeAndGraphics true
The spec says:

   "timestampComputeAndGraphics specifies support for timestamps on all
    graphics and compute queues. If this limit is set to VK_TRUE, all
    queues that advertise the VK_QUEUE_GRAPHICS_BIT or
    VK_QUEUE_COMPUTE_BIT in the VkQueueFamilyProperties::queueFlags
    support VkQueueFamilyProperties::timestampValidBits of at least 36."

On gen7+ this should be true (we only have 32bits of timestamp on
gen6 and below).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 802f00219a ("anv/device: Update features and limits")
Reported-by: Timothy Strelchun <timothy.strelchun@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ce4c5474af)
2019-07-18 08:29:59 +00:00
Lionel Landwerlin
a612f0210a vulkan/wsi: update swapchain status on vkQueuePresent
With the following chain of events :

   vkQueuePresent()
   <- Surface resize
   vkQueuePresent()

We should be able to report SUBOPTIMAL or OUT_OF_DATE on the second
vkQueuePresent() call. Currently we only look at X11 events in the
vkAcquireNextImage() path so we're not able to report this.

This change checks the queue of events and processes any available ones
to update the swapchain status.

v2: Be consistent about reporting the current error state of the
    swapchain (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111097
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6f880f128f)
2019-07-18 08:27:58 +00:00
Jason Ekstrand
7a072f1f39 nir/loop_analyze: Properly handle swizzles in loop conditions
This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for
all intermediate values so that we can properly handle swizzles.  Even
though if conditions are required to be scalars, they may still consume
swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your
loop termination condition.  The old code would just bail the moment it
saw its first non-zero swizzle but we can now properly chase the scalar
from the if condition all the way to a, b, and c.
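
To make the chasing concrete, here is a simplified sketch. It assumes swizzles are stored as plain component-index arrays, and demo_chase_component is a hypothetical helper, not the nir_ssa_scalar API. Each step maps the component being read to the component it selects one level further down.

```c
#include <assert.h>
#include <stddef.h>

/* Follow one result component back through a chain of swizzles.
 * chain[0] is the swizzle applied last (closest to the use); each
 * entry maps an output component to the input component it reads. */
static unsigned demo_chase_component(const unsigned char (*chain)[4],
                                     size_t depth, unsigned comp)
{
    for (size_t i = 0; i < depth; i++) {
        assert(comp < 4);
        comp = chain[i][comp];
    }
    return comp;
}
```

For the example condition, the outer .y selects component 1. Chasing it through .xz and then .yzw lands on a.w, through .xz and then .zzx lands on b.x, and through .xx lands on c.x, so the scalar actually tested is (a.w < b.x) && c.x.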

Shader-db results on Kaby Lake:

    total loops in shared programs: 4388 -> 4364 (-0.55%)
    loops in affected programs: 29 -> 5 (-82.76%)
    helped: 29
    HURT: 5

Shader-db results on Haswell:

    total loops in shared programs: 4370 -> 4373 (0.07%)
    loops in affected programs: 2 -> 5 (150.00%)
    helped: 2
    HURT: 5

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit ff972c7a3a)
2019-07-18 08:24:56 +00:00
Jason Ekstrand
b685e303f7 nir: Add some helpers for chasing SSA values properly
There are various cases in which we want to chase SSA values through ALU
ops ranging from hand-written optimizations to back-end translation
code.  In all these cases, it can be very tricky to do properly because
of swizzles.  This set of helpers lets you easily work with a single
component of an SSA def and chase through ALU ops safely.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 8f7405ed9d)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/compiler/nir/nir.h
2019-07-18 08:22:26 +00:00
Jason Ekstrand
b9b376b821 nir/loop_analyze: Refactor detection of limit vars
This commit reworks both get_induction_and_limit_vars() and
try_find_trip_count_vars_in_iand to return true on success and not
modify their output parameters on failure.  This makes their callers
significantly simpler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 0333649e63)
2019-07-18 08:20:12 +00:00
Gert Wollny
fde2473a06 softpipe: Remove unused static function
Thanks to Eric Engestrom for pointing out that there was something wrong
with that function.

Fixes: 724a73509e
  softpipe: Prepare handling explicit gradients

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 9c611fb381)
2019-07-17 08:22:59 +00:00
Jason Ekstrand
b43e2d5a12 nir/regs_to_ssa: Handle regs in phi sources properly
Sources of phi instructions act as if they occur at the very end of the
predecessor block not the block in which the phi lives.  In order to
handle them correctly, we have to skip phi sources on the normal
instruction walk and handle them as a separate walk over the successor
phis.  While registers in phi instructions are a bit of an oddity, they can
happen when we temporarily go out-of-SSA for control-flow manipulations.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 6fb685fe4b)
2019-07-17 08:17:29 +00:00
Yevhenii Kolesnikov
cffebf6f57 meta: leaking of BO with DrawPixels
ctx->Unpack.BufferObj wasn't unreferenced.

Fixes: d492e7b017 (meta: Fix invalid PBO access from DrawPixels when
trying to just alloc.)
CC: Eric Anholt <eric@anholt.net>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3853871ef8)
2019-07-17 08:14:46 +00:00
Jason Ekstrand
3a27a5b989 anv: Account for dynamic stencil write disables in the PMA fix
In 6ce8592836 we started looking at the dynamic stencil state and
disabling stencil writes when the stencil mask is zero.  Unfortunately,
we never updated the PMA fix code accordingly so 3DSTATE_WM_DEPTH_STENCIL
and the PMA fix were getting out-of-sync causing hangs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109203
Fixes: 6ce8592836 "anv: Disable stencil writes when both write..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 6a441151c2)
2019-07-17 08:12:37 +00:00
Sergii Romantsov
43682f0c6f meta: memory leak of CopyPixels usage
Meta of CopyPixel generates a buffer object
but does not free it on cleanup.

Fixes: 37d11b13ce (meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7417b43211)
2019-07-17 08:10:41 +00:00
Caio Marcelo de Oliveira Filho
6ba4ce97b7 spirv: Fix stride calculation when lowering Workgroup to offsets
Use alignment to calculate the stride associated with the pointer
types.  That stride is used when the pointers are cast to arrays.

Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b;
} will have an element size of 12 bytes, but the stride needs
to be 16 bytes to respect the 8 byte alignment.
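
The arithmetic can be sketched as rounding the size up to the alignment, assuming power-of-two alignments (demo_align_up is a hypothetical helper, not the function used in the SPIR-V front end):

```c
#include <assert.h>

/* Round `value` up to the next multiple of `alignment`.
 * Assumes `alignment` is a non-zero power of two. */
static unsigned demo_align_up(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With a size of 12 bytes and an alignment of 8 bytes this gives a stride of 16 bytes, matching the struct example above.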

Fixes: 050eb6389a "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 026cfa1099)
2019-07-16 07:55:10 +00:00
Jason Ekstrand
f24507425b nir,intel: Add support for lowering 64-bit nir_opt_extract_*
We need this when doing full software 64-bit emulation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
Fixes: cbad201c2b "nir/algebraic: Add missing 64-bit extract_[iu]8..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0ba508d7a3)
2019-07-16 07:47:37 +00:00
Jason Ekstrand
cad015acb5 nir/opt_if: Clean up single-src phis in opt_if_loop_terminator
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071
Fixes: 2a74296f24 "nir: add opt_if_loop_terminator()"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 7a19e05e8c)
2019-07-16 07:36:27 +00:00
Bas Nieuwenhuizen
2c1e3692b8 anv: Add android dependencies on android.
Specifically needed for nativewindow for some VK_EXT_external_memory_android_hardware_buffers
functions, where we call into some AHardwareBuffer functions.

The legacy Android ext did not have us call into any Android function
at all and hence it was not noticed.

Fixes: 755c633b8d "anv: Fix vulkan build in meson."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit d4f0f1a6e2)
2019-07-16 07:34:36 +00:00
Lionel Landwerlin
fa9ba5e19e anv: fix crash in vkCmdClearAttachments with unused attachment
anv_render_pass_compile() turns an unused attachment into a NULL
depth_stencil_attachment pointer so check that pointer before
accessing it.
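
A minimal sketch of the guard, using hypothetical demo_* types rather than the real anv structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker for "no attachment". */
#define DEMO_ATTACHMENT_UNUSED (~0u)

struct demo_attachment_ref {
    unsigned attachment;
};

struct demo_subpass {
    /* NULL when the subpass has no depth/stencil attachment. */
    const struct demo_attachment_ref *depth_stencil_attachment;
};

/* Return the depth/stencil attachment index, or
 * DEMO_ATTACHMENT_UNUSED if the pointer is NULL. */
static unsigned demo_ds_attachment_index(const struct demo_subpass *subpass)
{
    assert(subpass != NULL);
    if (subpass->depth_stencil_attachment == NULL)
        return DEMO_ATTACHMENT_UNUSED;
    return subpass->depth_stencil_attachment->attachment;
}
```

Callers can then skip the depth/stencil clear when the returned index is DEMO_ATTACHMENT_UNUSED instead of dereferencing a NULL pointer.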

Found with updates to existing CTS tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 208be8eafa ("anv: Make subpass::depth_stencil_attachment a pointer")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit c9c8c2f7d7)
2019-07-16 07:32:45 +00:00
Vinson Lee
6df891afa6 meson: Add dep_thread dependency.
Fix this build error on Ubuntu 18.04.

/usr/bin/ld: src/util/libmesa_util.a(u_cpu_detect.c.o): undefined reference to symbol 'pthread_once@@GLIBC_2.2.5'

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110663
Suggested-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 730ceeddb5)
2019-07-15 17:31:08 +00:00
Eric Anholt
17dc693590 freedreno: Fix assertion failures in context setup in shader-db mode.
Cherry-picks a0d4d7febf upstream

The TTN path needs access to the screen to make the right decisions about
lowering, but we didn't have pctx->screen set up at fdN_prog_init time.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2019-07-15 12:43:36 +02:00
Caio Marcelo de Oliveira Filho
14a2fba722 anv: Fix pool allocator when first alloc needs to grow
When using softpin, the first allocation was not calculating the
padding and offset correctly for the case the first allocation needed
to grow.  We failed to initialize state.end right after
expanding the pool for the first time.

This is not a problem for non-softpin since there we don't use
leftover padding so the ends would re-arrange incrementally.

This fixes running dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 in
SKL -- the test uses a shader larger than the initial size for the
instruction pool.

Fixes: dfc9ab2ccd "anv/allocator: Add padding information."
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 09c4037dda)
2019-07-15 10:28:02 +00:00
Timothy Arceri
e4b7aa9e74 mesa: save/restore SSO flag when using ARB_get_program_binary
Without this the restored program will fail the pipeline validation
checks when we attempt to use an SSO program.

Fixes: c20fd744fe ("mesa: Add Mesa ARB_get_program_binary helper functions")

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111010
(cherry picked from commit 3043908ccb)
2019-07-15 10:22:42 +00:00
Jason Ekstrand
24e7db0a36 anv: Set Stateless Data Port Access MOCS
This is the MOCS setting used for the A64 stateless messages which we
sometimes use for SSBO operations.

Fixes: 48ed2a7bb0 "anv: Implement VK_EXT_buffer_device_address"
Fixes: 79fb0d27f3 "anv: Implement SSBOs bindings with GPU addr..."
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 6a2ff217b8)
2019-07-15 10:19:55 +00:00
Jason Ekstrand
28aec04659 nir/loop_analyze: Bail if we encounter swizzles
None of the current code knows what to do with swizzles.  Take the safe
option for now and bail if we see one.  This does have a small shader-db
impact but it is at least safe.

Shader-db results on Kaby Lake:

    total loops in shared programs: 4364 -> 4388 (0.55%)
    loops in affected programs: 5 -> 29 (480.00%)
    helped: 5
    HURT: 29

Shader-db results on Haswell:

    total loops in shared programs: 4373 -> 4370 (-0.07%)
    loops in affected programs: 5 -> 2 (-60.00%)
    helped: 5
    HURT: 2

Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 9a3cb6f5fe)
2019-07-15 10:17:31 +00:00
Jason Ekstrand
0b540a702a nir/loop_analyze: Handle bit sizes correctly in calculate_iterations
The current code assumes everything is 32-bit which is very likely true
but not guaranteed by any means.  Instead, use nir_eval_const_opcode to
do the calculations in a bit-size-agnostic way.  We also use the new
constant constructors to build the correct size constants.

Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 268ad47c11)
2019-07-15 10:14:43 +00:00
Jason Ekstrand
afaec581a8 nir: Add more helpers for working with const values
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit ce5581e23e)
2019-07-15 10:09:44 +00:00
Jason Ekstrand
f5e70045e1 nir/loop_analyze: Fix phi-of-identical-alu detection
One issue was that the original version didn't check that swizzles
matched when comparing ALU instructions so it could end up matching
very different instructions.  Using the nir_instrs_equal function from
nir_instr_set.c which we use for CSE should be much more reliable.
Another was that the loop assumes it will only run two iterations which
may not be true.  If there's something which guarantees that this case
only happens for phis after ifs, it wasn't documented.

Fixes: 9e6b39e1d5 "nir: detect more induction variables"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 9f7ffe41dd)
2019-07-15 10:00:59 +00:00
Jason Ekstrand
d76ab7d9fb nir/instr_set: Expose nir_instrs_equal()
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 6e984bcb92)
2019-07-15 09:57:17 +00:00
Connor Abbott
8bc7397e02 nir: Add a helper to determine if an intrinsic can be reordered
This is simple now, but we're going to be adding a few more conditions
to this later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit a1c737927c)
2019-07-15 09:34:37 +00:00
Marek Olšák
83c4597f19 radeonsi: don't set READ_ONLY for const_uploader to fix bindless texture hangs
Bindless textures can update descriptors with WRITE_DATA.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5058d62b05)
2019-07-10 11:00:51 +00:00
Lionel Landwerlin
1e3b877903 vulkan/overlay: fix crash on freeing NULL command buffer
It is legal to call vkFreeCommandBuffers() on NULL command buffers.

This fix requires eb41ce1b01 ("util/hash_table: Properly handle
the NULL key in hash_table_u64").

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4438188f49 ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a72351cc76)
2019-07-09 10:32:19 +00:00
Ian Romanick
87fc035c53 mesa: Set minimum possible GLSL version
Set the absolute minimum possible GLSL version.  API_OPENGL_CORE can
mean an OpenGL 3.0 forward-compatible context, so that implies a minimum
possible version of 1.30.  Otherwise, the minimum possible version is 1.20.
Since Mesa unconditionally advertises GL_ARB_shading_language_100 and
GL_ARB_shader_objects, every driver has GLSL 1.20... even if they don't
advertise any extensions to enable any shader stages (e.g.,
GL_ARB_vertex_shader).
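
The rule above can be sketched as a small helper, assuming a two-value API enum (the demo_* names are hypothetical and the real context setup code may be structured differently):

```c
#include <assert.h>

/* Hypothetical API kinds; only the core/non-core split matters here. */
enum demo_api { DEMO_API_OPENGL_COMPAT, DEMO_API_OPENGL_CORE };

/* Absolute minimum GLSL version, encoded as major * 100 + minor:
 * 130 for core contexts, 120 for everything else. */
static unsigned demo_min_glsl_version(enum demo_api api)
{
    assert(api == DEMO_API_OPENGL_COMPAT || api == DEMO_API_OPENGL_CORE);
    return api == DEMO_API_OPENGL_CORE ? 130 : 120;
}
```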

Converts about 2,500 piglit tests from crash to skip on NV18.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110955
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0349bc3ce2)
2019-07-09 10:29:50 +00:00
Ian Romanick
47d6b60127 nir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size
This is important because, for example nir_op_fne has
dest.dest.ssa.bit_size == 1, but the source operands can be 16-, 32-, or
64-bits.  Fixing this helps partial redundancy elimination for compares
in a few more shaders.

v2: Add unit tests for nir_opt_comparison_pre that are fixed by this
commit.

All Intel platforms had similar results.
total instructions in shared programs: 17179408 -> 17179081 (<.01%)
instructions in affected programs: 43958 -> 43631 (-0.74%)
helped: 118
HURT: 2
helped stats (abs) min: 1 max: 5 x̄: 2.87 x̃: 2
helped stats (rel) min: 0.06% max: 4.12% x̄: 1.19% x̃: 0.81%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 5.83% max: 6.06% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for instructions value: -3.08 -2.37
95% mean confidence interval for instructions %-change: -1.30% -0.85%
Instructions are helped.

total cycles in shared programs: 360959066 -> 360942386 (<.01%)
cycles in affected programs: 774274 -> 757594 (-2.15%)
helped: 111
HURT: 4
helped stats (abs) min: 1 max: 1591 x̄: 169.49 x̃: 36
helped stats (rel) min: <.01% max: 24.43% x̄: 8.86% x̃: 2.24%
HURT stats (abs)   min: 1 max: 2068 x̄: 533.25 x̃: 32
HURT stats (rel)   min: 0.02% max: 5.10% x̄: 3.06% x̃: 3.56%
95% mean confidence interval for cycles value: -200.61 -89.47
95% mean confidence interval for cycles %-change: -10.32% -6.58%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: be1cc3552b ("nir: Add nir_const_value_negative_equal")
(cherry picked from commit 0ac5ff9ecb)
2019-07-09 10:23:12 +00:00
Ian Romanick
fb2c5dd98f nir: Add unit tests for nir_opt_comparison_pre
Each test has a comment with the expected before and after NIR.  The
tests don't actually check this.  The tests only check whether or not
the optimization pass reported progress.  I couldn't think of a robust,
future-proof way to check the before and after code.

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b08d704051)
2019-07-09 10:18:37 +00:00
Ian Romanick
f6c032c615 intel/vec4: Reswizzle VF immediates too
Previously, an instruction like

mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F]

would get rewritten as

mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F]

The latter does not produce the correct result.  The VF immediate in the
second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F].  This
commit produces the former.
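
A hedged sketch of the idea, treating the immediate as four plain floats (real VF immediates use a packed restricted-float encoding, and demo_swizzle_vec4 is a hypothetical helper): when the destination moves from .xy to .yz, the immediate has to be permuted in the same way as the register source.

```c
#include <assert.h>

/* Apply a 4-component swizzle to a vector: out[i] = in[swz[i]]. */
static void demo_swizzle_vec4(const float in[4], const unsigned char swz[4],
                              float out[4])
{
    for (int i = 0; i < 4; i++) {
        assert(swz[i] < 4);
        out[i] = in[swz[i]];
    }
}
```

Applying the swizzle xxyy, i.e. {0, 0, 1, 1}, to [-1F, 1F, 0F, 0F] yields [-1F, -1F, 1F, 1F], so channels y and z now see the values that x and y saw before.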

Fixes: 1ee1d8ab46 ("i965/vec4: Reswizzle sources when necessary.")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 47c2aa5b48)
2019-07-09 10:14:02 +00:00
Chia-I Wu
e9e63bfba8 anv: fix VkExternalBufferProperties for host allocation
It was reported as unsupported previously.  It should be importable
and is compatible with itself.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Fixes: 69cc6272fb ("anv: Implement VK_EXT_external_memory_host")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5824130389)
2019-07-09 10:12:40 +00:00
Chia-I Wu
84f76533e4 anv: fix VkExternalBufferProperties for unsupported handles
compatibleHandleTypes must include the queried handle type.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit f3c7a02a62)
2019-07-09 10:11:32 +00:00
Bas Nieuwenhuizen
e0d44fd4fe radv: Handle cmask being disallowed by addrlib.
alignment=0 does weird things with align64.
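
As an illustration, assuming align64 is the usual power-of-two round-up (the demo_* helpers below are hypothetical and not copied from Mesa), an alignment of 0 makes the mask collapse to zero, so every value "aligns" to 0:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Common power-of-two round-up. With alignment == 0 the mask
 * ~(alignment - 1) is 0, so the result is always 0. */
static uint64_t demo_align64(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Guarded variant: report failure instead of aligning to 0. */
static bool demo_align64_checked(uint64_t value, uint64_t alignment,
                                 uint64_t *out)
{
    assert(out != NULL);
    if (alignment == 0)
        return false;
    *out = demo_align64(value, alignment);
    return true;
}
```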

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e46b41b3ae)
2019-07-09 10:10:28 +00:00
Lionel Landwerlin
5666f3b891 vulkan/overlay: fix command buffer stats
Begin/Reset of command buffer both reset the content of the command
buffer. Don't forget to wipe them on Begin.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4438188f49 ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit")
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 8f0f727fe4)
2019-07-09 10:09:07 +00:00
Juan A. Suarez Romero
e42399f4de docs: add sha256 checksums for 19.1.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-07-09 09:18:55 +00:00
Juan A. Suarez Romero
fe1f7b538b docs: add release notes for 19.1.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-07-09 09:09:53 +00:00
Juan A. Suarez Romero
eea0045458 Update version to 19.1.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-07-09 09:04:10 +00:00
Jason Ekstrand
77598ddfac iris: Use a uint16_t for key sizes
sizeof(struct brw_vs_prog_key) == 324.
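
A small sketch of why the width matters (the demo_* helpers are hypothetical): 324 does not fit in 8 bits, so storing it in a uint8_t keeps only the low byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Storing a size in 8 bits keeps only size % 256. */
static uint8_t demo_store_size_u8(size_t size)
{
    return (uint8_t)size;
}

/* 16 bits hold any size up to 65535 without loss. */
static uint16_t demo_store_size_u16(size_t size)
{
    assert(size <= UINT16_MAX);
    return (uint16_t)size;
}
```

For a 324-byte key the 8-bit variant returns 68 (324 - 256), while the 16-bit variant returns 324.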

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 4633298fd6)
2019-07-05 08:47:31 +00:00
Bas Nieuwenhuizen
50c3dcd2f8 radv: Fix interactions between variable descriptor count and inline uniform blocks.
Fixes: d7e6541cc7 "radv: Only allocate supplied number of descriptors when variable."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8a053254b8)
2019-07-04 10:36:29 +02:00
Juan A. Suarez Romero
202eb29e55 intel: fix wrong format usage
Do not use the view format when filling the surface state.

Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.*

Fixes: fb1350c76f ("intel: Add and use helpers for level0 extent")

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit e06bc0b166)
2019-07-04 10:35:16 +02:00
Caio Marcelo de Oliveira Filho
95cfcc3b43 spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup
From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1):

    For objects in the Uniform, StorageBuffer, or PushConstant storage
    classes, the element’s address or location is calculated using a
    stride, which will be the Base-type’s Array Stride when the Base
    type is decorated with ArrayStride. For all other objects, the
    implementation will calculate the element’s address or location.

For non-CL shaders the driver should lay out the Workgroup storage
class, so override any explicitly set ArrayStride in the shader.  This
currently fixes only the lower_workgroup_access_to_offsets case, which
is used by anv.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 050eb6389a)
2019-07-03 10:13:10 +02:00
Arfrever Frehtes Taifersar Arahesis
cb3072488c meson: Improve detection of Python when using Meson >=0.50.
Previously, on systems where multiple versions of Python 3 (e.g. 3.6 and 3.7)
are installed, the wrong version of Python 3 could have been used.

The proper fix requires availability of path() method in Meson's python
module, which has been added in Meson 0.50:
https://github.com/mesonbuild/meson/pull/4616

Distro Bug: https://bugs.gentoo.org/671308
Signed-off-by: Arfrever Frehtes Taifersar Arahesis <Arfrever@Apache.Org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

v2: - Add missing `endif` keyword (Dylan)
(cherry picked from commit b120a02b21)
2019-07-02 10:12:55 +02:00
Jory Pratt
3d0e6d3cff meson: Search for execinfo.h
Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for
execinfo.h presence, just check directly. This allows the build to work
on musl.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 10e8d46601)
2019-07-02 09:57:34 +02:00
Jory Pratt
6dca27fce6 util: Heap-allocate 256K zlib buffer
The disk cache code tries to allocate a 256 Kbyte buffer on the stack.
Since musl only gives 80 Kbyte of stack space per thread, this causes a
trap.

See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size

(In musl-1.1.21 the default stack size has increased to 128K)

[mattst88]: Original author unknown, but I think this is small enough
            that it is not copyrightable.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

(cherry picked from commit fd7b7f14d8)
2019-07-02 09:56:18 +02:00
Bas Nieuwenhuizen
334f0d3ead radv: Only allocate supplied number of descriptors when variable.
Fixes: b5e04e9217 "radv: Support allocating variable size descriptor sets."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit d7e6541cc7)
2019-07-02 09:53:58 +02:00
James Clarke
515f4b2f20 meson: GNU/kFreeBSD has DRM/KMS and requires -D_GNU_SOURCE
This is a regression from the old autotools build system.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 7389bf9761)
2019-07-01 11:43:32 +02:00
Gert Wollny
05af010f77 vl: Use CS composite shader only if TEX_LZ and DIV are supported
Enable the compute shader compositor only when TEX_LZ is supported by the driver.

v2: Also check whether DIV is supported.

https://bugs.freedesktop.org/show_bug.cgi?id=110783

Fixes: 9364d66cb7
 gallium/auxiliary/vl: Add video compositor compute shader render

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 75d8b4e795)
2019-07-01 11:18:04 +02:00
Gert Wollny
5cfbe55184 gallium: Add CAP for opcode DIV
Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able
to check this.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 843723e2f7)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/docs/source/screen.rst
	src/gallium/include/pipe/p_defines.h
2019-07-01 10:34:27 +02:00
Lionel Landwerlin
d14939925e intel/compiler: don't use byte operands for src1 on ICL
The simulator complains about using byte operands, and we also have
documentation telling us not to.

Note that add operations on bytes seem to work fine on HW (like ADD).
Using dwords operands with CMP & SEL fixes the following tests :

   dEQP-VK.spirv_assembly.type.vec*.i8.*

v2: Drop the GLK changes (Matt)
    Add validator tests (Matt)

v3: Drop GLK ref (Matt)
    Don't mix float/integer in MAD (Matt)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
BSpec: 3017
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5847de6e9a)
2019-07-01 10:13:43 +02:00
Dylan Baker
38dab50ec8 Revert "meson: Add support for using cmake for finding LLVM"
This reverts commit 5157a42765.

There is a meson bug that causes llvm to always be statically linked,
which is obviously not what we want. I haven't had time to look into it
yet, but for now let's just revert it.

(cherry picked from commit 97c2c4546c)
2019-07-01 10:00:58 +02:00
Anuj Phogat
16ba6fecb2 Revert "iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace.
This patch silences a simulator warning about it.

We don't need to add this workaround in the Linux kernel as the WA description
says it's fixed on the latest stepping.

This reverts commit 9c421d6b47.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d96cba7754)
2019-07-01 09:59:00 +02:00
Anuj Phogat
1bcdc5b4a6 Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace.
This patch silences a simulator warning about it.

We don't need to add this workaround in the Linux kernel as the WA description
says it's fixed on the latest stepping.

This reverts commit 2be60e0c73.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 387e43b52f)
2019-07-01 09:57:42 +02:00
Anuj Phogat
e17b17c2f5 Revert "i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace.
This patch silences a simulator warning about it.

We don't need to add this workaround in the Linux kernel as the WA description
says it's fixed on the latest stepping.

This reverts commit 85ecd14ef6.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7746d4edef)
2019-07-01 09:55:01 +02:00
Pierre-Eric Pelloux-Prayer
ac3c9a4195 radeon/uvd: fix calc_ctx_size_h265_main10
Left shift was applied twice.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110702

Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: <irherder@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c81c784a4a)
2019-07-01 09:53:20 +02:00
Pierre-Eric Pelloux-Prayer
22b21623f3 mesa: delete framebuffer texture attachment sampler views
When a context is destroyed the destroy_tex_sampler_cb makes sure that all the
sampler views created by that context are destroyed.
This is done by walking the ctx->Shared->TexObjects hash table.

In a multiple context environment the texture can be deleted by a different context,
so it will be removed from the TexObjects table and will prevent the above mechanism
from working.
This can result in an assertion in st_save_zombie_sampler_view because the
sampler_view owns a reference to a destroyed context.

This issue occurs in blender 2.80.

This commit fixes this by explicitly releasing the sampler views created by the
destroyed context for all texture attachments.

Fixes: 593e36f956 (st/mesa: implement "zombie" sampler views (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110944
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c37f03d464)
2019-07-01 09:51:44 +02:00
Eric Engestrom
f6c959afaa meson: bump required libdrm version to 2.4.81
dbb4457d98 started using drmDevicesEqual(), which was
introduced in libdrm 2.4.81

We could either copy the function locally, or bump the required version.
Since the function is non-trivial and 2.4.81 is old enough already,
I suggest the latter.

Fixes: dbb4457d98 ("egl: add EGL_EXT_device_drm support")
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>


(cherry picked from commit 5819bc0e5c)
2019-07-01 09:49:38 +02:00
Samuel Pitoiset
adbf808e0c radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+
These two extensions are supported on GFX8 but the throughput
of 16-bit floats/integers is the same as 32-bit. Also, shaderInt16
is only enabled on GFX9+ for the same reason, so be more consistent.

This fixes a crash with Wolfenstein II because it expects
shaderInt16 to be enabled when VK_AMD_gpu_shader_half_float is
exposed. Note that AMDVLK only enables these extensions on GFX9+.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit ef1787dbc9)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_extensions.py
2019-06-28 10:13:48 +02:00
Kenneth Graunke
d6b1b9158e gallium: Make util_copy_image_view handle shader_access
A while back, we added a new field, but failed to update the copier.
I believe iris is the only current user of the new field, and it hasn't
used the copier, so no one noticed.

Fixes: 8b626a22b2 st/mesa: Record shader access qualifiers for images
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 255c71ec07)
2019-06-28 10:06:02 +02:00
Nanley Chery
211bedcf4d isl: Don't align phys_level0_sa by block dimension
Aligning phys_level0_sa by the compression block dimension prior to
mipmap layout causes the layout of compressed surfaces to differ from
the sampler's expectations in certain cases. The hardware docs agree:

From the BDW PRM, Vol. 5, Compressed Mipmap Layout,

   The compressed mipmaps are stored in a similar fashion to
   uncompressed mipmaps [...]

   The following exceptions apply to the layout of compressed (vs.
   uncompressed) mipmaps:
      * [...]
      * The dimensions of the mip maps are first determined by applying
	the sizing algorithm presented in Non-Power-of-Two Mipmaps
	above. Then, if necessary, they are padded out to compression
	block boundaries.

The last bullet indicates that alignment should not be done for
calculating a miplevel's dimensions, but rather for determining miplevel
placement/padding. Comply with this text by removing the extra
alignment.
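
The ordering issue described above can be sketched with a simplified one-dimensional model. This is not isl's code: the function names are hypothetical, and the sizing rule is assumed to be the usual "halve per level, minimum 1" minification. Aligning level 0 to the block size before minifying can yield a larger padded miplevel than minifying first and padding afterwards.

```python
def minify(size: int, level: int) -> int:
    """Size of a miplevel under the usual halve-per-level rule (assumed)."""
    return max(1, size >> level)


def align_up(size: int, block: int) -> int:
    """Pad size up to the next multiple of the compression block size."""
    return (size + block - 1) // block * block


def level_width_align_first(width0: int, block: int, level: int) -> int:
    # Old behaviour in this model: level 0 is block-aligned before minifying.
    return align_up(minify(align_up(width0, block), level), block)


def level_width_pad_last(width0: int, block: int, level: int) -> int:
    # Behaviour matching the quoted PRM text: size first, then pad.
    return align_up(minify(width0, level), block)
```

For a 9-texel-wide surface with 4-texel blocks, level 1 comes out as 8 texels in the first variant and 4 in the second, which is the kind of mismatch with the sampler's expectations the commit describes.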

Fixes some fbo-generatemipmap-formats piglit failures on all tested
platforms (SNB-KBL).

v2:
- Note fixed platforms.
- Update some consumers via a helper function.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 02f6995d76)
2019-06-28 10:03:42 +02:00
Nanley Chery
eef57b818b intel: Add and use helpers for level0 extent
Prepare for a bug fix by adding and using helpers which convert
isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of
surface elements.

v2:
- Update iris (Ken).
- Update anv.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit fb1350c76f)
2019-06-28 10:00:53 +02:00
Kenneth Graunke
97b43a8160 iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS
This makes CompressedTexSubImage from a PBO source do proper GPU
rendering to upload instead of stalling to map the PBO source on
the CPU (then copying it on the CPU).

Thanks Bas Nieuwenhuizen for pointing out that Vulkan includes this
functionality, and to Jason Ekstrand for writing the code I adapted.
Vulkan only supports a single layer, however, and this code tries to
support multiple layers as long as it's miplevel 0.

Improves performance in Sid Meier's Civilization VI:

   Average frame time (ms):         -3.67423% +/- 1.46201% (n=5)
   99th percentile frame time (ms): -5.09910% +/- 3.87874% (n=5)

(cherry picked from commit a032a9665f)
2019-06-28 09:59:05 +02:00
Dylan Baker
421aa4d162 meson: Add support for using cmake for finding LLVM
Meson has support for using cmake as a finder for some dependencies,
including LLVM. Using cmake has a lot of advantages: it needs less meson
maintenance to keep working (even for llvm updates); it works more
sanely for cross compiles (as llvm-config is a compiled binary not a
shell script). Meson 0.51.0 also has a new generic variable getter that
can be used to get information from either cmake, pkg-config, or
config-tools dependencies, which is needed for cmake. We continue to
support using llvm-config if you don't have cmake installed, or if cmake
cannot find a suitable version.

Fixes: 0d59459432
       ("meson: Force the use of config-tool for llvm")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 5157a42765)
2019-06-28 09:45:31 +02:00
Lionel Landwerlin
a0a6df95b4 intel/compiler: fix derivative on y axis implementation
This rewrites the ddy in EXECUTE_4 mode with a loop to make it more
obvious what is going on and also sets the group each of the 4 threads
in the groups are supposed to execute.

Fixes the following CTS tests :

   dEQP-VK.glsl.derivate.dfdyfine.dynamic_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Co-Authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 2134ea3800 ("intel/compiler/fs: Implement ddy without using align16 for Gen11+")
(cherry picked from commit 836225840c)
2019-06-28 09:43:15 +02:00
Sagar Ghuge
6fbe0eea26 glsl: Fix round64 conversion function
Fix the round64 function to handle round-to-nearest-even cases, especially
for positive and negative numbers with a fractional part of 0.5.
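
For reference, round-to-nearest-even can be sketched on ordinary Python floats as below. This is only an illustration of the rounding rule the tests exercise; it is not the integer bit manipulation used by the fp64 lowering code, the function name is hypothetical, and the sketch does not preserve the sign of zero.

```python
import math


def round_half_even(x: float) -> float:
    """Round to the nearest integer; exact halves go to the even neighbour."""
    lower = math.floor(x)
    diff = x - lower
    if diff < 0.5:
        return float(lower)
    if diff > 0.5:
        return float(lower + 1)
    # Exactly halfway between two integers: choose the even one.
    return float(lower if lower % 2 == 0 else lower + 1)
```

The halfway cases are the interesting ones: 0.5 and 2.5 round down to an even value, 1.5 rounds up, and the same rule applies symmetrically to -0.5, -1.5 and -2.5.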

v2: 1) Simplify unused bits (Elie Tournier)

Fixes:
   KHR-GL45.gpu_shader_fp64.builtin.round_dvec2
   KHR-GL45.gpu_shader_fp64.builtin.round_dvec3
   KHR-GL45.gpu_shader_fp64.builtin.round_dvec4
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_double
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec2
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec3
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec4

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 06807e1948)
2019-06-26 17:32:00 +00:00
Sergii Romantsov
3e1c46f233 i965: leaking of upload-BO with push constants
Leaks may happen when any of the following VS members is enabled:
uses_firstvertex, uses_baseinstance, uses_drawid, uses_is_indexed_draw.
The call to gen6_upload_push_constants allocates
stage_state->push_const_bo. The pointer is then copied from
push_const_bo to draw_params_bo (in
brw_prepare_shader_draw_parameters, via brw_upload_data)
and a reference is taken that is never released.

Fixes leak:
 136 bytes in 1 blocks are definitely lost in loss record 6 of 13
    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    by 0xC2B64B7: bo_alloc_internal (brw_bufmgr.c:596)
    by 0xC2B6748: brw_bo_alloc (brw_bufmgr.c:672)
    by 0xC314BB3: brw_upload_space (intel_upload.c:88)
    by 0xC2EBBC5: gen6_upload_push_constants (gen6_constant_state.c:155)
    by 0xC9E4FA6: gen9_upload_vs_push_constants (genX_state_upload.c:3300)
    by 0xC2E0EDA: check_and_emit_atom (brw_state_upload.c:540)
    by 0xC2E0EDA: brw_upload_pipeline_state (brw_state_upload.c:659)
    by 0xC2E0FF1: brw_upload_render_state (brw_state_upload.c:681)
    by 0xC2C5D2D: brw_draw_single_prim (brw_draw.c:1052)
    by 0xC2C62CB: brw_draw_prims (brw_draw.c:1175)
    by 0xC488AD1: vbo_exec_vtx_flush (vbo_exec_draw.c:386)
    by 0xC485270: vbo_exec_FlushVertices_internal (vbo_exec_api.c:652)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
(cherry picked from commit 1931c97a1d)
2019-06-26 08:17:11 +00:00
Jason Ekstrand
77962816a5 anv/descriptor_set: Only write texture swizzles if we have an image view
When immutable samplers are set we call write_image_view with a NULL
image view.  This causes issues on IVB where we have to fake texture
swizzling.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999
Fixes: d2aa65eb18 "anv: Emulate texture swizzle in the shader when..."
(cherry picked from commit 0a364a4a74)
2019-06-26 07:16:56 +00:00
Ville Syrjälä
970cc023b0 anv/cmd_buffer: Reuse gen8 Cmd{Set, Reset}Event on gen7
Modern DXVK requires event support [1], but looks like it only
uses vkCmdSetEvent() + vkGetEventStatus(). So we can just
borrow the relevant code from gen8, leaving CmdWaitEvents still
unimplemented.

[1] 8c3900c533

v2: Also move CmdWaitEvents into genX_cmd_buffer.c (Jason)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6230bfeb65)
2019-06-25 16:06:33 +00:00
Rob Clark
2e83a64f64 freedreno/a5xx: fix batch leak in fd5 blitter path
Fixes: 3d198926a4 freedreno: use fd_bc_alloc_batch instead of fd_batch_create.
Signed-off-by: Rob Clark <robdclark@chromium.org>
(cherry picked from commit 927fb50727)
2019-06-25 11:47:59 +00:00
Ian Romanick
f59881898f glsl: Don't increase the iteration count when there are no terminators
Incrementing the iteration count was intended to fix an off-by-one error
when the first terminator was superseded by a later terminator.  If
there is no first terminator or later terminator, there is no off-by-one
error.  Incrementing the loop count creates one.  This can be seen in
loops like:

    do {
        if (something) {
            // No breaks or continues here.
        }
    } while (false);

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Abel Briggs <abelbriggs1@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
Fixes: 646621c66d ("glsl: make loop unrolling more like the nir unrolling path")
(cherry picked from commit ee1c69fadd)
2019-06-25 11:46:19 +00:00
Nataraj Deshpande
9171d2f19e anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format
When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform
gralloc module will select a format based on the usage flags provided by
the camera device and the other endpoint of the stream.

The patch fixes a crash in Vulkan when the test is run with the camera stream
set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED.

Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering
on chromebook with camera HAL3.

v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take
    AHARDWAREBUFFER_USAGE_CAMERA_MASK into account (Gurchetan)

Fixes: f1654fa7e3 "anv/android: support creating images from external format"
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d94fca5420)
2019-06-25 11:44:52 +00:00
Eric Anholt
e9660d3c3f freedreno: Fix up end range of unaligned UBO loads.
We need the constants uploaded to cover the NIR offset plus the size,
not the aligned-down start of our upload range plus the size.  Fixes
mistaken UBO analysis with mat3 loads.
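
A small sketch of the range computation described above. The 16-byte granularity and the function names are assumptions for illustration, not taken from the driver. When the load offset is not aligned, computing the end from the aligned-down start leaves the tail of the load outside the uploaded range.

```python
RANGE_ALIGN = 16  # assumed upload granularity in bytes (one vec4)


def align_down(value: int, alignment: int) -> int:
    return value - (value % alignment)


def upload_range_buggy(offset: int, size: int) -> tuple:
    """End computed from the aligned-down start: too short for unaligned loads."""
    start = align_down(offset, RANGE_ALIGN)
    return start, start + size


def upload_range_fixed(offset: int, size: int) -> tuple:
    """End computed from the real load offset plus the size."""
    start = align_down(offset, RANGE_ALIGN)
    return start, offset + size
```

With a hypothetical 12-byte load at byte offset 20, the first variant covers bytes 16 to 28 and misses the last four bytes of the load, while the second covers 16 to 32.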

Fixes: 893425a607 ("freedreno/ir3: Push UBOs to constant file")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 56842d33d5)
2019-06-25 11:43:26 +00:00
Eric Anholt
0741463bb4 freedreno: Fix UBO load range detection on booleans.
NIR 1-bit bool dests will have a bit size of 1, and thus a calculated
"bytes" of 0.  load_ubo is always loading from dwords in the source.

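The truncation can be sketched as follows, assuming the byte count was derived by integer-dividing the destination bit size by 8 (the exact expression in the driver is not shown in this commit). Both function names are hypothetical.

```python
def load_bytes_from_bit_size(num_components: int, bit_size: int) -> int:
    """Byte count derived from the destination bit size (assumed old logic)."""
    return num_components * bit_size // 8


def load_bytes_as_dwords(num_components: int) -> int:
    """Byte count when every component is treated as a 32-bit dword."""
    return num_components * 4


bool_bytes_old = load_bytes_from_bit_size(1, 1)  # 1-bit bool truncates to zero
bool_bytes_new = load_bytes_as_dwords(1)         # one dword is actually read
```

A zero-byte range would make the load look as if it touched nothing, which is why the range detection has to count dwords in the source.
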
Fixes: 893425a607 ("freedreno/ir3: Push UBOs to constant file")
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 5e7c96b95d)
2019-06-25 11:41:33 +00:00
Juan A. Suarez Romero
d54dc24d6d docs: add sha256 checksums for 19.1.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-25 12:56:10 +02:00
Juan A. Suarez Romero
22eddd8b9d docs: add release notes for 19.1.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-25 12:43:49 +02:00
Juan A. Suarez Romero
118c300536 Update version to 19.1.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-25 10:29:55 +00:00
Eric Engestrom
ebd90fc7e0 util/os_file: resize buffer to what was actually needed
Fixes: 316964709e "util: add os_read_file() helper"
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 955c63d364)
2019-06-21 07:39:06 +00:00
Kenneth Graunke
25a34df614 iris: Fix iris_flush_and_dirty_history to actually dirty history.
When I split iris_flush_and_dirty_history into two helper functions,
I accidentally made it stop dirtying.  Which was...sort of the point.

Fixes: 21688a306b iris: Split iris_flush_and_dirty_for_history into two helpers.
(cherry picked from commit 64fb20ed32)
[Juan A. Suarez: resolved trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/drivers/iris/iris_resource.c
2019-06-21 09:36:09 +02:00
Eric Engestrom
c36e4bd7fa glx: fix glvnd pointer types
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110709
Fixes: 22a9e00aab ("glx: Implement the libglvnd interface.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 65b016b146)
2019-06-21 07:31:50 +00:00
Samuel Pitoiset
14d7fc09cc radv: disable viewport clamping even if FS doesn't write Z
This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 0a313cc285)
2019-06-21 07:28:35 +00:00
Bas Nieuwenhuizen
927ca86698 meson: Allow building radeonsi with just the android platform.
Just as was allowed by autotools.

Fixes: 108d257a16 "meson: build libEGL"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit d1c04835ab)
2019-06-20 08:40:39 +00:00
Bas Nieuwenhuizen
867223cee1 anv: Fix vulkan build in meson.
Apparently the android part was never ported to meson.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 755c633b8d)
2019-06-20 08:36:39 +00:00
Bas Nieuwenhuizen
a5154fa69c radv: Fix vulkan build in meson.
Apparently the android part was never ported to meson.

CC: <mesa-stable@lists.freedesktop.org>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4c300bd328)
2019-06-20 08:21:01 +00:00
Samuel Pitoiset
3fdf2b9645 radv: fix FMASK expand with SRGB formats
Found while working on DCC for MSAA.

Fixes: 6b976024a8 ("radv: add support for FMASK expand")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a7f75377ab)
2019-06-19 07:26:04 +00:00
Mathias Fröhlich
15f6bb5c6c egl: Don't add hardware device if there is no render node v2.
Do not offer a hardware drm backed egl device if no render node
is available. The current implementation will fail on this
egl device. On top of that, it issues a warning that is actually misleading.
There are finally more error paths that can fail on the way to a
hardware backed egl device. Fixing all of them would kind of require
opening the drm device and see if there is a usable driver associated
with the device. The taken approach avoids a full probe and fixes at
least this kind of problem on kvm virtualization hosts I observe here.

Fixes: dbb4457d98 ("egl: add EGL_EXT_device_drm support")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
(cherry picked from commit 5743a36b2b)
2019-06-19 07:24:37 +00:00
Dave Airlie
72eb587b97 nouveau: fix frees in unsupported IR error paths.
This is pointless in that we won't ever hit those paths in real life,
but coverity complains.

Fixes: f014ae3c7c ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 93ba356544)
2019-06-19 07:22:57 +00:00
Rob Clark
4de4c18841 freedreno/a6xx: un-swap X24S8_UINT
The stencil is actually in the .w component, but we used to use SWAP to
remap the channels.  This doesn't work when tiled/ubwc.

Fixes:
  dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d_array
  dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_cube
  dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d_array
  dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_cube
  dEQP-GLES31.functional.stencil_texturing.misc.base_level
  dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_pot
  dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_npot
  dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
  dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
  dEQP-GLES31.functional.texture.border_clamp.sampler.uint_stencil

Signed-off-by: Rob Clark <robdclark@chromium.org>
(cherry picked from commit 4e72abcd97)
2019-06-18 15:39:57 +00:00
Kenneth Graunke
47f1f4f9e5 glsl: Fix out of bounds read in shader_cache_read_program_metadata
The VaryingNames array has NumVaryings entries.  But BufferStride is
a small array of MAX_FEEDBACK_BUFFERS (4) entries.  Programs with
more than 4 varyings would read out of bounds.

Also, BufferStride is set based on the shader itself, which means that
it's inherently already included in the hash, and doesn't need to be
included again.  At the point when shader_cache_read_program_metadata
is called, the linker hasn't even set those fields yet.  So, just drop
it entirely.

Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test.

Fixes: 6d830940f7 glsl/shader_cache: Allow shader cache usage with transform feedback

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 3c10a2726b)
2019-06-18 09:55:20 +00:00
Jason Ekstrand
db4850c631 anv: Set STATE_BASE_ADDRESS upper bounds on gen7
This should fix floating-point border color on all gen7 HW.  Integer is
still thoroughly busted on gen7 because it doesn't exist on IVB and it's
crazy on HSW.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9672b7044c)
2019-06-18 09:52:51 +00:00
Gert Wollny
1702733645 virgl: Assume sRGB write control for older guest kernels or virglrenderer hosts
When the host virglrenderer is an older version that doesn't check the sRGB write
control feature, or when the guest kernel doesn't support CAPS v2, then the guest
will only report support for GL 2.1 on a GL 3.3 host, even though it was supporting
3.3 with earlier guest mesa versions.

By also checking the host feature check version this regression can be avoided.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110921
Fixes: 2845939d6a
   virgl: Set sRGB write control CAP based on host capabilities

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
(cherry picked from commit 2b87753a84)
2019-06-18 09:51:20 +00:00
Bas Nieuwenhuizen
6f18adff0a radv: Decompress DCC when the image format is not allowed for buffers.
Otherwise the buffer loads/stores in the bufimage meta operations fail.

If we decompress DCC then we can use the "canonical" format compatible
with the not-supported format.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4107590911)
2019-06-18 09:47:32 +00:00
Haihao Xiang
eb1e6e6412 i965: support UYVY for external import only
It is similar to YUYV.

Fixes: 165e704719 ("i965/i915: Add UYVY as the supported format")
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 8ead5bebdb)
2019-06-17 07:42:52 +00:00
Lionel Landwerlin
0f8193cb18 intel/dump: fix segfault when the app hasn't accessed the device
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f80679c8e8)
2019-06-14 09:09:37 +00:00
Eduardo Lima Mitev
efc5518410 freedreno/a5xx: Fix indirect draw max_indices calculation
The number of elements to draw should not be affected by the offset.

A similar fix was submitted for a6xx at 79180a05.

Fixes these dEQP tests on a5xx:

dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500

Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 3fb7b1fd35)
2019-06-14 09:08:45 +00:00
Alejandro Piñeiro
80965709d0 v3d: fix checking twice auf flag
This seems to be a copy-and-paste error; the code should check for auf/muf.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110902
Fixes: 8f065596d2 "v3d: Add an optimization pass for redundant flags updates."

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 17c2c9cd67)
2019-06-14 09:06:36 +00:00
Bas Nieuwenhuizen
746025fd63 radv: Skip transitions coming from external queue.
Transitions to external queue should do the transition & make sure
it works on all queues.

Fixes: 8ebc7dcb59 "radv: Allow fast clears with concurrent queue mask for some layouts."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 0667c1f14b)
2019-06-14 09:05:30 +00:00
Kevin Strasser
a48ef364e1 st/mesa: Add rgbx handling for fp formats
Add missing cases for fp32 and fp16 formats.

Fixes: c68334ffc0 "st/mesa: add floating point formats in st_new_renderbuffer_fb()"
Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 845ec8576a)
2019-06-14 09:03:49 +00:00
Kevin Strasser
be69033241 gallium/winsys/kms: Fix dumb buffer bpp
The bpp in the dumb buffer creation request is hardcoded to 32, which is an
incorrect assumption as the caller is free to pick any pipe format. Use the
bpp supplied to us through util_format_get_blocksizebits().

Fixes: 3b176c441b "gallium: Add a dumb drm/kms winsys backed swrast provider"
Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit ec0a68e50d)
2019-06-14 09:02:14 +00:00
Eric Engestrom
582b691062 util/futex: fix dangling pointer use
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110901
Fixes: 7dc2f47882 "util: emulate futex on FreeBSD using umtx"
Cc: Greg V <greg@unrelenting.technology>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 9996ddbb27)
2019-06-14 08:58:54 +00:00
Samuel Pitoiset
7e0b89caa9 radv: fix VK_EXT_memory_budget if one heap isn't available
When the visible VRAM size is equal to the VRAM size only two
heaps are exposed.

This fixes dEQP-VK.api.info.device.memory_budget.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d378151246)
2019-06-14 08:57:10 +00:00
Samuel Pitoiset
90291b5db1 radv: fix occlusion queries on VegaM
The number of render backends is 16 but the enabled mask is 0xaaaa.

As noticed by Bas, allowing disabled render backends might break
the OCCLUSION_QUERY packet. We don't use it yet but keep this in
mind.

This fixes dEQP-VK.query_pool.* and dEQP-VK.multiview.*.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 2ef9d2738c)
2019-06-14 08:55:38 +00:00
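The sparse mask mentioned in 90291b5db1 (16 render backends, enabled mask 0xaaaa) can be illustrated with a small sketch. This is not the radv code: `count_enabled_rbs` and `rb_is_enabled` are hypothetical helpers that only show why a backend count and an enabled mask must not be treated as interchangeable.

```c
#include <stdint.h>

/* Hypothetical helper (not from radv): count the render backends that
 * are actually enabled, instead of assuming the low N bits are set. */
static unsigned count_enabled_rbs(uint32_t enabled_rb_mask)
{
   unsigned count = 0;
   while (enabled_rb_mask) {
      enabled_rb_mask &= enabled_rb_mask - 1; /* clear lowest set bit */
      count++;
   }
   return count;
}

/* Hypothetical helper: is render backend `rb` enabled in the mask? */
static int rb_is_enabled(uint32_t enabled_rb_mask, unsigned rb)
{
   return rb < 32 && ((enabled_rb_mask >> rb) & 1u);
}
```

With the mask 0xaaaa only every second backend is enabled, so code that walks backends 0..N-1 contiguously would touch disabled ones.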
Lionel Landwerlin
94e2228496 anv: do not parse genxml data without INTEL_DEBUG=bat
This significantly slows down the CTS runs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 32ffd90002 ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 93b93e5a9d)
2019-06-14 08:54:08 +00:00
Richard Thier
5eccd8fa5a r300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added
v1: Fix skipped slab allocators and the buffer cache.

v2: Use only 1 domain for texture allocation

v3: Added flag for the create_fence call too

Based on Marek v1 and v2 proposed fixes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=1107812.patch

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ffd2f948fe)
2019-06-14 08:52:40 +00:00
Juan A. Suarez Romero
2a5b4e2b9f docs: Add SHA256 sums for 19.1.0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-11 15:25:40 +00:00
Juan A. Suarez Romero
1517811f4f docs: Add release notes for 19.1.0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-11 17:07:39 +02:00
Juan A. Suarez Romero
0d2ea312b7 Update version to 19.1.0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-11 16:22:23 +02:00
Bas Nieuwenhuizen
49c17e845a radv: Prevent out of bound shift on 32-bit builds.
uintptr_t is 32-bits then and shifting it by 32 bits results in undefined
behavior IIRC.

Fixes: b3c8de1c55 "radv: save all descriptor pointers into the trace BO"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 39c71e0025)
2019-06-11 08:11:24 +00:00
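The undefined shift described in 49c17e845a can be sketched as follows. This is a minimal illustration, not the radv change itself: `ptr_lo32` and `ptr_hi32` are hypothetical names. Widening to `uint64_t` before shifting keeps the shift count below the operand width on both 32-bit and 64-bit builds.

```c
#include <stdint.h>

/* Hypothetical helpers (not from radv): split a pointer-sized value
 * into two 32-bit words. Shifting a 32-bit uintptr_t by 32 is undefined
 * behavior in C, so widen to 64 bits before shifting. */
static inline uint32_t ptr_lo32(uintptr_t p)
{
   return (uint32_t)((uint64_t)p & 0xffffffffu);
}

static inline uint32_t ptr_hi32(uintptr_t p)
{
   return (uint32_t)((uint64_t)p >> 32);
}
```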
Samuel Pitoiset
d058124201 radv: fix setting CB_SHADER_MASK for dual source blending
CB_SHADER_MASK was computed without the second color buffer
format which looks totally wrong to me.

While we are at it, copy a comment from RadeonSI.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit e9316fdfd4)
2019-06-11 08:05:33 +00:00
Emil Velikov
d4797ff15e mapi: correctly handle the full offset table
Earlier commit converted ES1 and ES2 to a new, much simpler, dispatch
generator. At the same time, GL/glapi and the driver side are still
using the old code.

There is a hidden ABI between GL*.so and glapi.so, former referencing
entry-points by offset in the _glapi_table. Hence earlier commit added
the full table of entry-points, alongside a marker for other cases like
indirect GL(X) and driver-size remapping.

Yet the patches did not handle things fully, thus it was possible to
get different interpretations of the dispatch table after the marker.

This commit fixes that, adding an indicative error message to catch
future bugs.

While here correct the marker (MAX_OFFSETS) comment.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302
Fixes: cf317bf093 ("mapi: add all _glapi_table entrypoints tostatic_data.py")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a379b1c0ee)
2019-06-11 08:01:57 +00:00
Emil Velikov
eb532d1ae7 mapi: add static_date offset to MaxShaderCompilerThreadsKHR
As elaborated in the next patch, there is some hidden ABI that
effectively requires most entrypoints to be listed in the file.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302
Cc: Marek Olšák <maraeo@gmail.com>
Fixes: c5c38e831e ("mesa: implement ARB/KHR_parallel_shader_compile")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 61960547df)
2019-06-11 07:59:38 +00:00
Samuel Pitoiset
a7a2d403fd radv: fix alpha-to-coverage when there is unused color attachments
When alphaToCoverage is enabled, we should always write the alpha
channel of MRT0 if it's unused. This now matches RadeonSI.

This fixes the new CTS:
dEQP-VK.pipeline.multisample.alpha_to_coverage_unused_attachment.samples_*.alpha_invisible

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 91aa25f462)
2019-06-11 07:58:06 +00:00
Kenneth Graunke
84bd361217 egl/x11: calloc dri2_surf so it's properly zeroed
Commit 2282ec0a refactored drawable creation across various platforms
into a new dri2_create_drawable helper function.

The GBM code in platform_drm.c code passed in dri2_surf->gbm_surf as the
loaderPrivate, while most other backends passed in dri2_surf directly.

To try and handle this, the patch checked if dri2_surf->gbm_surf was
non-NULL, and if so, presumed that the caller is the DRM platform and
we should use the dri2_surf->gbm_surf pointer.

This worked for most platforms, which calloc their dri2_surf structure,
zeroing the data.  Unfortunately, platform_x11.c used malloc, leaving
most of the dri2_surf as garbage.  In particular, dri2_surf->gbm_surf
was often non-NULL, causing dri2_create_drawable to try and use it,
passing a garbage pointer to the createNewDrawable hook, usually leading
to a SIGBUS or SIGSEGV when trying to dereference that bad pointer.

Since most callers calloc the data, make platform_x11.c follow suit.

Fixes crashes with i915_dri.so when running dEQP-GLES2.

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4e3297f7d4)
2019-06-09 16:52:59 +00:00
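The malloc/calloc difference behind 84bd361217 can be sketched with a simplified stand-in. The struct and function names below are hypothetical and much smaller than the real dri2_egl_surface; only the NULL check on gbm_surf follows the description in the commit message.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the real surface struct. */
struct fake_dri2_surf {
   void *gbm_surf; /* only set by the DRM/GBM platform */
   int width, height;
};

/* Mirrors the check described in the commit message: use gbm_surf when
 * it is non-NULL, otherwise pass the surface itself as loaderPrivate. */
static void *pick_loader_private(struct fake_dri2_surf *surf)
{
   return surf->gbm_surf ? surf->gbm_surf : (void *)surf;
}

/* calloc zeroes the allocation, so gbm_surf starts out NULL. With
 * malloc the field would hold indeterminate bytes and the check above
 * could pick a garbage pointer. */
static struct fake_dri2_surf *create_surf(void)
{
   return calloc(1, sizeof(struct fake_dri2_surf));
}
```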
Eric Engestrom
c025240f6c util/os_file: actually return the error read() gave us
Fixes: 316964709e "util: add os_read_file() helper"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 7e35f20d44)
2019-06-09 16:51:36 +00:00
Rob Clark
3301eeee51 freedreno/a6xx: fix hangs with newer sqe fw
With the newer (v1.76) fw, we were getting hangs (compared to older
v1.66 fw).  Re-work the GMEM code to structure things a bit closer to
the blob.  This moves some PKT7 packets from IB2 to IB1, which I think
is what was confusing SQE and causing it to get stuck in an infinite
loop.  But in general structuring things at least closer to the same way
blob does makes it easier to compare cmdstream.

Note: this is a bit on the large side for what I'd normally consider for
stable.. but right now it is looking like it is the newer fw that is
headed for linux-firmware.  This should defn have some soak time on
master, but probably a good idea for this patch to end up in distro mesa
builds by the time a630_sqe.fw hits linux-firmware.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
(cherry picked from commit 958f6ffb60)
2019-06-09 16:50:03 +00:00
Rob Clark
9f71165a1b freedreno/a6xx: fix issues with gallium HUD
In some cases the draw for the text wasn't working.  This seems to be
fixed by resyncing some of the "golden registers" from blob (initial
values were based on somewhat older blob version).

Perhaps good to have a bit of soak time on master, but would be good
to eventually land in 19.x stable branches.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
(cherry picked from commit b820c09fa8)
2019-06-09 16:49:12 +00:00
Nanley Chery
7ca66dc06b anv/cmd_buffer: Initialize the clear color struct for CNL+
On CNL+, the clear color struct is composed of RGBA channel values and
fields which are either reserved by the HW or used to control
fast-clears. Currently anv initializes the channel values to zero and
allows the other fields to be undefined.

Satisfy the MBZ field requirements by removing an optimization that
doesn't hold true for CNL+ and pulling in the number of dwords to
initialize from ISL.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit b4198e792c)
2019-06-09 16:47:13 +00:00
Charmaine Lee
6f44b7ebb0 svga: Remove unnecessary check for the pre flush bit for setting vertex buffers
This fixes the missing rebind when the can_pre_flush bit
is not set and the vertex buffers are the same as what has been sent.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit f29b8fde91)
2019-06-09 16:46:05 +00:00
Deepak Rawat
28b72f5187 winsys/svga/drm: Fix 32-bit RPCI send message
Depending on whether compiled with frame-pointer or not, the temporary
memory location used for the bp parameter in these macros is referenced
relative to the stack pointer or the frame pointer.
Hence we can never reference that parameter when we've modified either
the stack pointer or the frame pointer, because then the compiler would
generate an incorrect stack reference.

Fix this by pushing the temporary memory parameter on a known location on
the stack before modifying the stack- and frame pointers.

Also, in case of failure the RPCI channel is not closed, which leads to
vmx running out of channels.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 72fc886826)
2019-06-09 16:44:49 +00:00
Nataraj Deshpande
147d6693be anv: Fix check for isl_fmt in assert
Checking the returned isl_fmt value in the assert seems more appropriate
than checking the format variable.

Fixes: f1654fa7e3 "anv/android: support creating images from external format"
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
(cherry picked from commit d6724471a5)
2019-06-06 09:41:56 +00:00
Jason Ekstrand
1f40ef24cc nir/propagate_invariant: Don't add NULL vars to the hash table
Fixes: 8410cf66d "nir/propagate_invariant: Skip unknown vars"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d96878a66a)
2019-06-06 09:37:29 +00:00
Lionel Landwerlin
90623adb16 intel/perf: improve dynamic loading config detection
We're currently trying to detect dynamic loading config support by
trying to remove the test config (hard coded in the i915 driver) and
checking we get ENOENT.

This can fail if the test config was updated in Mesa but not yet in
i915.

A better way to do this is to pick an invalid ID and check for ENOENT.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c162127440)
2019-06-06 09:34:23 +00:00
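The probing idea from 90623adb16 can be sketched without touching the kernel interface. Everything here is hypothetical: the callback type stands in for whatever call removes a config, the choice of UINT64_MAX as "invalid ID" is an assumption, and the two fake callbacks only exist to exercise the sketch (the EINVAL result for a kernel without support is also an assumption).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback standing in for the "remove config" request. */
typedef int (*remove_config_fn)(uint64_t config_id);

/* Probe with an ID assumed never to be valid: ENOENT means the request
 * was understood and the ID simply does not exist. */
static bool supports_dynamic_configs(remove_config_fn remove_config)
{
   const uint64_t invalid_id = UINT64_MAX;
   errno = 0;
   return remove_config(invalid_id) == -1 && errno == ENOENT;
}

/* Stand-ins used only to exercise the sketch. */
static int fake_kernel_with_support(uint64_t id)
{
   (void)id;
   errno = ENOENT;
   return -1;
}

static int fake_kernel_without_support(uint64_t id)
{
   (void)id;
   errno = EINVAL;
   return -1;
}
```

Probing with an ID that cannot exist avoids depending on the test config being identical in Mesa and i915.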
Lionel Landwerlin
971eeb93e6 intel/perf: fix EuThreadsCount value in performance equations
EuThreadsCount is supposed to be the number of threads per EU, not the
total number of threads in the whole device.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1fc7b95127 ("i965: Add Gen8+ INTEL_performance_query support")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0430c6d18a)
2019-06-06 08:46:17 +00:00
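The distinction drawn in 971eeb93e6 is simple arithmetic, sketched below. The struct and its field names are hypothetical and the numbers used with it are illustrative, not taken from any real device.

```c
/* Hypothetical device description; field names are illustrative. */
struct fake_devinfo {
   unsigned num_eus;        /* execution units in the whole device */
   unsigned threads_per_eu; /* hardware threads in one EU */
};

/* Value the equations expect according to the commit message. */
static unsigned eu_threads_count(const struct fake_devinfo *d)
{
   return d->threads_per_eu;
}

/* Device-wide total, which is a different (larger) quantity. */
static unsigned total_threads(const struct fake_devinfo *d)
{
   return d->num_eus * d->threads_per_eu;
}
```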
Deepak Rawat
a5c864f6f8 winsys/drm: Fix out of scope variable usage
In this particular instance, a struct member was used outside of the
block where it was defined. Fix this by moving the definition outside of
the block.

Signed-off-by: Deepak Rawat <drawat@vmware.com>
Fixes: 569f838987 ("winsys/svga: Add support for new surface ioctl, multisample pattern")
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 828e1b0b4c)
2019-06-06 08:40:58 +00:00
Emil Velikov
626ea69627 egl/dri: flesh out and use dri2_create_drawable()
Wrap the loader->createNewDrawable() dance into a helper and use it
throughout the codebase.

This addresses cases like surfaceless (SL) on swrast (SL on kms_swrast
is fine) where we'd attempt using the wrong driver and crash out.

v2: fixup quirky GBM (Mathias)
v3: fixup GBM for real (Marek)

Cc: mesa-stable@lists.freedesktop.org
Cc: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (v2)
Signed-off-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2282ec0ad6)
2019-06-06 08:25:47 +00:00
Juan A. Suarez Romero
9d8f104f39 Update version to 19.1.0-rc5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-05 16:23:45 +00:00
Vinson Lee
2a45ddd42d freedreno: Fix GCC build error.
../src/freedreno/vulkan/tu_device.c:900:4: error: initializer element is not constant
    .minImageTransferGranularity = (VkExtent3D) { 1, 1, 1 },
    ^

Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110698
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit d4e70be739)
2019-06-05 09:00:53 +00:00
Marek Olšák
96fbd54398 ac: fix a typo in ac_build_wg_scan_bottom
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c9b64b58de)
2019-06-05 08:29:08 +00:00
Rhys Perry
60688cc393 ac/nir: mark some texture intrinsics as convergent
Otherwise LLVM can sink them and their texture coordinate calculations
into divergent branches.

v2: simplify the conditions on which the intrinsic is marked as convergent
v3: only mark as convergent in FS and CS with derivative groups

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 73dda85512)
2019-06-05 08:27:14 +00:00
Samuel Pitoiset
38927a35a6 radv: do not use gfx fast depth clears for layered depth/stencil images
The driver should only do fast depth clears with the graphics path
when the view covers all image layers, otherwise this might
corrupt layers when HTILE is enabled.

Cc: 19.0 19.1 mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 8a35eb0602)
2019-06-04 15:06:46 +00:00
Sagar Ghuge
cf6472e780 intel/compiler: Fix assertions in brw_alu3
v2: Fix assertion for src1 (Ian Romanick)

Fixes: 3b967e17 (intel/compiler: Avoid false positive assertions)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3016756398)
2019-06-04 15:06:46 +00:00
Pierre-Eric Pelloux-Prayer
5394f1578c radeonsi: init sctx->dma_copy before using it
Commit a1378639ab reordered context functions initializations but broke
sctx->b.resource_copy_region init when using AMD_DEBUG=forcedma.

In this case sctx->dma_copy was assigned a value after being used in:
   sctx->b.resource_copy_region = sctx->dma_copy;

This commit moves the FORCE_DMA special case after sctx->dma_copy initialization.

See https://bugs.freedesktop.org/show_bug.cgi?id=110422

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4583f09caa)
2019-06-04 15:06:46 +00:00
Timothy Arceri
51998d720b st/glsl: make sure to propagate initialisers to driver storage
This essentially reverts 20234cfe3a.

Fixes piglit test:
tests/spec/arb_get_program_binary/execution/uniform-after-restore.shader_test

Fixes: 20234cfe3a "st/mesa: don't propagate uniforms when restoring from cache"

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110784
(cherry picked from commit fea36a8f43)
2019-06-04 15:06:46 +00:00
Axel Davy
8773e20238 d3dadapter9: Revert to old throttling limit value
Recently PIPE_CAP_MAX_FRAMES_IN_FLIGHT was changed from 2
to 1:
20909284f2

No driver seems to overwrite the default value.

One user reports severe regressions for some games.
For now, revert to the value 2 for nine.

Cc: "19.1" mesa-stable@lists.freedesktop.org

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
(cherry picked from commit 5820ac6756)
2019-06-04 15:06:46 +00:00
Marek Olšák
4524f09cc0 u_blitter: don't fail mipmap generation for depth formats containing stencil
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109754

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 4b11ed443b)
2019-06-04 15:06:46 +00:00
Rob Clark
3fce389c8b freedreno/a6xx: fix GPU crash on small render targets
Fixes dEQP-GLES2.functional.multisampled_render_to_texture.readpixels

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8eaa2d5021)
2019-06-04 15:06:46 +00:00
Rob Clark
a37f10af7b freedreno/ir3: set more barrier bits
Blob is also setting the .l bit, and it seems to solve some intermittent
failures with a couple of deqp's:

dEQP-GLES31.functional.image_load_store.2d.qualifiers.coherent_r32i
dEQP-GLES31.functional.image_load_store.2d.qualifiers.volatile_r32f

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit f9fa456e1d)
2019-06-04 15:06:46 +00:00
Jonathan Marek
90d045f993 freedreno/ir3: fix input ncomp for vertex shaders
ncomp is never set for vertex shaders, but a3xx and a4xx still use it.

Fixes: 831f1a05c0 freedreno/ir3: rework varying packing

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
(cherry picked from commit 1db86d8b62)
2019-06-03 08:20:25 +00:00
Bas Nieuwenhuizen
b2c5c16668 nir: Actually propagate progress in nir_opt_move_load_ubo.
Found with Jason's new metadata rework (https://gitlab.freedesktop.org/mesa/mesa/merge_requests/950).

Fixes: af355aaa07 "nir: add nir_opt_move_load_ubo() optimization pass"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e24a7840f6)
2019-06-03 08:15:53 +00:00
Jan Zielinski
fecdcce09c swr/rast: fix 32-bit compilation on Linux
Removing unused but problematic code from simdlib header to fix
compilation problem on 32-bit Linux.

Reviewed-by: Alok Hota <alok.hota@intel.com>
(cherry picked from commit cf673747ce)
2019-05-31 17:03:55 +02:00
Jason Ekstrand
a13bda4957 nir/dead_cf: Call instructions aren't dead
When we inlined cf_node_has_side_effects into node_is_dead, all the
conditions flipped and we forgot to flip one.  Fortunately, it doesn't
matter right now because no one uses this pass on shaders with more than
one function.

Fixes: b50465d197 "nir/dead_cf: Inline cf_node_has_side_effects"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 8948048c6f)
2019-05-31 08:15:31 +00:00
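The inverted condition described in a13bda4957 can be sketched in isolation. The enum and function below are hypothetical and far simpler than the NIR pass; they only show that once "has side effects" is folded into "is dead", every test has to return the opposite value, including the one for call instructions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction kinds, not the NIR enum. */
enum fake_instr_type { FAKE_INSTR_ALU, FAKE_INSTR_CALL };

/* A node is dead only if none of its instructions has side effects.
 * A call may have side effects, so it must make the node live. */
static bool node_is_dead(const enum fake_instr_type *instrs, size_t n)
{
   for (size_t i = 0; i < n; i++) {
      if (instrs[i] == FAKE_INSTR_CALL)
         return false; /* "has side effects" becomes "not dead" */
   }
   return true;
}
```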
Jason Ekstrand
c2a945771c intel/fs: Do a stalling MFENCE in endInvocationInterlock()
Fixes: 939312702e "i965: Add ARB_fragment_shader_interlock support"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9e403dc56e)
2019-05-31 08:13:44 +00:00
Jason Ekstrand
92f4a16af8 intel/fs,vec4: Use g0 as the header for MFENCE
We set header_present but then pass it some random garbage.  Give it g0
instead.  I'm not actually sure this does anything but g0 is the usual
header data and this is what the windows driver does so it seems like a
good idea.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 859de4a748)
2019-05-31 08:11:35 +00:00
Jason Ekstrand
a19270007c iris: Don't assume UBO indices are constant
It will be true for the constant/system value buffer because they use a
constant zero but it's not true in general.  If we ever got here when
the source wasn't constant, nir_src_as_uint would assert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9dc57eebd5)
2019-05-30 09:06:28 +00:00
Lionel Landwerlin
4c7dfaba9c nir/lower_non_uniform: safely iterate over blocks
This fixes a problem where the same instruction gets replaced twice.
This was happening when the replaced instruction would be at the end
of a block.

Replacement of :

   if ssa_8 {
                ....
      intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
   }

Would be :

   if ssa_8 {
      loop {
         vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) ()
         vec1 1 ssa_48 = ieq ssa_47, ssa_44
         if ssa_48 {
            loop {
               vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) ()
               vec1 1 ssa_50 = ieq ssa_49, ssa_44
               if ssa_50 {
                  intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
                  break
               } else {
        ....
   }

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bd5457641 ("nir: Add a lowering pass for non-uniform resource access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 366811bedb)
2019-05-30 09:01:40 +00:00
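The double replacement described in 4c7dfaba9c is a general hazard of modifying a list while walking it. The sketch below uses a hypothetical singly linked list, not NIR blocks: the next pointer is captured before the current element is rewritten, so the wrapper inserted behind it is not visited and lowered again.

```c
#include <stddef.h>

/* Hypothetical list element standing in for a block. */
struct fake_block {
   struct fake_block *next;
   int needs_lowering;
};

/* Lower every block that needs it. `pool` supplies spare wrapper
 * blocks; the caller must provide at least as many as there are
 * blocks in the list. Returns the number of blocks lowered. */
static unsigned lower_all_safe(struct fake_block *head,
                               struct fake_block *pool)
{
   unsigned lowered = 0;
   for (struct fake_block *b = head, *next; b != NULL; b = next) {
      next = b->next; /* capture before the list changes */
      if (b->needs_lowering) {
         struct fake_block *wrapper = &pool[lowered++];
         wrapper->needs_lowering = 1; /* the instruction moved here */
         wrapper->next = b->next;
         b->next = wrapper;
         b->needs_lowering = 0;
      }
   }
   return lowered;
}
```

Advancing with `b = b->next` instead would step into the freshly inserted wrapper, which still matches, and lower the same instruction a second time.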
Samuel Pitoiset
411114c45c radv: allocate more space in the CS when emitting events
If the driver waits for CP DMA to be idle and emits an EOP event,
we need more space.

This fixes a crash with Quake Champions.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 47a10edefb)
2019-05-30 09:00:31 +00:00
Juan A. Suarez Romero
dd9635c1d2 Update version to 19.1.0-rc4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-05-29 16:44:45 +02:00
Timothy Arceri
0dcba748f9 Revert "st/mesa: expose 0 shader binary formats for compat profiles for Qt"
This reverts commit 55376cb31e.

It's been over a year and both QT 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 11e16ca7ce)
2019-05-28 07:13:40 +00:00
Lionel Landwerlin
fe7c45b97e anv: fix apply_pipeline_layout pass for arrays of YCbCr descriptors
When using the binding tables to access arrays of YCbCr descriptors we
did not consider the offset of the accessed element. We can't do a
simple multiplication because the binding table entries are tightly packed.

For example element 0 of the array could use 2 entries/planes and
element 1 could use 2 entries/planes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bb8768b9d ("anv: toggle on support for VK_EXT_ycbcr_image_arrays")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 2042f22e28)
2019-05-28 07:12:43 +00:00
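The offset problem from fe7c45b97e can be sketched as a prefix sum. The function name and the plane counts used with it are hypothetical; the point is only that with tightly packed entries, the first entry of an array element depends on the plane counts of all earlier elements rather than on a fixed stride.

```c
#include <stddef.h>

/* Hypothetical helper: index of the first binding table entry used by
 * array element `elem`, given how many entries (planes) each element
 * occupies. Entries are assumed to be tightly packed. */
static unsigned element_table_offset(const unsigned *planes_per_element,
                                     size_t elem)
{
   unsigned offset = 0;
   for (size_t i = 0; i < elem; i++)
      offset += planes_per_element[i];
   return offset;
}
```

For illustrative plane counts {2, 1, 3}, element 2 starts at entry 3, whereas multiplying the index by a per-element plane count would not give that result.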
Chenglei Ren
16eac8f754 anv/android: fix missing dependencies issue during parallel build
The libmesa_anv_gen* modules require anv_extensions.h, patch makes sure
it gets generated as a dependency before building them.

Signed-off-by: Chenglei Ren <chenglei.ren@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 13b38ca1e4)
2019-05-28 07:11:10 +00:00
Qiang Yu
4b3c805b88 lima: fix render to non-zero level texture
The current implementation does not respect the level of the surface
being rendered to.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
(cherry picked from commit 54490b0b36)
2019-05-28 07:10:04 +00:00
Qiang Yu
87ac0bd86a lima: fix lima_blit with non-zero level source resource
lima_blit blits between resources at different levels.
When blitting from a source with level != 0, it samples from that
level of the resource as a texture.

The current texture setup does not respect the level when no mipmap
filter is used.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
(cherry picked from commit 1dc593e9b9)
2019-05-28 07:09:05 +00:00
Dave Airlie
74c5367612 Revert "mesa: unreference current winsys buffers when unbinding winsys buffers"
This reverts commit 12bf7cfecf.

This commit caused lots of problems:
https://bugs.freedesktop.org/show_bug.cgi?id=110721
https://bugs.freedesktop.org/show_bug.cgi?id=110761

Fixes: 12bf7cfecf ("mesa: unreference current winsys buffers when unbinding winsys buffers")
Pushing without review as we need to get it into next stable.

(cherry picked from commit 7fe5a8e874)
2019-05-27 08:31:05 +00:00
Christian Gmeiner
95ffe6323e etnaviv: use the correct uniform dirty bits
Found during code inspection.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 78fb5594be)
2019-05-27 08:28:37 +00:00
Danylo Piliaiev
03fd344776 anv: Do not emulate texture swizzle for INPUT_ATTACHMENT, STORAGE_IMAGE
If descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE
or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT, the imageView member of each
element of pImageInfo must have been created with the identity swizzle.

Fixes: d2aa65eb

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit c82dcf89ae)
2019-05-27 08:26:47 +00:00
Lionel Landwerlin
9037cf26bb vulkan: fix build dependency issue with generated files
On machines with many cores, you can run into this issue:

../mesa-9999/src/vulkan/overlay-layer/overlay.cpp:42:10: fatal error: vk_enum_to_str.h: No such file or directory

v2: Move declare_dependency around (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jan Ziak
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit cb7c9b2a93)
2019-05-23 08:57:26 +00:00
Greg V
b02c6e8ee7 gallium: enable dmabuf on BSD as well
The DRM_CONF_SHARE_FD code did not check for Linux, so the commit that
introduced PIPE_CAP_DMABUF broke Wayland-EGL clients on FreeBSD.

Fixes: 8ae50e60 (gallium: replace DRM_CONF_SHARE_FD with PIPE_CAP_DMABUF)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 506ebf55c0)
2019-05-23 08:56:14 +00:00
Philipp Zabel
e13c13f54c etnaviv: fill missing offset in etna_resource_get_handle
Without this gbm_bo_get_offset() can return 0 where it shouldn't.

Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1ccb8a071b)
2019-05-23 08:53:19 +00:00
Marek Olšák
60d524fd39 radeonsi: fix a regression in si_rebind_buffer
Don't update non-buffer images.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110701
Fixes: 78e35df52a "radeonsi: update buffer descriptors in all contexts after buffer invalidation"

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit d6053bf2a1)
2019-05-23 08:51:16 +00:00
Lionel Landwerlin
ce2d68aace vulkan/overlay: fix timestamp query emission with no pipeline stats
The
   if (!pipe && timestamp)

logic was broken. It should have been :

   if (!pipe && !timestamp)

Let's just drop this condition as the following code does the right
thing for all cases.

An error was appearing with the following variables :

VK_INSTANCE_LAYERS=VK_LAYER_MESA_overlay VK_LAYER_MESA_OVERLAY_CONFIG=gpu_timing

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ea7a6fa980 ("vulkan/overlay: add pipeline statistic & timestamps support")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 213d6527d4)
2019-05-23 08:50:11 +00:00
Marek Olšák
c1d83ae9fb radeonsi: update buffer descriptors in all contexts after buffer invalidation
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 78e35df52a)
[Juan: resolve trivial conflicts]
[Juan: remove the commit from the ignored cherry-pick]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_draw.c
2019-05-23 08:48:21 +00:00
Juan A. Suarez Romero
1dd62eb6e2 Update version to 19.1.0-rc3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-05-21 14:09:14 +00:00
Caio Marcelo de Oliveira Filho
ab75e1e289 nir: Fix clone of nir_variable state slots
When num_state_slots is 0, don't create the array.  This was
triggering the following assert when running vkcube with
NIR_TEST_CLONE=1

    vkcube: ../src/compiler/nir/nir_split_per_member_structs.c:66:
    split_variable: Assertion `var->state_slots == NULL' failed.

Fixes: 9fbd390dd4 "nir: Add support for cloning shaders"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 005cc9ae37)
2019-05-21 09:04:42 +00:00
Charmaine Lee
2153c3ae8e mesa: unreference current winsys buffers when unbinding winsys buffers
This fixes surface leak when no winsys buffers are bound.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 12bf7cfecf)
2019-05-21 09:02:06 +00:00
Charmaine Lee
04e9d7bf8f st/mesa: purge framebuffers with current context after unbinding winsys buffers
With commit c89e8470e5, framebuffers are purged after unbinding context,
but this change also introduces a heap corruption when running the Rhino
application on the VMware svga device. Instead of purging the framebuffers
after the context is unbound, this patch first unbinds the winsys buffers,
then purges the framebuffers with the current context, and then finally
unbinds the context.

This fixes heap corruption.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit b480adfa5e)
2019-05-21 08:59:28 +00:00
Juan A. Suarez Romero
857210b0dd cherry-ignore: radeonsi: update buffer descriptors in all contexts after buffer invalidation
stable: this commit causes issues in several systems.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-05-21 08:54:23 +00:00
Eric Engestrom
6bac1a041d meson: expose glapi through osmesa
Suggested-by: Pierre Guillou <pierre.guillou@lip6.fr>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109659
Fixes: f121a669c7 "meson: build gallium based osmesa"
Fixes: cbbd5bb889 "meson: build classic osmesa"
Cc: Brian Paul <brianp@vmware.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit ccb8ea7acf)
2019-05-21 08:42:32 +00:00
Jason Ekstrand
2040f10cb0 anv: Only consider minSampleShading when sampleShadingEnable is set
From the Vulkan 1.1.107 spec:

    Sample shading is enabled for a graphics pipeline:

      - If the interface of the fragment shader entry point of the
        graphics pipeline includes an input variable decorated with
        SampleId or SamplePosition. In this case minSampleShadingFactor
        takes the value 1.0.

      - Else if the sampleShadingEnable member of the
        VkPipelineMultisampleStateCreateInfo structure specified when
        creating the graphics pipeline is set to VK_TRUE. In this case
        minSampleShadingFactor takes the value of
        VkPipelineMultisampleStateCreateInfo::minSampleShading.

    Otherwise, sample shading is considered disabled.

In other words, if sampleShadingEnable is set to VK_FALSE, we should
ignore minSampleShading.
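
The quoted spec text can be read as a small decision function. The sketch
below only mirrors that text; the struct and function names are illustrative
and are not taken from anv:

```c
#include <stdbool.h>

/* Hypothetical result type: whether sample shading is on, and its factor. */
struct sample_shading {
    bool  enabled;
    float min_factor;        /* only meaningful when enabled is true */
};

static struct sample_shading
resolve_sample_shading(bool shader_uses_sample_id_or_position,
                       bool sample_shading_enable,
                       float min_sample_shading)
{
    struct sample_shading r = { false, 0.0f };

    if (shader_uses_sample_id_or_position) {
        /* First case in the spec: the factor takes the value 1.0. */
        r.enabled = true;
        r.min_factor = 1.0f;
    } else if (sample_shading_enable) {
        /* Second case: the factor comes from minSampleShading. */
        r.enabled = true;
        r.min_factor = min_sample_shading;
    }
    /* Otherwise disabled: min_sample_shading is ignored entirely. */
    return r;
}
```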

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 1c92358bd8)
2019-05-21 08:42:32 +00:00
Jason Ekstrand
260f517d54 anv: Stop forcing bindless for images
This was an unintended artifact of my testing of bindless images.  We
should be choosing bindless or not dynamically.

Fixes: c0d9926df7 "anv: Use bindless handles for images"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 8413fd136c)
2019-05-21 08:42:32 +00:00
Neha Bhende
b6778c9f52 draw: fix memory leak introduced 7720ce32a
We need to free the PrimitiveOffsets allocation in draw_gs_destroy().
This fixes a memory leak found while running piglit on Windows.

Fixes: 7720ce32a ("draw: add support to tgsi paths for geometry streams. (v2)")

Tested with piglit

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 926a6a35cf)
2019-05-21 08:42:32 +00:00
Jason Ekstrand
5d05324e65 anv: Emulate texture swizzle in the shader when needed
Now that we have the descriptor buffer mechanism, emulated texture
swizzle can be implemented in a very non-invasive way.  Previous
attempts all tried to extend the push constant based image param
mechanism which was gross.  This could, in theory, be done much faster
with a magic back-end instruction which does indirect MOVs but Vulkan on
IVB is already so slow this isn't going to matter much.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104355
Cc: "19.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d2aa65eb18)
2019-05-21 08:42:32 +00:00
Samuel Pitoiset
8dbdeb27f3 radv: add a workaround for Monster Hunter World and LLVM 7&8
The load/store optimizer pass doesn't handle WaW hazards correctly
and this is the root cause of the reflection issue with Monster
Hunter World. AFAIK, it's the only game that is affected by this
issue.

This is fixed with LLVM r361008, but we need a workaround for older
LLVM versions unfortunately.

Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d7501834cd)
2019-05-21 08:42:32 +00:00
Gert Wollny
dab3945ff3 Revert "softpipe/buffer: load only as many components as the the buffer resource type provides"
This reverts commit 865b9ddae4.

The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only
one component would be supported. The original issue is still relevant, but
the fix should be different.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0f598ed7b3)
2019-05-21 08:42:32 +00:00
Dave Airlie
d08fde8e7a glsl: init packed in more constructors.
src/compiler/glsl_types.cpp:577: uninit_member: Non-static class member "packed" is not initialized in this constructor nor in any functions that it calls.

from Coverity.

Fixes: 659f333b3a (glsl: add packed for struct types)

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit b2d4d08a5c)
2019-05-21 08:42:32 +00:00
Nanley Chery
f69eb770cd anv: Fix some depth buffer sampling cases on ICL+
Don't attempt sampling with HiZ if the sampler lacks support for it. On
ICL, the HW docs state that sampling with HiZ is not supported and that
instances of AUX_HIZ in the RENDER_SURFACE_STATE object will be
interpreted as AUX_NONE.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 629806b55b)
2019-05-21 08:42:32 +00:00
Caio Marcelo de Oliveira Filho
5bed00cf0f nir: Fix nir_opt_idiv_const when negatives are involved
First, allow the case for negative powers of two.  Then ensure that we
use the absolute value of the non-constant value to calculate the
quotient -- this was hinted in the code by the name 'uq'.

This fixes an issue when 'd' is positive and 'n' is negative.  The
ishr will propagate the negative sign and we'll use nir_ineg() again,
incorrectly.

v2: First version used only ishr, but that isn't sufficient, since it
    never can produce a zero as a result.  (Jason)
    Allow negative powers of two.  (Caio)
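
To see why the absolute value matters, here is a minimal plain-C sketch (not
the NIR pass itself; idiv_pow2 is a hypothetical helper). On typical two's
complement targets an arithmetic shift of a negative value rounds toward
negative infinity, so -1 >> 1 stays -1 and never reaches 0. Shifting the
magnitude and fixing up the sign afterwards gives the truncating result:

```c
#include <stdint.h>

/* Divide n by d, truncating toward zero.
 * Precondition (assumed, not checked): d != 0 and |d| is a power of two.
 * INT32_MIN / -1 is not representable, just as with the C '/' operator. */
static int32_t idiv_pow2(int32_t n, int32_t d)
{
    /* Compute magnitudes in 64 bits so that negating INT32_MIN is safe. */
    int64_t abs_n = n < 0 ? -(int64_t)n : (int64_t)n;
    int64_t abs_d = d < 0 ? -(int64_t)d : (int64_t)d;

    /* Find log2(|d|). */
    unsigned shift = 0;
    while ((abs_d >> shift) != 1)
        shift++;

    /* Unsigned quotient: shifting a non-negative value truncates. */
    int64_t uq = abs_n >> shift;

    /* Negate exactly once, and only when the operand signs differ. */
    int64_t q = ((n < 0) != (d < 0)) ? -uq : uq;
    return (int32_t)q;
}
```

For example, idiv_pow2(-7, 4) yields -1, matching C's -7 / 4, whereas a bare
arithmetic shift of -7 by 2 would typically give -2.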

Fixes: 74492ebad9 "nir: Add a pass for lowering integer division by constants"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8a995f2b5e)
2019-05-21 08:42:32 +00:00
Marek Olšák
b551be82a7 radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets
This is a prerequisite for the next commit.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0f1b070bad)
2019-05-17 07:41:15 +00:00
Eric Engestrom
7fa89fd959 util/os_file: always use the 'grow' mechanism
Use fstat() only to pre-allocate a big enough buffer.

This fixes a race where if the file grows between fstat() and read()
we would be missing the end of the file, and if the file slims down
read() would just fail.
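
A minimal sketch of the idea, assuming a POSIX system (this is not the
os_read_file() implementation; read_whole_fd is a hypothetical name). The
fstat() size is used only as an initial capacity hint, and the loop keeps
reading and growing until read() reports end of file:

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read everything from fd into a malloc'd, NUL-terminated buffer.
 * Returns NULL on failure; the caller frees the result. */
static char *read_whole_fd(int fd, size_t *out_len)
{
    size_t cap = 4096;
    struct stat st;

    /* Size hint only: the file may grow or shrink before we read it. */
    if (fstat(fd, &st) == 0 && st.st_size > 0)
        cap = (size_t)st.st_size + 1;

    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    size_t len = 0;
    for (;;) {
        /* Always keep one byte spare for the terminator. */
        if (len + 1 >= cap) {
            cap *= 2;
            char *grown = realloc(buf, cap);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
        }

        ssize_t n = read(fd, buf + len, cap - len - 1);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;               /* end of file */
        len += (size_t)n;
    }

    buf[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return buf;
}
```

Because the loop terminates on read() returning 0 rather than on a byte count
taken from fstat(), a file that changes size between the two calls is still
read to its actual end.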

Fixes: 316964709e "util: add os_read_file() helper"
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 22c1657d05)
2019-05-16 17:20:13 +00:00
Lionel Landwerlin
5fcfcdb162 nir: lower_non_uniform_access: iterate over instructions safely
This pass moves instructions around and adds control-flow in the
middle of blocks. We need to use nir_foreach_instr_safe to ensure that
we iterate over instructions correctly anyway.
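
The general pattern behind a "safe" iterator can be sketched on a plain
singly linked list (an illustration only; NIR's lists and the
nir_foreach_instr_safe macro are not reproduced here). The next pointer is
saved before the body runs, so the body may unlink or free the current node:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Remove and free every node with a negative value; returns the new head. */
static struct node *remove_negative(struct node *head)
{
    struct node **link = &head;      /* pointer that currently points at cur */
    struct node *cur = head;

    while (cur != NULL) {
        struct node *next = cur->next;   /* saved before cur may be freed */

        if (cur->value < 0) {
            *link = next;                /* unlink cur */
            free(cur);
        } else {
            link = &cur->next;
        }
        cur = next;                      /* never touches freed memory */
    }
    return head;
}
```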

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bd5457641 ("nir: Add a lowering pass for non-uniform resource access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit e04cf0b612)
2019-05-16 17:18:34 +00:00
Lionel Landwerlin
d70d8b2ffa vulkan/overlay: fix truncating error on 32bit platforms
Non dispatchable handles can be uint64_t. When compiling the layer on
a 32bit platform, this will lead to casting uint64_t into (void *)
which is 32bit, leading to incorrect handles being mapped internally
in the layer.
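
The failure mode can be shown without a 32-bit build by narrowing explicitly.
This is only a sketch of the truncation, with hypothetical helper names, and
says nothing about how the layer's own key macro is defined:

```c
#include <stdbool.h>
#include <stdint.h>

/* What survives a round trip through a 32-bit wide value: the low 32 bits. */
static uint32_t truncate_to_32(uint64_t handle)
{
    return (uint32_t)handle;
}

/* Two distinct 64-bit handles become indistinguishable after truncation
 * whenever they differ only in their upper 32 bits. */
static bool collide_after_truncation(uint64_t a, uint64_t b)
{
    return a != b && truncate_to_32(a) == truncate_to_32(b);
}
```

For instance, 0x100000001 and 0x200000001 are different handles but share the
low word 0x00000001, so a map keyed on the truncated value would confuse them.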

v2: Use more HKEY() (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Józef Kucia <joseph.kucia@gmail.com>
Fixes: 2d2927938f ("vulkan/overlay-layer: fix cast errors")
Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>
(cherry picked from commit 877b371cbb)
[Juan: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/vulkan/overlay-layer/overlay.cpp
2019-05-16 09:40:47 +02:00
Lionel Landwerlin
558a067d17 vulkan/overlay-layer: fix cast errors
Not quite sure what version of GCC/Clang produces errors (8.3.0
locally was fine).

v2: also fix an integer literal issue (Karol)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 2d2927938f)
2019-05-16 07:36:57 +00:00
Lionel Landwerlin
51354d2bf5 nir: fix lower_non_uniform_access pass
Obviously missing the instruction insertion into the SSA list.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bd5457641 ("nir: Add a lowering pass for non-uniform resource access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 391a836e8f)
2019-05-16 07:34:09 +00:00
Ian Romanick
06bf5428cf Revert "nir: add late opt to turn inot/b2f combos back to bcsel"
This reverts commit 7acc865226.

With these optimizations in place, the extra constant folding added in
the next commit extends some live ranges of 0.0 and ±1.0 constants, and
that causes several hundred shaders to have more spills and fills.

I believe this optimization was made basically irrelevant by 7725d60938
"intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))".

All Gen7.5+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 17225303 -> 17224634 (<.01%)
instructions in affected programs: 879402 -> 878733 (-0.08%)
helped: 679
HURT: 1
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.03% max: 0.93% x̄: 0.24% x̃: 0.05%
HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel)   min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -1.02 -0.95
95% mean confidence interval for instructions %-change: -0.26% -0.22%
Instructions are helped.

total cycles in shared programs: 360842595 -> 360828542 (<.01%)
cycles in affected programs: 110443594 -> 110429541 (-0.01%)
helped: 389
HURT: 265
helped stats (abs) min: 1 max: 7525 x̄: 162.81 x̃: 28
helped stats (rel) min: <.01% max: 18.66% x̄: 1.11% x̃: 0.11%
HURT stats (abs)   min: 1 max: 7614 x̄: 185.96 x̃: 48
HURT stats (rel)   min: <.01% max: 25.08% x̄: 0.95% x̃: 0.10%
95% mean confidence interval for cycles value: -75.65 32.67
95% mean confidence interval for cycles %-change: -0.49% -0.06%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 12159 -> 12161 (0.02%)
spills in affected programs: 13 -> 15 (15.38%)
helped: 0
HURT: 1

total fills in shared programs: 25207 -> 25208 (<.01%)
fills in affected programs: 25 -> 26 (4.00%)
helped: 0
HURT: 1

Ivy Bridge
total instructions in shared programs: 12082019 -> 12082013 (<.01%)
instructions in affected programs: 1033 -> 1027 (-0.58%)
helped: 6
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.41% max: 0.83% x̄: 0.61% x̃: 0.59%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.78% -0.45%
Instructions are helped.

total cycles in shared programs: 179849270 -> 179849157 (<.01%)
cycles in affected programs: 4735 -> 4622 (-2.39%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 74 x̄: 28.25 x̃: 18
helped stats (rel) min: 0.13% max: 6.53% x̄: 2.85% x̃: 2.36%
95% mean confidence interval for cycles value: -82.73 26.23
95% mean confidence interval for cycles %-change: -7.98% 2.28%
Inconclusive result (value mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10882750 -> 10882748 (<.01%)
instructions in affected programs: 266 -> 264 (-0.75%)
helped: 2
HURT: 0

Iron Lake
total cycles in shared programs: 188609440 -> 188609448 (<.01%)
cycles in affected programs: 4320 -> 4328 (0.19%)
helped: 0
HURT: 2

GM45
total cycles in shared programs: 129016868 -> 129016872 (<.01%)
cycles in affected programs: 2302 -> 2306 (0.17%)
helped: 0
HURT: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit d2a9ba03e3)
[Juan: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/compiler/nir/nir_opt_algebraic.py
2019-05-15 10:36:12 +02:00
Jason Ekstrand
75ea0eeed1 intel/fs/ra: Stop adding RA interference to too many SENDS nodes
We only have one node per VGRF so this was adding way too much
interference.  No idea how we didn't catch this before.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15311100 -> 15311100 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 355468050 -> 355543197 (0.02%)
    cycles in affected programs: 2472492 -> 2547639 (3.04%)
    helped: 17
    HURT: 20

Fixes: 014edff0d2 "intel/fs: Add interference between SENDS sources"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 096ad8a809)
2019-05-15 08:28:06 +00:00
Jason Ekstrand
8cf49e1662 intel/fs/ra: Only add dest interference to sources that exist
Fixes: 83dedb6354 "i965: Add src/dst interference for certain"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 88cac12230)
2019-05-15 08:26:52 +00:00
Juan A. Suarez Romero
c03d9a7fa9 Update version to 19.1.0-rc2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-05-14 15:36:06 +02:00
Gert Wollny
9b51dcf1e2 softpipe/buffer: load only as many components as the the buffer resource type provides
Otherwise we risk reading past the end of the buffer.

In addition, change the loop counters to unsigned to be consistent
with the types.

Fixes: afa8707ba9
    softpipe: add SSBO/shader atomics support.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 865b9ddae4)
2019-05-14 08:41:50 +00:00
Bas Nieuwenhuizen
914ac06e32 radv: Do not use extra descriptor space for the 3rd plane.
While ImageFormatProperties returns the number of internal descriptors,
it turns out that applications do not need to actually allocate more
descriptors in the descriptor pool.

So if we make descriptors with more planes larger we have to be
conservative and always allocate space for the larger descriptors
which is a waste given the low usage of this ext.

So let us make use of the fact that 3plane formats all have the
same formats & dimensions for the last two planes. This way we
only need the first half of the descriptor of the 3rd plane and
can share the second half of the second plane.

This allows us to use 16 bytes for the descriptor which nicely
fits into the 16 bytes that are unused right next to the sampler.

Fixes: 5564c38212 "radv: Update descriptor sets for multiple planes."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f53ebfb450)
2019-05-13 10:47:26 +00:00
Józef Kucia
e2654c2379 radv: clear vertex bindings while resetting command buffer
Only vertex inputs accessed by vertex shader must have valid buffers
bound.

Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 5010436e09 "radv: bail out when binding the same vertex buffers"
(cherry picked from commit 24af0f1318)
2019-05-13 10:45:10 +00:00
Marek Olšák
bb845df961 st/mesa: fix 2 crashes in st_tgsi_lower_yuv
src/mesa/state_tracker/st_tgsi_lower_yuv.c:68: void reg_dst(struct
 tgsi_full_dst_register *, const struct tgsi_full_dst_register *, unsigned
 int): assertion "dst->Register.WriteMask" failed

The second crash was due to insufficient allocated size for TGSI
instructions.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 83435e748f)
2019-05-13 10:44:06 +00:00
Kenneth Graunke
f7c0ca6d38 iris: Use full ways for L3 cache setup on Icelake.
Anuj fixed this in i965 and anv, but the fix never landed in iris.
Fixes tessellation corruption on Icelake.  Thanks to Rafael for
bisecting this and tracking it down.

Fixes: d0996d5fab iris: Emit default L3 config for the render pipeline
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 72ccefb529)
2019-05-13 10:41:16 +00:00
Caio Marcelo de Oliveira Filho
38fdfdaff1 anv: Fix limits when VK_EXT_descriptor_indexing is used
Update various limits in
VkPhysicalDeviceDescriptorIndexingPropertiesEXT that were previously
zero to their values from VkPhysicalDeviceLimits.  When using
VK_EXT_descriptor_indexing, the former limits will apply to all the
descriptor layout sets -- not only those using the new feature bits.

For reference, VK_EXT_descriptor_indexing says

    "There are new descriptor set layout and descriptor pool creation
    flags that are required to opt in to the update-after-bind
    functionality, and there are separate maxPerStage* and
    maxDescriptorSet* limits that apply to these descriptor set
    layouts which may be much higher than the pre-existing limits. The
    old limits only count descriptors in non-updateAfterBind
    descriptor set layouts, and the new limits count descriptors in
    all descriptor set layouts in the pipeline layout."

Fixes: 6e230d7607 "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3610081daa)
2019-05-13 10:40:03 +00:00
Lionel Landwerlin
87722e0c42 vulkan/overlay: keep allocating draw data until it can be reused
The original implementation assumed that we could allocate the same
amount of command buffers as the number of images in the swapchain.
But the application could potentially render much faster and rerender
into images that have been submitted for presentation but not yet
presented.

This change keeps on allocating command buffers, vertex buffer, vertex
indices as well as a semaphore and a fence for as long as we can't
reuse a previously submitted one.

This fixes rendering issues in the overlay at high frame rates.

v2: Don't recreate semaphores constantly (Józef)

v3: Drop useless surface & FreeCommandBuffers (Józef)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>
(cherry picked from commit ad2b4aa378)
[Juan: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/vulkan/overlay-layer/overlay.cpp
2019-05-13 12:37:08 +02:00
Kenneth Graunke
f0e147bd47 i965: Fix memory leaks in brw_upload_cs_work_groups_surface().
This was taking a reference to the 64kB upload buffer and never
returning it, leaking a reference each time this atom triggered.

This leaked lots of 64kB upload BOs, eventually running us out of
VMA space.  This would usually happen when using mpv to watch a
movie, after 20-40 minutes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134
Fixes: 63d7b33f51 i965/cs: Setup surface binding for gl_NumWorkGroups
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 3f60810de0)
2019-05-13 10:31:35 +00:00
Eric Engestrom
1fc65774e9 travis: fix syntax, and drop unused stuff
Fixes: a988d95389 "ci: Delete autotools build jobs"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 6e5728e5c9)
2019-05-13 10:30:30 +00:00
Leo Liu
349153f097 winsys/amdgpu: add VCN JPEG to no user fence group
There is no user fence for JPEG; the bug was triggering the
kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT)

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ceba9ff294)
2019-05-10 17:06:21 +00:00
Tomeu Vizoso
f8ec40e28b panfrost: Only take the fast paths on buffers aligned to block size
As the functions operate on 16-byte blocks.

Fixes this Valgrind error:

Invalid read of size 4
   at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85)
   by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171)
   by 0x584F587: panfrost_tile_texture (pan_resource.c:489)
   by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525)
   by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516)
   by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515)
   by 0x5875F13: u_default_texture_subdata (u_transfer.c:80)
   by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
   by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
   by 0x5391353: teximage (teximage.c:3105)
   by 0x5391353: teximage_err (teximage.c:3132)
   by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
   by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
 Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd
   at 0x483F5C8: malloc (vg_replace_malloc.c:299)
   by 0x584F47D: panfrost_transfer_map (pan_resource.c:467)
   by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
   by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59)
   by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
   by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
   by 0x5391353: teximage (teximage.c:3105)
   by 0x5391353: teximage_err (teximage.c:3132)
   by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
   by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
   by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2)
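
A minimal sketch of the guard, assuming the fast path reads and writes whole
16-byte blocks (copy_blocks is a hypothetical helper, not the panfrost
swizzle code). The block path is taken only when the length is a multiple of
the block size, so it can never touch bytes past the end of the allocation:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 16u

/* Copy len bytes from src to dst.
 * Returns 1 if the block-wise fast path was used, 0 for the fallback. */
static int copy_blocks(uint8_t *dst, const uint8_t *src, size_t len)
{
    if (len % BLOCK_BYTES == 0) {
        /* Fast path: every access covers a full block inside the buffer. */
        for (size_t off = 0; off < len; off += BLOCK_BYTES)
            memcpy(dst + off, src + off, BLOCK_BYTES);
        return 1;
    }

    /* Slow path: byte by byte, safe for any length. */
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
    return 0;
}
```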

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3538ab570)
2019-05-10 17:04:58 +00:00
Tomeu Vizoso
5e75803339 panfrost: Fix two uninitialized accesses in compiler
Valgrind was complaining of those.

NIR_PASS only sets progress to TRUE if there was progress.

nir_const_load_to_arr() only sets as many constants as the instruction
has components.

This was causing some dEQP tests to flip-flop, such as:

dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Fixes: 14531d676b ("nir: make nir_const_value scalar")
(cherry picked from commit 554975bafa)
2019-05-10 17:02:37 +00:00
Rob Clark
f1ab22209e freedreno/ir3: fix rasterflat/glxgears
Ofc legacy gl features that are broken don't trigger fails in deqp.  I
should remember to test glxgears more often.

Fixes: 7ff6705b8d freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
(cherry picked from commit 9faf218b8c)
2019-05-10 17:00:35 +00:00
Lionel Landwerlin
e0c082d6eb anv: Use corresponding type from the vector allocation
We didn't notice this issue much because the 2 structs share a similar
layout, except for the additional fields...

We run into that issue in Anv :

==15236== Invalid write of size 8
==15236==    at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211)
==15236==    by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264)
==15236==    by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312)
==15236==    by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167)
==15236==    by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190)
==15236==    by 0x8CF60871: alloc_surface_state (anv_image.c:1122)
==15236==    by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519)
==15236==    by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358)
==15236==  Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd
==15236==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15236==    by 0x8D2578E6: u_vector_init (u_vector.c:47)
==15236==    by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168)
==15236==    by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921)
==15236==    by 0x8CF56517: anv_CreateDevice (anv_device.c:1909)
==15236==    by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073)
==15236==    by 0x8DD2CB3D: ??? (in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so)
==15236==    by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so)
==15236==    by 0x8BCB35C6: loader_create_device_chain (loader.c:5449)
==15236==    by 0x8BCBC230: vkCreateDevice (trampoline.c:838)

v2: Rename mmap_cleanups to avoid confusion (Caio)

v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit f2f6ac1c08)
2019-05-10 16:58:53 +00:00
Samuel Pitoiset
a97f44ac1f radv: fix setting the number of rectangles when it's dyanmic
We need to know the number of rectangles.

This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*.

Fixes: 5db0bf9994 ("radv: Implement VK_EXT_discard_rectangles.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 53dfff1c4d)
2019-05-09 10:44:18 +00:00
Dave Airlie
5d7d13d227 kmsro: add _dri.so to two of the kmsro drivers.
Fixes: 8cfc17bdda (kmsro: Add the rest of the current set of tinydrm drivers.)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 0a42d5b98b)
2019-05-09 10:43:03 +00:00
Dylan Baker
4a7b0cc5e4 meson: Force the use of config-tool for llvm
meson git now has a cmake find method for llvm, but it lacks a couple of
features that we use from the config tool version. Until that reaches
parity we need to use the config-tool version.

CC: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 0d59459432)
2019-05-09 10:40:22 +00:00
Lionel Landwerlin
d95797de61 anv: fix use after free
Once mem->bo is removed from the cache, it is likely to be freed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b80930a6fe ("anv: add support for VK_EXT_memory_budget")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 43596e5f34)
2019-05-09 10:39:19 +00:00
Lionel Landwerlin
424b60dc70 anv: rework queries writes to ensure ordering memory writes
We use a mix of MI & PIPE_CONTROL commands to write our queries' data
(results & availability). Those commands' memory write order is not
guaranteed with regard to their order in the command stream, unless CS
stalls are inserted between them. This is problematic for 2 reasons :

   1. We copy results from the device using MI commands even though
      the values are generated from PIPE_CONTROL, meaning we could
      copy unlanded values into the results and then copy the
      availability that is inconsistent with the values.

   2. We allow the user to poll on the availability values of the
      query pool from the CPU. If the availability lands in memory
      before the values then we could return invalid values.

This change does 2 things to address this problem :

      - We use either PIPE_CONTROL or MI commands to write both
        queries values and availability, so that the ordering of the
        memory writes guarantees that if availability is visible,
        results are also visible.

      - For the occlusion & timestamp queries we apply a CS stall
        before copying the results on the device, to ensure copying
        with MI commands sees the correct values of previous
        PIPE_CONTROL writes of availability (required by the Vulkan
        spec).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a07d06f103)
2019-05-09 10:38:10 +00:00
Timothy Arceri
9d610c1cc3 Revert "glx: Fix synthetic error generation in __glXSendError"
This reverts commit e91ee763c3.

This seems to have broken a number of wine games. Let's revert
everything for now and try again later.

Acked-by: Adam Jackson <ajax@redhat.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590
(cherry picked from commit a01b393c39)
2019-05-08 10:32:39 +00:00
Kenneth Graunke
f770e81ba7 i965: leave the top 4Gb of the high heap VMA unused
This ports commit 9e7b0988d6 from anv
to i965.  Thanks to Lionel for noticing that it was missing!

Fixes: 01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d568fcd0a0)
2019-05-08 10:28:39 +00:00
Kenneth Graunke
faa7daa55e i965: Force VMA alignment to be a multiple of the page size.
This should happen regardless, but let's be paranoid.

Fixes: 01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 17210c63a9)
2019-05-08 10:27:20 +00:00
Kenneth Graunke
fd27561c9d i965: Fix BRW_MEMZONE_LOW_4G heap size.
The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages,
and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB.

So we can't use the top page.
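
The arithmetic can be checked with a few lines of C. The constant names are
illustrative; only the values 0xfffff and 4096 come from the text above:

```c
#include <stdint.h>

/* Largest page count quoted above for the "Size" fields. */
static const uint64_t max_size_field_pages = 0xfffffull;
/* Page size in bytes used in the calculation above. */
static const uint64_t page_bytes = 4096ull;

/* Largest heap size, in bytes, that such a field can describe. */
static uint64_t max_low_4g_heap_bytes(void)
{
    return max_size_field_pages * page_bytes;
}
```

The result is 4294963200 bytes, which is exactly 4096 bytes (one page) less
than 4GB (4294967296 bytes).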

Fixes: 01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 15f134c628)
2019-05-08 12:26:08 +02:00
Juan A. Suarez Romero
5d72a334e8 Update version to 19.1.0-rc1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-05-07 16:10:40 +00:00
Timothy Arceri
825ca9e42e radeonsi: add config entry for Counter-Strike Global Offensive
This fixes rendering issues with gun scopes which is rather
important.

Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100239
(cherry picked from commit 49025292fb)
2019-05-07 10:47:39 +00:00
Erik Faye-Lund
67f2be0fbf draw: flush when setting stream-out targets
We need to re-prepare the middle-end state so that it picks up changes
to the stream-out targets and reacts correctly to pausing/resuming
stream-out. So let's add a flush here.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: ec8cbd79ac "draw/softpipe: EXT_transform_feedback support (v2)"
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit d84b85bc28)
2019-05-07 10:46:02 +00:00
John Stultz
05faf6eb56 mesa: Makefile.sources: Add nir_lower_fb_read.c to Makefile.sources list
In commit a99c360a46 (nir: add pass to lower fb reads), a new
file was added that also needs to be added to the
Makefile.sources list used by the Android and SCons build systems.

Cc: Rob Clark <robdclark@chromium.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Greg Hartman <ghartman@google.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Fixes: a99c360a46 ("nir: add pass to lower fb reads")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
(cherry picked from commit c7f2145b4b)
2019-05-06 17:04:24 +02:00
John Stultz
3495bdca13 mesa: Makefile.sources: Add ir3_nir_lower_load_barycentric_at_sample/offset to Makefile.sources
In commit 2f0b9d2249 ("freedreno/ir3: lower
load_barycentric_at_offset") a new file was added that also needs
to be added to the Makefile.sources list used by the Android and
SCons build systems.

Cc: Rob Clark <robdclark@chromium.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Greg Hartman <ghartman@google.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 2f0b9d2249 ("freedreno/ir3: lower load_barycentric_at_offset")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
(cherry picked from commit d04f44a459)
2019-05-06 17:03:09 +02:00
John Stultz
f93e1f92c4 mesa: android: freedreno: Fix build failure due to path change
The ir3_nir_trig.py file was moved in a previous commit,
aa0fed10d3 (freedreno: move ir3 to common location),
so update the Android.gen.mk file to match.

Cc: Rob Clark <robdclark@chromium.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Greg Hartman <ghartman@google.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
(cherry picked from commit c935862127)
2019-05-06 17:02:18 +02:00
Amit Pundir
8c0b80e08a mesa: android: freedreno: build libfreedreno_{drm,ir3} static libs
Add libfreedreno_drm/ir3 to the build

Cc: Rob Clark <robdclark@chromium.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Greg Hartman <ghartman@google.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Fixes: b4476138d5 ("freedreno: move drm to common location")
Fixes: aa0fed10d3 ("freedreno: move ir3 to common location")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[jstultz: Tweaked to add extra ir3 files from master]
Signed-off-by: John Stultz <john.stultz@linaro.org>
(cherry picked from commit 88105375c9)
2019-05-06 17:00:59 +02:00
Bas Nieuwenhuizen
070d763d5d radv: Implement cosited_even sampling.
Apparently cosited_even was the required one instead of midpoint.

This adds a slight offset of 0.5 pixels to the coordinates (we also
need the image size to convert to normalized coords)

Fixes: 91702374d5 "radv: Add ycbcr lowering pass."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 5692351264)
2019-05-06 16:59:58 +02:00
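The half-pixel offset described in the cosited_even commit above can be illustrated with a minimal sketch. It assumes the offset is added in the positive direction and that the coordinate is already normalized to [0, 1]. The function name is hypothetical, and the sketch does not reproduce the actual radv lowering pass.

```python
def offset_half_pixel(coord: float, size: int) -> float:
    """Shift a normalized coordinate by half a pixel.

    `size` is the image extent in pixels along the same axis. Dividing by it
    converts the 0.5 pixel offset into normalized units, which is why the
    image size is needed at all.
    """
    return coord + 0.5 / size


# Usage: on a 4-pixel-wide image, half a pixel is 0.125 in normalized units.
shifted = offset_half_pixel(0.25, 4)
print(shifted)  # 0.375
```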
Bas Nieuwenhuizen
ed0d4eaa4c radv: Disable subsampled formats.
Broken on Polaris, and since I discovered that NV12 is not subsampled
but a 2-plane format, I decided I don't really care.

Work to do to re-enable:

1) Figure out which devices support it natively.
2) Write some software emulation for the others.

Fixes: 52c1adda21 "radv: Add ycbcr format features."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 5cbe12ad1b)
2019-05-06 16:59:00 +02:00
Timothy Arceri
6e52daa18c util/drirc: add workarounds for bugs in Doom 3: BFG
This makes the game playable on radeonsi.

Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110143
(cherry picked from commit 1af72fa4d6)
2019-05-06 16:48:57 +02:00
351 changed files with 11232 additions and 2627 deletions

@@ -1,198 +1,40 @@
language: c
dist: xenial
os: osx
cache:
apt: true
ccache: true
env:
global:
- XORG_RELEASES=https://xorg.freedesktop.org/releases/individual
- XCB_RELEASES=https://xcb.freedesktop.org/dist
- WAYLAND_RELEASES=https://wayland.freedesktop.org/releases
- XORGMACROS_VERSION=util-macros-1.19.0
- GLPROTO_VERSION=glproto-1.4.17
- DRI2PROTO_VERSION=dri2proto-2.8
- LIBPCIACCESS_VERSION=libpciaccess-0.13.4
- LIBDRM_VERSION=libdrm-2.4.97
- XCBPROTO_VERSION=xcb-proto-1.13
- RANDRPROTO_VERSION=randrproto-1.3.0
- LIBXRANDR_VERSION=libXrandr-1.3.0
- LIBXCB_VERSION=libxcb-1.13
- LIBXSHMFENCE_VERSION=libxshmfence-1.2
- LIBVDPAU_VERSION=libvdpau-1.1
- LIBVA_VERSION=libva-1.7.0
- LIBWAYLAND_VERSION=wayland-1.15.0
- WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
- PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
- LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
- PATH="$HOME/prefix/bin:$PATH"
matrix:
include:
- env:
- LABEL="macOS meson"
- BUILD=meson
- DRI_LOADERS="-Dplatforms=x11"
- GALLIUM_DRIVERS=swrast
os: osx
- PKG_CONFIG_PATH=""
before_install:
- |
if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext
# Set PATH for homebrew pip3 installs
PATH="$HOME/Library/Python/3.6/bin:${PATH}"
# Set PKG_CONFIG_PATH for keg-only expat
PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"
# Set PATH for keg-only gettext
PATH="/usr/local/opt/gettext/bin:${PATH}"
- HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext
# Set PATH for homebrew pip3 installs
- PATH="$HOME/Library/Python/3.6/bin:${PATH}"
# Set PKG_CONFIG_PATH for keg-only expat
- PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"
# Set PATH for keg-only gettext
- PATH="/usr/local/opt/gettext/bin:${PATH}"
# Install xquartz for prereqs ...
XQUARTZ_VERSION="2.7.11"
wget -nv https://dl.bintray.com/xquartz/downloads/XQuartz-${XQUARTZ_VERSION}.dmg
hdiutil attach XQuartz-${XQUARTZ_VERSION}.dmg
sudo installer -pkg /Volumes/XQuartz-${XQUARTZ_VERSION}/XQuartz.pkg -target /
hdiutil detach /Volumes/XQuartz-${XQUARTZ_VERSION}
# ... and set paths
PATH="/opt/X11/bin:${PATH}"
PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"
ACLOCAL="aclocal -I /opt/X11/share/aclocal -I /usr/local/share/aclocal"
fi
# Install xquartz for prereqs ...
- XQUARTZ_VERSION="2.7.11"
- wget -nv https://dl.bintray.com/xquartz/downloads/XQuartz-${XQUARTZ_VERSION}.dmg
- hdiutil attach XQuartz-${XQUARTZ_VERSION}.dmg
- sudo installer -pkg /Volumes/XQuartz-${XQUARTZ_VERSION}/XQuartz.pkg -target /
- hdiutil detach /Volumes/XQuartz-${XQUARTZ_VERSION}
# ... and set paths
- PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"
install:
# Install a more modern meson from pip, since the version in the
# ubuntu repos is often quite old.
- if test "x$BUILD" = xmeson; then
pip3 install --user meson;
pip3 install --user mako;
fi
# Install dependencies where we require specific versions (or where
# disallowed by Travis CI's package whitelisting).
- |
if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then
wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
tar -jxvf $XORGMACROS_VERSION.tar.bz2
(cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
tar -jxvf $GLPROTO_VERSION.tar.bz2
(cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
tar -jxvf $DRI2PROTO_VERSION.tar.bz2
(cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
tar -jxvf $XCBPROTO_VERSION.tar.bz2
(cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
tar -jxvf $LIBXCB_VERSION.tar.bz2
(cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2
(cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
tar -jxvf $LIBDRM_VERSION.tar.bz2
(cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2
tar -jxvf $RANDRPROTO_VERSION.tar.bz2
(cd $RANDRPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2
tar -jxvf $LIBXRANDR_VERSION.tar.bz2
(cd $LIBXRANDR_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
(cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
tar -jxvf $LIBVDPAU_VERSION.tar.bz2
(cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
tar -jxvf $LIBVA_VERSION.tar.bz2
(cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
tar -axvf $LIBWAYLAND_VERSION.tar.xz
(cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
(cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Meson requires ninja >= 1.6, but xenial has 1.3.x
wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip
unzip ninja-linux.zip
mv ninja $HOME/prefix/bin/
# Generate this header since one is missing on the Travis instance
mkdir -p linux
printf "%s\n" \
"#ifndef _LINUX_MEMFD_H" \
"#define _LINUX_MEMFD_H" \
"" \
"#define MFD_CLOEXEC 0x0001U" \
"#define MFD_ALLOW_SEALING 0x0002U" \
"" \
"#endif /* _LINUX_MEMFD_H */" > linux/memfd.h
# Generate this header, including the missing SYS_memfd_create
# macro, which is not provided by the header in the Travis
# instance
mkdir -p sys
printf "%s\n" \
"#ifndef _SYSCALL_H" \
"#define _SYSCALL_H 1" \
"" \
"#include <asm/unistd.h>" \
"" \
"#ifndef _LIBC" \
"# include <bits/syscall.h>" \
"#endif" \
"" \
"#ifndef __NR_memfd_create" \
"# define __NR_memfd_create 319 /* Taken from <asm/unistd_64.h> */" \
"#endif" \
"" \
"#ifndef SYS_memfd_create" \
"# define SYS_memfd_create __NR_memfd_create" \
"#endif" \
"" \
"#endif" > sys/syscall.h
fi
- pip3 install --user meson
- pip3 install --user mako
script:
if test "x$BUILD" = xmeson; then
if test -n "$LLVM_CONFIG"; then
# We need to control the version of llvm-config we're using, so we'll
# generate a native file to do so. This requires meson >=0.49
#
echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file
$LLVM_CONFIG --version
else
: > native.file
fi
export CFLAGS="$CFLAGS -isystem`pwd`"
meson _build \
--native-file=native.file \
-Dbuild-tests=true \
${DRI_LOADERS} \
-Ddri-drivers=${DRI_DRIVERS:-[]} \
-Dgallium-drivers=${GALLIUM_DRIVERS:-[]} \
-Dvulkan-drivers=${VULKAN_DRIVERS:-[]}
meson configure _build
ninja -C _build
ninja -C _build test
fi
- meson _build
-Dbuild-tests=true
-Dplatforms=x11
-Dgallium-drivers=swrast
- ninja -C _build
- ninja -C _build test

@@ -39,7 +39,7 @@ LOCAL_CFLAGS += \
-Wno-initializer-overrides \
-Wno-mismatched-tags \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"
-DPACKAGE_BUGREPORT=\"https://gitlab.freedesktop.org/mesa/mesa/issues\"
# XXX: The following __STDC_*_MACROS defines should not be needed.
# It's likely due to a bug elsewhere, but let's temporarily add them

@@ -110,6 +110,7 @@ endef
# add subdirectories
SUBDIRS := \
src/freedreno \
src/gbm \
src/loader \
src/mapi \

@@ -73,7 +73,7 @@ with open("VERSION") as f:
mesa_version = f.read().strip()
env.Append(CPPDEFINES = [
('PACKAGE_VERSION', '\\"%s\\"' % mesa_version),
('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),
('PACKAGE_BUGREPORT', '\\"https://gitlab.freedesktop.org/mesa/mesa/issues\\"'),
])
# Includes

@@ -1 +1 @@
19.1.0-devel
19.1.8

bin/.cherry-ignore

@@ -0,0 +1,44 @@
# fixes: The following commits do not apply cleanly on 19.1 branch, as they
# depend on other commits not present in the branch.
20b00e1ff24f974bc99e7ca9a720518da0ce5b89 panfrost: Make ctx->job useful
f6c44549ee2dd0f218deea1feba3965523609406 iris: Replace devinfo->gen with GEN_GEN
1cd13ccee7bc2733e7a56284dc02bdb1b1c40081 iris: Update fast clear colors on Gen9 with direct immediate writes.
270fe55256c78ede507d75d4665d73936ea7db31 nir/opt_large_constants: Handle store writemasks
# fixes: The following commit depends on commits 77a1070d366a and df4c2ec5e19b
# in order to compile, which did not land in the branch.
2d799250346331a93b21678dc5605cff74dfa3a1 iris: Avoid unnecessary resolves on transfer maps
# stable: Explicit 19.2 only nominations.
e73d863a66caac796ed5fb543a77f0b892df8573 radv: allow to enable VK_AMD_shader_ballot only on GFX8+
f202ac27a99caf9009aa9d60e2e0d7f3b528e99f radv: add a new debug option called RADV_DEBUG=noshaderballot
a6ad9e8ccf970a0da68508eb2ce26b316045b9f0 radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood
0813c27d8d4a7e9372a8a86d970b598fc4e3bfd1 radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0
a4e6e59db82e61b47ef905f28dde80ae36a67d35 radv/gfx10: do not use NGG with NAVI14
fe0ec41c4d36fd5a82e7579d89e34cce7423c4e5 radv: Change memory type order for GPUs without dedicated VRAM
28adf0d00c6b5506ed2206b950336bdc568d2247 radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts
d95afd8b9e7f9b3880813203292257bf0ed7babf radeonsi/gfx10: fix wave occupancy computations
6d5f11ab345b05759c22acbcd2f79928311689e3 radv: store engine name
04dc6074cf7f651b720868e0ba24362b585d1b31 driconfig: add a new engine name/version parameter
0616b7ac90cf4f86bb409d34101e3a3cceac8cbe vulkan: add vk_x11_strict_image_count option
83f195414a2e89bd9f549dacc04365f67e5bd110 radeonsi: add Navi12 PCI ID
f833b4cada07b746a10ffa4d93fcd821920c3cb1 docs: Update to OpenGL 4.6 in the release notes
68820007fddbb5b79f1b2b08e66ef14092053a95 radv: fix loading 64-bit GS inputs
41b0e0d7e0f2353d337e68e8e439b5dfead880c4 docs: Add the maximum implemented Vulkan API version in 19.2 rel notes
65b698136c5ef0ef1a15cb6fbff13cbc4ceb3881 amd: add more PCI IDs for Navi14
48742de601a8afea1e5f99637f5823a97ca21915 ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir
3c0938bece83cd37365c30c35d2d54927f3fe0cd radeonsi/gfx10: fix L2 cache rinse programming
7d97013294816db46abb7d1e7c6871fe73dfac93 ac: fix incorrect vram_size reported by the kernel
8cbe83445b2ec78fab1f303918c79268713500b5 ac: add radeon_info::tcc_harvested
235ebe91633e7f47518118983e0e6f5c632b25a4 radeonsi/gfx10: fix corruption for chips with harvested TCCs
b7c2f7c5a6b21bccb7847ab03b7fba5c770e131c ac: fix num_good_cu_per_sh for harvested chips
# stable: Explicit 19.3 only nominations.
66f2aa6ccd0b226eebe2c1a46281160b0a54d522 docs: Add the maximum implemented Vulkan API version in 19.3 rel notes
# revert: The following commit was requested to be removed from stable branch by original author.
dcc0e23438f3e5929c2ef74d57e8207be25ecb41 Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"
# fixes: The following commit was reverted later
c73988300f943e185a50aaba015f2f114ffcb262 util: added missing headers in anon-file
# fixes: The following commit depends on commit e1dc3ab75348 in order to
# compile, which did not land in the branch.
8ad3d8b178c0d8939db62ac2be9fdc98d127742d radv: Fix condition for skipping the continue CS.
# revert: The following commit was explicitly requested to be removed from the
# branch.
43041627445540afda1a05d11861935963660344 Revert "radv: disable viewport clamping even if FS doesn't write Z"
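The layout of the .cherry-ignore file above (comment lines starting with `#` that explain a group, followed by lines of a full 40-character commit hash and its title) can be read with a small parser. This is only a sketch of the file format as it appears in this diff. The function name is hypothetical, and nothing here shows how Mesa's own release scripts consume the file.

```python
import re

# A 40-character lowercase hex hash, whitespace, then the commit title.
_ENTRY = re.compile(r"^([0-9a-f]{40})\s+(.*)$")


def parse_cherry_ignore(text):
    """Return a list of (sha, title, reason) tuples.

    `reason` is the most recent run of consecutive comment lines,
    joined into a single string.
    """
    entries = []
    comment = []
    prev_was_comment = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            if not prev_was_comment:
                comment = []  # a new comment block starts a new reason
            comment.append(line.lstrip("#").strip())
            prev_was_comment = True
            continue
        match = _ENTRY.match(line)
        if match:
            entries.append((match.group(1), match.group(2), " ".join(comment)))
        prev_was_comment = False
    return entries


sample = """\
# fixes: The following commit depends on commit e1dc3ab75348 in order to
# compile, which did not land in the branch.
8ad3d8b178c0d8939db62ac2be9fdc98d127742d radv: Fix condition for skipping the continue CS.
# fixes: The following commit was reverted later
c73988300f943e185a50aaba015f2f114ffcb262 util: added missing headers in anon-file
"""
entries = parse_cherry_ignore(sample)
for sha, title, reason in entries:
    print(sha[:10], "|", title, "|", reason)
```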

@@ -32,7 +32,7 @@ is_sha_nomination()
{
fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \
sed -e 's/'"$2"'/\nfixes:/Ig' | \
grep -Eo 'fixes:[a-f0-9]{8,40}'`
grep -Eo 'fixes:[a-f0-9]{4,40}'`
fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`
if test $fixes_count -eq 0; then
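The effect of relaxing the hash-length bound in the grep pattern above, from `{8,40}` to `{4,40}`, can be sketched with Python's `re` module. The sample strings are hypothetical inputs in the `fixes:<hash>` shape that the sed step produces. The abbreviated six-character hash is made up for illustration.

```python
import re

# Same character class and bounds as the two grep -Eo patterns in the hunk.
old_pattern = re.compile(r"fixes:[a-f0-9]{8,40}")
new_pattern = re.compile(r"fixes:[a-f0-9]{4,40}")

short_ref = 'fixes:a99c36 ("nir: add pass to lower fb reads")'     # 6 hex digits
long_ref = 'fixes:a99c360a46 ("nir: add pass to lower fb reads")'  # 10 hex digits

old_short = old_pattern.findall(short_ref)
new_short = new_pattern.findall(short_ref)
old_long = old_pattern.findall(long_ref)
new_long = new_pattern.findall(long_ref)

print(old_short)  # []
print(new_short)  # ['fixes:a99c36']
print(old_long)   # ['fixes:a99c360a46']
print(new_long)   # ['fixes:a99c360a46']
```

With the old bound, a reference abbreviated to fewer than eight hex digits is silently skipped. The new bound accepts anything from four digits up, while longer hashes match the same way under both patterns.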

@@ -17,6 +17,9 @@ import SCons.Script.SConscript
host_platform = _platform.system().lower()
if host_platform.startswith('cygwin'):
host_platform = 'cygwin'
# MSYS2 default platform selection.
if host_platform.startswith('mingw'):
host_platform = 'windows'
# Search sys.argv[] for a "platform=foo" argument since we don't have
# an 'env' variable at this point.
@@ -49,9 +52,18 @@ if 'PROCESSOR_ARCHITECTURE' in os.environ:
else:
host_machine = _platform.machine()
host_machine = _machine_map.get(host_machine, 'generic')
# MSYS2 default machine selection.
if _platform.system().lower().startswith('mingw') and 'MSYSTEM' in os.environ:
if os.environ['MSYSTEM'] == 'MINGW32':
host_machine = 'x86'
if os.environ['MSYSTEM'] == 'MINGW64':
host_machine = 'x86_64'
default_machine = host_machine
default_toolchain = 'default'
# MSYS2 default toolchain selection.
if _platform.system().lower().startswith('mingw'):
default_toolchain = 'mingw'
if target_platform == 'windows' and host_platform != 'windows':
default_machine = 'x86'
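The MSYS2 defaults added in the two hunks above can be condensed into one function for illustration. This is a sketch, not the SCons script itself: the function name is hypothetical, the Cygwin and `PROCESSOR_ARCHITECTURE` branches are left out, and the example value `MINGW64_NT-10.0` for `platform.system()` under MSYS2 is an assumption.

```python
def msys2_defaults(system_name, environ, host_machine="generic"):
    """Return (host_platform, host_machine, default_toolchain).

    `system_name` stands in for platform.system() and `environ` for
    os.environ, so the logic can be exercised without an MSYS2 shell.
    """
    host_platform = system_name.lower()
    default_toolchain = "default"
    if host_platform.startswith("mingw"):
        # MSYS2 default platform and toolchain selection.
        host_platform = "windows"
        default_toolchain = "mingw"
        # MSYS2 default machine selection, driven by the MSYSTEM variable.
        msystem = environ.get("MSYSTEM")
        if msystem == "MINGW32":
            host_machine = "x86"
        elif msystem == "MINGW64":
            host_machine = "x86_64"
    return host_platform, host_machine, default_toolchain


print(msys2_defaults("MINGW64_NT-10.0", {"MSYSTEM": "MINGW64"}))
print(msys2_defaults("Linux", {}))
```

Passing the system name and environment as arguments, rather than reading them inside the function, keeps the sketch testable on any host.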

@@ -24,8 +24,8 @@ The old bug database on SourceForge is no longer used.
<p>
To file a Mesa bug, go to
<a href="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa">
Bugzilla on freedesktop.org</a>
<a href="https://gitlab.freedesktop.org/mesa/mesa/issues">
GitLab on freedesktop.org</a>
</p>
<p>

@@ -445,7 +445,7 @@ Khronos extensions that are not part of any Vulkan version:
VK_KHR_android_surface not started
VK_KHR_create_renderpass2 DONE (anv, radv)
VK_KHR_display DONE (anv, radv)
VK_KHR_display_swapchain DONE (anv, radv)
VK_KHR_display_swapchain not started
VK_KHR_draw_indirect_count DONE (radv)
VK_KHR_external_fence_fd DONE (anv, radv)
VK_KHR_external_fence_win32 not started

@@ -29,7 +29,7 @@ immediately checked into git because not enough people are testing them.
Just applying patches, testing and reporting back is helpful.
<li>
<b>Driver debugging.</b>
There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/describecomponents.cgi?product=Mesa">bug database</a>.
There are plenty of open bugs in the <a href="https://gitlab.freedesktop.org/mesa/mesa/issues">bug database</a>.
<li>
<b>Remove aliasing warnings.</b>
Enable gcc -Wstrict-aliasing=2 -fstrict-aliasing and track down aliasing

@@ -279,7 +279,7 @@ To setup the branchpoint:
<p>
Now go to
<a href="https://bugs.freedesktop.org/editversions.cgi?action=add&amp;product=Mesa" target="_parent">Bugzilla</a> and add the new Mesa version X.Y.
<a href="https://gitlab.freedesktop.org/mesa/mesa/-/milestones" target="_parent">gitlab</a> and add the new Mesa version X.Y.
</p>
<p>

File diff suppressed because it is too large

docs/relnotes/19.1.1.html

@@ -0,0 +1,154 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.1 Release Notes / June 25, 2019</h1>
<p>
Mesa 19.1.1 is a bug fix release which fixes bugs found since the 19.1.0 release.
</p>
<p>
Mesa 19.1.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
72114b16b4a84373b2acda060fe2bb1d45ea2598efab3ef2d44bdeda74f15581 mesa-19.1.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110709">Bug 110709</a> - g_glxglvnddispatchfuncs.c and glxglvnd.c fail to build with clang 8.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110901">Bug 110901</a> - mesa-19.1.0/src/util/futex.h:82: use of out of scope variable ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110902">Bug 110902</a> - mesa-19.1.0/src/broadcom/compiler/vir_opt_redundant_flags.c:104]: (style) Same expression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110921">Bug 110921</a> - virgl on OpenGL 3.3 host regressed to OpenGL 2.1</li>
</ul>
<h2>Changes</h2>
<p>Alejandro Piñeiro (1):</p>
<ul>
<li>v3d: fix checking twice auf flag</li>
</ul>
<p>Bas Nieuwenhuizen (5):</p>
<ul>
<li>radv: Skip transitions coming from external queue.</li>
<li>radv: Decompress DCC when the image format is not allowed for buffers.</li>
<li>radv: Fix vulkan build in meson.</li>
<li>anv: Fix vulkan build in meson.</li>
<li>meson: Allow building radeonsi with just the android platform.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>nouveau: fix frees in unsupported IR error paths.</li>
</ul>
<p>Eduardo Lima Mitev (1):</p>
<ul>
<li>freedreno/a5xx: Fix indirect draw max_indices calculation</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>util/futex: fix dangling pointer use</li>
<li>glx: fix glvnd pointer types</li>
<li>util/os_file: resize buffer to what was actually needed</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>virgl: Assume sRGB write control for older guest kernels or virglrenderer hosts</li>
</ul>
<p>Haihao Xiang (1):</p>
<ul>
<li>i965: support UYVY for external import only</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv: Set STATE_BASE_ADDRESS upper bounds on gen7</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>docs: Add SHA256 sums for 19.1.0</li>
<li>Update version to 19.1.1</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>glsl: Fix out of bounds read in shader_cache_read_program_metadata</li>
<li>iris: Fix iris_flush_and_dirty_history to actually dirty history.</li>
</ul>
<p>Kevin Strasser (2):</p>
<ul>
<li>gallium/winsys/kms: Fix dumb buffer bpp</li>
<li>st/mesa: Add rgbx handling for fp formats</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>anv: do not parse genxml data without INTEL_DEBUG=bat</li>
<li>intel/dump: fix segfault when the app hasn't accessed the device</li>
</ul>
<p>Mathias Fröhlich (1):</p>
<ul>
<li>egl: Don't add hardware device if there is no render node v2.</li>
</ul>
<p>Richard Thier (1):</p>
<ul>
<li>r300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno/a6xx: un-swap X24S8_UINT</li>
</ul>
<p>Samuel Pitoiset (4):</p>
<ul>
<li>radv: fix occlusion queries on VegaM</li>
<li>radv: fix VK_EXT_memory_budget if one heap isn't available</li>
<li>radv: fix FMASK expand with SRGB formats</li>
<li>radv: disable viewport clamping even if FS doesn't write Z</li>
</ul>
</div>
</body>
</html>

docs/relnotes/19.1.2.html

@@ -0,0 +1,194 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.2 Release Notes / July 9, 2019</h1>
<p>
Mesa 19.1.2 is a bug fix release which fixes bugs found since the 19.1.1 release.
</p>
<p>
Mesa 19.1.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
813a144ea8ebefb7b48b6733f3f603855b0f61268d86cc1cc26a6b4be908fcfd mesa-19.1.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110702">Bug 110702</a> - segfault in radeonsi HEVC hardware decoding with yuv420p10le</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110783">Bug 110783</a> - Mesa 19.1 rc crashing MPV with VAAPI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110944">Bug 110944</a> - [Bisected] Blender 2.8 crashes when closing certain windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110953">Bug 110953</a> - Adding a redundant single-iteration do-while loop causes different image to be rendered</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110999">Bug 110999</a> - 19.1.0: assert in vkAllocateDescriptorSets using immutable samplers on Ivy Bridge</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111019">Bug 111019</a> - radv doesn't handle variable descriptor count properly</li>
</ul>
<h2>Changes</h2>
<p>Anuj Phogat (3):</p>
<ul>
<li>Revert "i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch"</li>
<li>Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch"</li>
<li>Revert "iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch"</li>
</ul>
<p>Arfrever Frehtes Taifersar Arahesis (1):</p>
<ul>
<li>meson: Improve detection of Python when using Meson &gt;=0.50.</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>radv: Only allocate supplied number of descriptors when variable.</li>
<li>radv: Fix interactions between variable descriptor count and inline uniform blocks.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>meson: Add support for using cmake for finding LLVM</li>
<li>Revert "meson: Add support for using cmake for finding LLVM"</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>freedreno: Fix UBO load range detection on booleans.</li>
<li>freedreno: Fix up end range of unaligned UBO loads.</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>meson: bump required libdrm version to 2.4.81</li>
</ul>
<p>Gert Wollny (2):</p>
<ul>
<li>gallium: Add CAP for opcode DIV</li>
<li>vl: Use CS composite shader only if TEX_LZ and DIV are supported</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>glsl: Don't increase the iteration count when there are no terminators</li>
</ul>
<p>James Clarke (1):</p>
<ul>
<li>meson: GNU/kFreeBSD has DRM/KMS and requires -D_GNU_SOURCE</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>anv/descriptor_set: Only write texture swizzles if we have an image view</li>
<li>iris: Use a uint16_t for key sizes</li>
</ul>
<p>Jory Pratt (2):</p>
<ul>
<li>util: Heap-allocate 256K zlib buffer</li>
<li>meson: Search for execinfo.h</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.1</li>
<li>intel: fix wrong format usage</li>
<li>Update version to 19.1.2</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS</li>
<li>gallium: Make util_copy_image_view handle shader_access</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>intel/compiler: fix derivative on y axis implementation</li>
<li>intel/compiler: don't use byte operands for src1 on ICL</li>
</ul>
<p>Nanley Chery (2):</p>
<ul>
<li>intel: Add and use helpers for level0 extent</li>
<li>isl: Don't align phys_level0_sa by block dimension</li>
</ul>
<p>Nataraj Deshpande (1):</p>
<ul>
<li>anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format</li>
</ul>
<p>Pierre-Eric Pelloux-Prayer (2):</p>
<ul>
<li>mesa: delete framebuffer texture attachment sampler views</li>
<li>radeon/uvd: fix calc_ctx_size_h265_main10</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno/a5xx: fix batch leak in fd5 blitter path</li>
</ul>
<p>Sagar Ghuge (1):</p>
<ul>
<li>glsl: Fix round64 conversion function</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: leaking of upload-BO with push constants</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>anv/cmd_buffer: Reuse gen8 Cmd{Set, Reset}Event on gen7</li>
</ul>
</div>
</body>
</html>

docs/relnotes/19.1.3.html

@@ -0,0 +1,191 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.3 Release Notes / July 23, 2019</h1>
<p>
Mesa 19.1.3 is a bug fix release which fixes bugs found since the 19.1.2 release.
</p>
<p>
Mesa 19.1.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
845460b2225d15c15d4a9743dec798ff0b7396b533011d43e774e67f7825b7e0 mesa-19.1.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109203">Bug 109203</a> - [cfl dxvk] GPU Crash Launching Monopoly Plus (Iris Plus 655 / Wine + DXVK)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109524">Bug 109524</a> - &quot;Invalid glsl version in shading_language_version()&quot; when trying to run directX games using wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110309">Bug 110309</a> - [icl][bisected] regression on piglit arb_gpu_shader_int 64.execution.fs-ishl-then-* tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110663">Bug 110663</a> - threads_posix.h:96: undefined reference to `pthread_once'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110955">Bug 110955</a> - Mesa 18.2.8 implementation error: Invalid GLSL version in shading_language_version()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111010">Bug 111010</a> - Cemu Shader Cache Corruption Displaying Solid Color After commit 11e16ca7ce0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111071">Bug 111071</a> - SPIR-V shader processing fails with message about &quot;extra dangling SSA sources&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111075">Bug 111075</a> - Processing of SPIR-V shader causes device hang, sometimes leading to system reboot</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111097">Bug 111097</a> - Can not detect VK_ERROR_OUT_OF_DATE_KHR or VK_SUBOPTIMAL_KHR when window resizing</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Handle cmask being disallowed by addrlib.</li>
<li>anv: Add android dependencies on android.</li>
<li>radv: Only save the descriptor set if we have one.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (2):</p>
<ul>
<li>anv: Fix pool allocator when first alloc needs to grow</li>
<li>spirv: Fix stride calculation when lowering Workgroup to offsets</li>
</ul>
<p>Chia-I Wu (2):</p>
<ul>
<li>anv: fix VkExternalBufferProperties for unsupported handles</li>
<li>anv: fix VkExternalBufferProperties for host allocation</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>nir: Add a helper to determine if an intrinsic can be reordered</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: fix crash in shader tracing.</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>freedreno: Fix assertion failures in context setup in shader-db mode.</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>softpipe: Remove unused static function</li>
</ul>
<p>Ian Romanick (4):</p>
<ul>
<li>intel/vec4: Reswizzle VF immediates too</li>
<li>nir: Add unit tests for nir_opt_comparison_pre</li>
<li>nir: Use nir_src_bit_size instead of alu1-&gt;dest.dest.ssa.bit_size</li>
<li>mesa: Set minimum possible GLSL version</li>
</ul>
<p>Jason Ekstrand (13):</p>
<ul>
<li>nir/instr_set: Expose nir_instrs_equal()</li>
<li>nir/loop_analyze: Fix phi-of-identical-alu detection</li>
<li>nir: Add more helpers for working with const values</li>
<li>nir/loop_analyze: Handle bit sizes correctly in calculate_iterations</li>
<li>nir/loop_analyze: Bail if we encounter swizzles</li>
<li>anv: Set Stateless Data Port Access MOCS</li>
<li>nir/opt_if: Clean up single-src phis in opt_if_loop_terminator</li>
<li>nir,intel: Add support for lowering 64-bit nir_opt_extract_*</li>
<li>anv: Account for dynamic stencil write disables in the PMA fix</li>
<li>nir/regs_to_ssa: Handle regs in phi sources properly</li>
<li>nir/loop_analyze: Refactor detection of limit vars</li>
<li>nir: Add some helpers for chasing SSA values properly</li>
<li>nir/loop_analyze: Properly handle swizzles in loop conditions</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.2</li>
<li>Update version to 19.1.3</li>
</ul>
<p>Lepton Wu (1):</p>
<ul>
<li>virgl: Set meta data for textures from handle.</li>
</ul>
<p>Lionel Landwerlin (6):</p>
<ul>
<li>vulkan/overlay: fix command buffer stats</li>
<li>vulkan/overlay: fix crash on freeing NULL command buffer</li>
<li>anv: fix crash in vkCmdClearAttachments with unused attachment</li>
<li>vulkan/wsi: update swapchain status on vkQueuePresent</li>
<li>anv: report timestampComputeAndGraphics true</li>
<li>anv: fix format mapping for depth/stencil formats</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: don't set READ_ONLY for const_uploader to fix bindless texture hangs</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: fix alphaToCoverage when there is no color attachment</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix VGT_GS_MODE if VS uses the primitive ID</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>meta: memory leak of CopyPixels usage</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: save/restore SSO flag when using ARB_get_program_binary</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>meson: Add dep_thread dependency.</li>
</ul>
<p>Yevhenii Kolesnikov (1):</p>
<ul>
<li>meta: leaking of BO with DrawPixels</li>
</ul>
</div>
</body>
</html>
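
The release notes above say that the version reported by glGetString(GL_VERSION) depends on the driver. As a minimal sketch in C, assuming a desktop OpenGL version string (which starts with `major.minor`), the reported numbers could be read like this. `parse_gl_version` is a hypothetical helper for illustration, not a Mesa or OpenGL function.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: extract "major.minor" from the start of a desktop
 * GL_VERSION string. Returns 1 on success, 0 if the string does not start
 * with two dot-separated integers (for example an OpenGL ES string, which
 * starts with the text "OpenGL ES").
 */
static int parse_gl_version(const char *version, int *major, int *minor)
{
    assert(major != NULL && minor != NULL);
    if (version == NULL)
        return 0;
    return sscanf(version, "%d.%d", major, minor) == 2;
}
```

With a current context, the argument would be `(const char *)glGetString(GL_VERSION)`. On contexts that support them, glGetIntegerv(GL_MAJOR_VERSION) and glGetIntegerv(GL_MINOR_VERSION) return the same numbers without any parsing.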

227
docs/relnotes/19.1.4.html Normal file

@@ -0,0 +1,227 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.4 Release Notes / August 7, 2019</h1>
<p>
Mesa 19.1.4 is a bug fix release which fixes bugs found since the 19.1.3 release.
</p>
<p>
Mesa 19.1.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
a6d268a7d9edcfd92b6da80f2e34e6e0a7baaa442efbeba2fc66c404943c6bfb mesa-19.1.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109203">Bug 109203</a> - [cfl dxvk] GPU Crash Launching Monopoly Plus (Iris Plus 655 / Wine + DXVK)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109524">Bug 109524</a> - &quot;Invalid glsl version in shading_language_version()&quot; when trying to run directX games using wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110309">Bug 110309</a> - [icl][bisected] regression on piglit arb_gpu_shader_int 64.execution.fs-ishl-then-* tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110663">Bug 110663</a> - threads_posix.h:96: undefined reference to `pthread_once'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110955">Bug 110955</a> - Mesa 18.2.8 implementation error: Invalid GLSL version in shading_language_version()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111010">Bug 111010</a> - Cemu Shader Cache Corruption Displaying Solid Color After commit 11e16ca7ce0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111071">Bug 111071</a> - SPIR-V shader processing fails with message about &quot;extra dangling SSA sources&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111075">Bug 111075</a> - Processing of SPIR-V shader causes device hang, sometimes leading to system reboot</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111097">Bug 111097</a> - Can not detect VK_ERROR_OUT_OF_DATE_KHR or VK_SUBOPTIMAL_KHR when window resizing</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv: fix queries with WAIT_BIT returning VK_NOT_READY</li>
</ul>
<p>Andrii Simiklit (2):</p>
<ul>
<li>intel/compiler: don't use a keyword struct for a class fs_reg</li>
<li>meson: add a warning for meson &lt; 0.46.0</li>
</ul>
<p>Arcady Goldmints-Orlov (1):</p>
<ul>
<li>anv: report HOST_ALLOCATION as supported for images</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Set correct metadata size for GFX9+.</li>
<li>radv: Take variable descriptor counts into account for buffer entries.</li>
<li>radv: Fix descriptor set allocation failure.</li>
</ul>
<p>Boyuan Zhang (4):</p>
<ul>
<li>radeon/uvd: fix poc for hevc encode</li>
<li>radeon/vcn: fix poc for hevc encode</li>
<li>radeon/uvd: enable rate control for hevc encoding</li>
<li>radeon/vcn: enable rate control for hevc encoding</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv: Remove special allocation for anv_push_constants</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>nir: Allow qualifiers on copy_deref and image instructions</li>
</ul>
<p>Daniel Schürmann (1):</p>
<ul>
<li>spirv: Fix order of barriers in SpvOpControlBarrier</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>st/nir: fix arb fragment stage conversion</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: allow building all glx without any drivers</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>egl/drm: ensure the backing gbm is set before using it</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>freedreno: Fix data races with allocating/freeing struct ir3.</li>
</ul>
<p>Eric Engestrom (5):</p>
<ul>
<li>nir: don't return void</li>
<li>util: fix no-op macro (bad number of arguments)</li>
<li>gallium+mesa: fix tgsi_semantic array type</li>
<li>scons+meson: suppress spammy build warning on MacOS</li>
<li>nir: remove explicit nir_intrinsic_index_flag values</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>intel/ir: Fix CFG corruption in opt_predicated_break().</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>gallium/vl: fix compute tgsi shaders to not process undefined components</li>
<li>nv50,nvc0: update sampler/view bind functions to accept NULL array</li>
<li>nvc0: allow a non-user buffer to be bound at position 0</li>
<li>nv50/ir: handle insn not being there for definition of CVT arg</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>intel/fs: Stop stack allocating large arrays</li>
<li>anv: Disable transform feedback on gen7</li>
<li>isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW</li>
<li>anv: Don't claim support for 24 and 48-bit formats on IVB</li>
<li>intel/fs: Use ALIGN16 instructions for all derivatives on gen &lt;= 7</li>
<li>intel/fs: Implement quad_swap_horizontal with a swizzle on gen7</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.3</li>
<li>Update version to 19.1.4</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>mesa: Fix ReadBuffers with pbuffers</li>
<li>egl: Quiet warning about front buffer rendering for pixmaps/pbuffers</li>
<li>egl: Make the 565 pbuffer-only config single buffered.</li>
<li>egl: Only expose 565 pbuffer configs if X can export them as DRI3 images</li>
</ul>
<p>Lionel Landwerlin (5):</p>
<ul>
<li>anv: fix use of comma operator</li>
<li>nir: add access to image_deref intrinsics</li>
<li>spirv: wrap push ssa/pointer values</li>
<li>spirv: propagate access qualifiers through ssa &amp; pointer</li>
<li>spirv: don't discard access set by vtn_pointer_dereference</li>
</ul>
<p>Mark Menzynski (1):</p>
<ul>
<li>nvc0/ir: Fix assert accessing null pointer</li>
</ul>
<p>Nataraj Deshpande (1):</p>
<ul>
<li>egl/android: Update color_buffers querying for buffer age</li>
</ul>
<p>Nicolas Dufresne (1):</p>
<ul>
<li>egl: Also query modifiers when exporting DMABuf</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>ac/nir: fix txf_ms with an offset</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix crash in vkCmdClearAttachments with unused attachment</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: add glsl_type ref to one_time_init and decref to atexit</li>
</ul>
<p>Yevhenii Kolesnikov (1):</p>
<ul>
<li>main: Fix memleaks in mesa_use_program</li>
</ul>
</div>
</body>
</html>

119
docs/relnotes/19.1.5.html Normal file

@@ -0,0 +1,119 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.5 Release Notes / August 23, 2019</h1>
<p>
Mesa 19.1.5 is a bug fix release which fixes bugs found since the 19.1.4 release.
</p>
<p>
Mesa 19.1.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
7b54e14e35c7251b171b4cf9d84cbc1d760eafe00132117db193454999cd6eb4 mesa-19.1.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109630">Bug 109630</a> - vkQuake flickering geometry under Intel</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110395">Bug 110395</a> - Shadows are flickering in SuperTuxKart</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111113">Bug 111113</a> - ANGLE BlitFramebufferTest.MultisampleDepthClear/ES3_OpenGL fails on Intel Ubuntu19.04</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111267">Bug 111267</a> - [CM246] Flickering with multiple draw calls within the same graphics pipeline if a compute pipeline is present</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Do non-uniform lowering before bool lowering.</li>
<li>ac/nir: Use correct cast for readfirstlane and ptrs.</li>
<li>radv: Avoid binning RAVEN hangs.</li>
<li>radv: Avoid VEGA/RAVEN scissor bug in binning.</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>util: fix mem leak of program path</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>gallium/dump: add missing query-type to short-list</li>
<li>gallium/dump: add missing query-type to short-list</li>
</ul>
<p>Greg V (2):</p>
<ul>
<li>anv: remove unused Linux-specific include</li>
<li>intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.4</li>
<li>cherry-ignore: panfrost: Make ctx-&gt;job useful</li>
<li>Update version to 19.1.5</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: disable SDMA image copies on dGPUs to fix corruption in games</li>
<li>radeonsi: fix an assertion failure: assert(!res-&gt;b.is_shared)</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>meson: Test for program_invocation_name</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965/clear: clear_value better precision</li>
</ul>
</div>
</body>
</html>

132
docs/relnotes/19.1.6.html Normal file

@@ -0,0 +1,132 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.6 Release Notes / September 3, 2019</h1>
<p>
Mesa 19.1.6 is a bug fix release which fixes bugs found since the 19.1.5 release.
</p>
<p>
Mesa 19.1.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
2a369b7b48545c6486e7e44913ad022daca097c8bd937bf30dcf3f17a94d3496 mesa-19.1.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104395">Bug 104395</a> - [CTS] GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels tests fail on 32bit Mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111213">Bug 111213</a> - VA-API nouveau SIGSEGV and asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111241">Bug 111241</a> - Shadertoy shader causing hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111411">Bug 111411</a> - SPIR-V shader leads to GPU hang, sometimes making machine unstable</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv: additional query fixes</li>
</ul>
<p>Daniel Schürmann (1):</p>
<ul>
<li>nir/lcssa: handle deref instructions properly</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled</li>
<li>intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>gallium/vl: use compute preference for all multimedia, not just blit</li>
</ul>
<p>Jonas Ådahl (1):</p>
<ul>
<li>wayland/egl: Ensure correct buffer size when allocating</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.5</li>
<li>cherry-ignore: add explicit 19.2 only nominations</li>
<li>cherry-ignore: iris: Replace devinfo-&gt;gen with GEN_GEN</li>
<li>cherry-ignore: iris: Update fast clear colors on Gen9 with direct immediate writes.</li>
<li>cherry-ignore: iris: Avoid unnecessary resolves on transfer maps</li>
<li>Update version to 19.1.6</li>
</ul>
<p>Kenneth Graunke (6):</p>
<ul>
<li>iris: Fix broken aux.possible/sampler_usages bitmask handling</li>
<li>iris: Drop copy format hacks from copy region based transfer path.</li>
<li>iris: Fix large timeout handling in rel2abs()</li>
<li>util: Add a _mesa_i64roundevenf() helper.</li>
<li>mesa: Fix _mesa_float_to_unorm() on 32-bit systems.</li>
<li>intel/compiler: Fix src0/desc setter ordering</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: fix scratch buffer WAVESIZE setting leading to corruption</li>
</ul>
<p>Paulo Zanoni (1):</p>
<ul>
<li>intel/fs: grab fail_msg from v32 instead of v16 when v32-&gt;run_cs fails</li>
</ul>
<p>Pierre-Eric Pelloux-Prayer (1):</p>
<ul>
<li>glsl: replace 'x + (-x)' with constant 0</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>egl: reset blob cache set/get functions on terminate</li>
</ul>
</div>
</body>
</html>

157
docs/relnotes/19.1.7.html Normal file

@@ -0,0 +1,157 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.7 Release Notes / September 17, 2019</h1>
<p>
Mesa 19.1.7 is a bug fix release which fixes bugs found since the 19.1.6 release.
</p>
<p>
Mesa 19.1.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.1.7 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksums</h2>
<pre>
e287920fdb38712a9fed448dc90b3ca95048c7face5db52e58361f8b6e0f3cd5 mesa-19.1.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110814">Bug 110814</a> - KWin compositor crashes on launch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111069">Bug 111069</a> - Assertion fails in nir_opt_remove_phis.c during compilation of SPIR-V shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111271">Bug 111271</a> - Crash in eglMakeCurrent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111401">Bug 111401</a> - Vulkan overlay layer - async compute not supported, making overlay disappear in Doom</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111405">Bug 111405</a> - Some infinite 'do{}while' loops lead mesa to an infinite compilation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111467">Bug 111467</a> - WOLF RPG Editor + Gallium Nine Standalone: Rendering issue when using Iris driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111552">Bug 111552</a> - Geekbench 5.0 Vulkan compute benchmark fails on Anvil</li>
</ul>
<h2>Changes</h2>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>glsl/nir: Avoid overflow when setting max_uniform_location</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>radv: Call nir_propagate_invariant()</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE</li>
</ul>
<p>Eric Engestrom (10):</p>
<ul>
<li>ttn: fix 64-bit shift on 32-bit `1`</li>
<li>egl: fix deadlock in malloc error path</li>
<li>util/os_file: fix double-close()</li>
<li>anv: fix format string in error message</li>
<li>nir: fix memleak in error path</li>
<li>anv: add support for driconf</li>
<li>wsi: add minImageCount override</li>
<li>anv: add support for vk_x11_override_min_image_count</li>
<li>amd: move adaptive sync to performance section, as it is defined in xmlpool</li>
<li>radv: add support for vk_x11_override_min_image_count</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>gallium/auxiliary/indices: consistently apply start only to input</li>
<li>util: fix SSE-version needed for double opcodes</li>
</ul>
<p>Hal Gentz (1):</p>
<ul>
<li>glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX.</li>
</ul>
<p>Jason Ekstrand (7):</p>
<ul>
<li>Revert "intel/fs: Move the scalar-region conversion to the generator."</li>
<li>anv: Bump maxComputeWorkgroupSize</li>
<li>nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block</li>
<li>nir: Add a block_is_unreachable helper</li>
<li>nir/repair_ssa: Repair dominance for unreachable blocks</li>
<li>nir/repair_ssa: Insert deref casts when needed</li>
<li>nir/dead_cf: Repair SSA if the pass makes progress</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.6</li>
<li>cherry-ignore: add explicit 19.2 only nominations</li>
<li>Update version to 19.1.7</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>gallium: Fix util_format_get_depth_only</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>vulkan/overlay: bounce image back to present layout</li>
</ul>
<p>Mauro Rossi (3):</p>
<ul>
<li>android: radv: fix necessary dependecies</li>
<li>android: amd/common: fix missing include path</li>
<li>android: anv: libmesa_vulkan_common: add libmesa_util static dependency</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix allocating number of user sgprs if streamout is used</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>intel/dri: finish proper glthread</li>
</ul>
</div>
</body>
</html>
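
The 19.1.7 notes above mention the apiVersion property of VkPhysicalDeviceProperties. That field is a single packed 32-bit integer. The sketch below, in C, decodes it using the Vulkan 1.1 layout (major in the top 10 bits, minor in the next 10, patch in the low 12), which is what the VK_VERSION_MAJOR, VK_VERSION_MINOR and VK_VERSION_PATCH macros do. The type and function names here are hypothetical and avoid a dependency on the Vulkan headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for a decoded version; not a Vulkan type. */
typedef struct {
    uint32_t major;
    uint32_t minor;
    uint32_t patch;
} api_version;

/* Pack a version the way VK_MAKE_VERSION does in Vulkan 1.1. */
static uint32_t pack_api_version(uint32_t major, uint32_t minor, uint32_t patch)
{
    /* Each field must fit in its bit range: 10, 10 and 12 bits. */
    assert(major < 1024u && minor < 1024u && patch < 4096u);
    return (major << 22) | (minor << 12) | patch;
}

/* Split a packed apiVersion value back into its three fields. */
static api_version unpack_api_version(uint32_t packed)
{
    api_version v;
    v.major = packed >> 22;
    v.minor = (packed >> 12) & 0x3ffu;
    v.patch = packed & 0xfffu;
    return v;
}
```

In a real application the packed value would come from vkGetPhysicalDeviceProperties, so the decoded numbers reflect what the selected driver reports rather than the highest version Mesa implements.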

267
docs/relnotes/19.1.8.html Normal file

@@ -0,0 +1,267 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.8 Release Notes / October 21, 2019</h1>
<p>
Mesa 19.1.8 is a bug fix release which fixes bugs found since the 19.1.7 release.
</p>
<p>
Mesa 19.1.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.1.8 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksums</h2>
<pre>
f0fe8289b7d147943bf2fc2147833254881577e8f9ed3d94ddb39e430e711725 mesa-19.1.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111236">Bug 111236</a> - VA-API radeonsi SIGSEGV __memmove_avx_unaligned</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111664">Bug 111664</a> - [Bisected] Segmentation fault on FS shader compilation (mat4x3 * mat4x3)</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/121">Issue #121</a> - Shared Memeory leakage in XCreateDrawable</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/795">Issue #795</a> - Xorg does not render with mesa 19.1.7</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/939">Issue #939</a> - Meson can't find 32-bit libXvMCW in non-standard path</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/944">Issue #944</a> - Mesa doesn't build with current Scons version (3.1.0)</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1838">Issue #1838</a> - Mesa installs gl.pc and egl.pc even with libglvnd &gt;= 1.2.0</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1844">Issue #1844</a> - libXvMC-1.0.12 breaks mesa build</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1869">Issue #1869</a> - X server does not start with Mesa 19.2.0</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1872">Issue #1872</a> - [bisected] piglit spec.arb_texture_view.bug-layers-image causes gpu hangs on IVB</li>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1878">Issue #1878</a> - meson.build:1447:6: ERROR: Problem encountered: libdrm required for gallium video statetrackers when using x11</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>docs: Update bug report URLs for the gitlab migration</li>
</ul>
<p>Alan Coopersmith (5):</p>
<ul>
<li>c99_compat.h: Don't try to use 'restrict' in C++ code</li>
<li>util: Make Solaris implemention of p_atomic_add work with gcc</li>
<li>util: Workaround lack of flock on Solaris</li>
<li>meson: recognize "sunos" as the system name for Solaris</li>
<li>intel/common: include unistd.h for ioctl() prototype on Solaris</li>
</ul>
<p>Andreas Gottschling (1):</p>
<ul>
<li>drisw: Fix shared memory leak on drawable resize</li>
</ul>
<p>Andres Gomez (3):</p>
<ul>
<li>docs: Add the maximum implemented Vulkan API version in 19.1 rel notes</li>
<li>docs/features: Update VK_KHR_display_swapchain status</li>
<li>egl: Remove the 565 pbuffer-only EGL config under X11.</li>
</ul>
<p>Andrii Simiklit (1):</p>
<ul>
<li>glsl: disallow incompatible matrices multiplication</li>
</ul>
<p>Arcady Goldmints-Orlov (1):</p>
<ul>
<li>anv: fix descriptor limits on gen8</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>tu: Set up glsl types.</li>
<li>radv: Add workaround for hang in The Surge 2.</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader</li>
</ul>
<p>Dylan Baker (5):</p>
<ul>
<li>meson: fix logic for generating .pc files with old glvnd</li>
<li>meson: Try finding libxvmcw via pkg-config before using find_library</li>
<li>meson: Link xvmc with libxv</li>
<li>meson: gallium media state trackers require libdrm with x11</li>
<li>meson: Only error building gallium video without libdrm when the platform is drm</li>
</ul>
<p>Eric Engestrom (4):</p>
<ul>
<li>gl: drop incorrect pkg-config file for glvnd</li>
<li>meson: re-add incorrect pkg-config files with GLVND for backward compatibility</li>
<li>util/anon_file: add missing #include</li>
<li>util/anon_file: const string param</li>
</ul>
<p>Erik Faye-Lund (1):</p>
<ul>
<li>glsl: correct bitcast-helpers</li>
</ul>
<p>Greg V (1):</p>
<ul>
<li>util: add anon_file.h for all memfd/temp file usage</li>
</ul>
<p>Haihao Xiang (1):</p>
<ul>
<li>i965: support AYUV/XYUV for external import only</li>
</ul>
<p>Hal Gentz (1):</p>
<ul>
<li>gallium/osmesa: Fix the inability to set no context as current.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir/repair_ssa: Replace the unreachable check with the phi builder</li>
<li>intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates</li>
</ul>
<p>Juan A. Suarez Romero (11):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.7</li>
<li>cherry-ignore: add explicit 19.2 only nominations</li>
<li>cherry-ignore: add explicit 19.3 only nominations</li>
<li>Revert "Revert "intel/fs: Move the scalar-region conversion to the generator.""</li>
<li>cherry-ignore: Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"</li>
<li>bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars</li>
<li>cherry-ignore: nir/opt_large_constants: Handle store writemasks</li>
<li>cherry-ignore: util: added missing headers in anon-file</li>
<li>cherry-ignore: radv: Fix condition for skipping the continue CS.</li>
<li>cherry-ignore: Revert "radv: disable viewport clamping even if FS doesn't write Z"</li>
<li>Update version to 19.1.8</li>
</ul>
<p>Ken Mays (1):</p>
<ul>
<li>haiku: fix Mesa build</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>iris: Initialize ice-&gt;state.prim_mode to an invalid value</li>
<li>intel: Increase Gen11 compute shader scratch IDs to 64.</li>
<li>iris: Disable CCS_E for 32-bit floating point textures.</li>
<li>iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.</li>
</ul>
<p>Lionel Landwerlin (5):</p>
<ul>
<li>anv: gem-stubs: return a valid fd got anv_gem_userptr()</li>
<li>intel: use proper label for Comet Lake skus</li>
<li>mesa: don't forget to clear _Layer field on texture unit</li>
<li>intel: fix subslice computation from topology data</li>
<li>intel/isl: Set null surface format to R32_UINT</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>util: Drop preprocessor guards for glibc-2.12</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: fix VAAPI segfault due to various bugs</li>
</ul>
<p>Michel Zou (2):</p>
<ul>
<li>scons: add py3 support</li>
<li>scons: For MinGW use -posix flag.</li>
</ul>
<p>Paulo Zanoni (1):</p>
<ul>
<li>intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32</li>
</ul>
<p>Prodea Alexandru-Liviu (1):</p>
<ul>
<li>scons/MSYS2-MinGW-W64: Fix build options defaults Signed-off-by: Prodea Alexandru-Liviu &lt;liviuprodea@yahoo.com&gt; Reviewed-by: Jose Fonseca &lt;jfonseca@vmware.com&gt; Cc: &lt;mesa-stable@lists.freedesktop.org&gt;</li>
</ul>
<p>Rhys Perry (2):</p>
<ul>
<li>radv: always emit a position export in gs copy shaders</li>
<li>nir/opt_remove_phis: handle phis with no sources</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>intel/nir: do not apply the fsin and fcos trig workarounds for consts</li>
</ul>
<p>Stephen Barber (1):</p>
<ul>
<li>nouveau: add idep_nir_headers as dep for libnouveau</li>
</ul>
<p>Tapani Pälli (3):</p>
<ul>
<li>iris: close screen fd on iris_destroy_screen</li>
<li>egl: check for NULL value like eglGetSyncAttribKHR does</li>
<li>util: fix os_create_anonymous_file on android</li>
</ul>
<p>pal1000 (2):</p>
<ul>
<li>scons/windows: Support build with LLVM 9.</li>
<li>scons: Fix MSYS2 Mingw-w64 build.</li>
</ul>
</div>
</body>
</html>

View File

@@ -96,7 +96,7 @@
* - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
*/
#ifndef restrict
# if (__STDC_VERSION__ >= 199901L)
# if (__STDC_VERSION__ >= 199901L) && !defined(__cplusplus)
/* C99 */
# elif defined(__GNUC__)
# define restrict __restrict__
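
A minimal sketch of how a guard of this shape behaves in a C translation unit. The final `#else` fallback and the `add_arrays` helper are assumptions made for illustration; the hunk above is truncated and does not show them.

```c
#include <stddef.h>

/* Same guard shape as the hunk above: rely on the C99 keyword only when
 * compiling as C, since C++ has no `restrict` keyword. */
#ifndef restrict
#  if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && !defined(__cplusplus)
     /* C99: `restrict` is a keyword, nothing to define */
#  elif defined(__GNUC__)
#    define restrict __restrict__
#  else
#    define restrict /* assumed fallback: expand to nothing */
#  endif
#endif

/* Hypothetical helper: the qualifier promises that dst and src do not alias. */
static void add_arrays(float *restrict dst, const float *restrict src, size_t n)
{
   for (size_t i = 0; i < n; i++)
      dst[i] += src[i];
}
```

With the extra `!defined(__cplusplus)` test, a C++ compiler that also defines `__STDC_VERSION__` falls through to the `__restrict__` branch instead of leaving `restrict` undefined.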

View File

@@ -191,24 +191,24 @@ CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT1)")
CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)")
CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)")
CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT3)")
CHIPSET(0x9B21, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA0, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA2, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA4, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA5, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA8, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAA, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAB, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAC, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9B41, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC0, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC2, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC4, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC5, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC8, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCA, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCB, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCC, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9B21, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA0, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA2, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA4, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA5, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA8, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAA, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAB, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAC, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9B41, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC0, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC2, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC4, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC5, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC8, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCA, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCB, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCC, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x5A49, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
CHIPSET(0x5A4A, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
CHIPSET(0x5A41, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)")

View File

@@ -42,7 +42,7 @@ pre_args = [
'-D__STDC_FORMAT_MACROS',
'-D__STDC_LIMIT_MACROS',
'-DPACKAGE_VERSION="@0@"'.format(meson.project_version()),
'-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"',
'-DPACKAGE_BUGREPORT="https://gitlab.freedesktop.org/mesa/mesa/issues"',
]
with_vulkan_icd_dir = get_option('vulkan-icd-dir')
@@ -83,7 +83,7 @@ with_shared_glapi = get_option('shared-glapi')
# shared-glapi is required if at least two OpenGL APIs are being built
if not with_shared_glapi
if ((with_gles1 == 'true' and with_gles2 == 'true') or
if ((with_gles1 == 'true' and with_gles2 == 'true') or
(with_gles1 == 'true' and with_opengl) or
(with_gles2 == 'true' and with_opengl))
error('shared-glapi required for building two or more of OpenGL, OpenGL ES 1.x, OpenGL ES 2.x')
@@ -107,7 +107,7 @@ with_any_opengl = with_opengl or with_gles1 or with_gles2
# Only build shared_glapi if at least one OpenGL API is enabled
with_shared_glapi = get_option('shared-glapi') and with_any_opengl
system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'dragonfly', 'linux'].contains(host_machine.system())
system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'gnu/kfreebsd', 'dragonfly', 'linux', 'sunos'].contains(host_machine.system())
dri_drivers = get_option('dri-drivers')
if dri_drivers.contains('auto')
@@ -190,6 +190,12 @@ if cc.get_id() == 'intel'
endif
endif
# This message is needed until we bump the meson version to 0.46 because of a known issue in 0.45.0 and 0.45.1
#https://bugs.freedesktop.org/show_bug.cgi?id=109791
if meson.version().version_compare('< 0.46.0')
warning('''Meson < 0.46 doesn't automatically define `NDEBUG`; please update meson to at least 0.46.''')
endif
with_gallium = gallium_drivers.length() != 0 and gallium_drivers != ['']
if with_gallium and system_has_kms_drm
@@ -244,6 +250,7 @@ endif
if host_machine.system() == 'darwin'
with_dri_platform = 'apple'
pre_args += '-DBUILDING_MESA'
elif ['windows', 'cygwin'].contains(host_machine.system())
with_dri_platform = 'windows'
elif system_has_kms_drm
@@ -312,7 +319,7 @@ if with_glx == 'dri'
endif
endif
if not (with_dri or with_gallium or with_glx == 'xlib' or with_glx == 'gallium-xlib')
if not (with_dri or with_gallium or with_glx != 'disabled')
with_gles1 = false
with_gles2 = false
with_opengl = false
@@ -353,12 +360,12 @@ else
with_egl = false
endif
if with_egl and not (with_platform_drm or with_platform_surfaceless)
if with_egl and not (with_platform_drm or with_platform_surfaceless or with_platform_android)
if with_gallium_radeonsi
error('RadeonSI requires drm or surfaceless platform when using EGL')
error('RadeonSI requires the drm, surfaceless or android platform when using EGL')
endif
if with_gallium_virgl
error('Virgl requires drm or surfaceless platform when using EGL')
error('Virgl requires the drm, surfaceless or android platform when using EGL')
endif
endif
@@ -366,7 +373,7 @@ pre_args += '-DGLX_USE_TLS'
if with_glx != 'disabled'
if not (with_platform_x11 and with_any_opengl)
error('Cannot build GLX support without X11 platform support and at least one OpenGL API')
elif with_glx == 'gallium-xlib'
elif with_glx == 'gallium-xlib'
if not with_gallium
error('Gallium-xlib based GLX requires at least one gallium driver')
elif not with_gallium_softpipe
@@ -374,14 +381,12 @@ if with_glx != 'disabled'
elif with_dri
error('gallium-xlib conflicts with any dri driver')
endif
elif with_glx == 'xlib'
elif with_glx == 'xlib'
if with_dri
error('xlib conflicts with any dri driver')
endif
elif with_glx == 'dri'
if not with_dri
error('dri based GLX requires at least one DRI driver')
elif not with_shared_glapi
if not with_shared_glapi
error('dri based GLX requires shared-glapi')
endif
endif
@@ -485,10 +490,12 @@ elif not (with_gallium_r600 or with_gallium_nouveau)
endif
endif
dep_xvmc = null_dep
dep_xv = null_dep
with_gallium_xvmc = false
if _xvmc != 'false'
dep_xvmc = dependency('xvmc', version : '>= 1.0.6', required : _xvmc == 'true')
with_gallium_xvmc = dep_xvmc.found()
dep_xv = dependency('xv', required : _xvmc == 'true')
with_gallium_xvmc = dep_xvmc.found() and dep_xv.found()
endif
xvmc_drivers_path = get_option('xvmc-libs-path')
@@ -754,7 +761,11 @@ if with_platform_haiku
pre_args += '-DHAVE_HAIKU_PLATFORM'
endif
prog_python = import('python3').find_python()
if meson.version().version_compare('>=0.50')
prog_python = import('python').find_installation('python3')
else
prog_python = import('python3').find_python()
endif
has_mako = run_command(
prog_python, '-c',
'''
@@ -836,8 +847,10 @@ if cc.compiles('int foo(void) __attribute__((__noreturn__));',
endif
# TODO: this is very incomplete
if ['linux', 'cygwin', 'gnu'].contains(host_machine.system())
if ['linux', 'cygwin', 'gnu', 'gnu/kfreebsd'].contains(host_machine.system())
pre_args += '-D_GNU_SOURCE'
elif host_machine.system() == 'sunos'
pre_args += '-D__EXTENSIONS__'
endif
# Check for generic C arguments
@@ -1040,18 +1053,25 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
pre_args += '-DMAJOR_IN_MKDEV'
endif
foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h', 'dlfcn.h']
foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h', 'dlfcn.h', 'execinfo.h']
if cc.compiles('#include <@0@>'.format(h), name : '@0@'.format(h))
pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
endif
endforeach
foreach f : ['strtof', 'mkostemp', 'posix_memalign', 'timespec_get', 'memfd_create']
foreach f : ['strtof', 'mkostemp', 'posix_memalign', 'timespec_get', 'memfd_create', 'flock']
if cc.has_function(f)
pre_args += '-DHAVE_@0@'.format(f.to_upper())
endif
endforeach
if cc.has_header_symbol('errno.h', 'program_invocation_name',
args : '-D_GNU_SOURCE')
pre_args += '-DHAVE_PROGRAM_INVOCATION_NAME'
elif with_tools.contains('intel')
error('Intel tools require the program_invocation_name variable')
endif
# strtod locale support
if cc.links('''
#define _GNU_SOURCE
@@ -1163,7 +1183,7 @@ _drm_radeon_ver = '2.4.71'
_drm_nouveau_ver = '2.4.66'
_drm_etnaviv_ver = '2.4.89'
_drm_intel_ver = '2.4.75'
_drm_ver = '2.4.75'
_drm_ver = '2.4.81'
_libdrm_checks = [
['intel', with_dri_i915 or with_gallium_i915],
@@ -1258,6 +1278,7 @@ if _llvm != 'false'
with_gallium_opencl or _llvm == 'true'
),
static : not _shared_llvm,
method : 'config-tool',
)
with_llvm = dep_llvm.found()
endif
@@ -1296,8 +1317,13 @@ else
endif
dep_glvnd = null_dep
glvnd_missing_pc_files = false
if with_glvnd
dep_glvnd = dependency('libglvnd', version : '>= 0.2.0')
# GLVND until commit 0dfaea2bcb7cdcc785f9 ("Add pkg-config files for EGL, GL,
# GLES, and GLX.") was missing its pkg-config files, forcing every vendor to
# provide them and the distro maintainers to resolve the conflict.
glvnd_missing_pc_files = dep_glvnd.version().version_compare('< 1.2.0')
pre_args += '-DUSE_LIBGLVND=1'
endif
@@ -1411,6 +1437,9 @@ if with_platform_x11
with_gallium_omx != 'disabled'))
dep_xcb = dependency('xcb')
dep_x11_xcb = dependency('x11-xcb')
if with_dri_platform == 'drm' and not dep_libdrm.found()
error('libdrm required for gallium video statetrackers when using x11')
endif
endif
if with_any_vk or with_egl or (with_glx == 'dri' and with_dri_platform == 'drm')
dep_xcb_dri2 = dependency('xcb-dri2', version : '>= 1.8')
@@ -1431,7 +1460,7 @@ if with_platform_x11
if with_glx == 'dri' or with_glx == 'gallium-xlib'
dep_glproto = dependency('glproto', version : '>= 1.4.14')
endif
if with_glx == 'dri'
if with_glx == 'dri'
if with_dri_platform == 'drm'
dep_dri2proto = dependency('dri2proto', version : '>= 2.8')
dep_xxf86vm = dependency('xxf86vm')

View File

@@ -128,9 +128,9 @@ def generate(env):
if not path:
path = []
if SCons.Util.is_String(path):
path = string.split(path, os.pathsep)
path = str.split(path, os.pathsep)
env['ENV']['PATH'] = string.join([dir] + path, os.pathsep)
env['ENV']['PATH'] = str.join(os.pathsep, [dir] + path)
# Most of mingw is the same as gcc and friends...
gnu_tools = ['gcc', 'g++', 'gnulink', 'ar', 'gas']

View File

@@ -262,8 +262,12 @@ def parse_source_list(env, filename, names=None):
sym_table = parser.parse(src.abspath)
if names:
if isinstance(names, basestring):
names = [names]
if sys.version_info[0] >= 3:
if isinstance(names, str):
names = [names]
else:
if isinstance(names, basestring):
names = [names]
symbols = names
else:

View File

@@ -132,7 +132,7 @@ def check_cc(env, cc, expr, cpp_opt = '-E'):
sys.stdout.write('Checking for %s ... ' % cc)
source = tempfile.NamedTemporaryFile(suffix='.c', delete=False)
source.write('#if !(%s)\n#error\n#endif\n' % expr)
source.write(('#if !(%s)\n#error\n#endif\n' % expr).encode())
source.close()
# sys.stderr.write('%r %s %s\n' % (env['CC'], cpp_opt, source.name));
@@ -237,6 +237,9 @@ def generate(env):
hosthost_platform = host_platform.system().lower()
if hosthost_platform.startswith('cygwin'):
hosthost_platform = 'cygwin'
# Avoid spurious crosscompilation in MSYS2 environment.
if hosthost_platform.startswith('mingw'):
hosthost_platform = 'windows'
host_machine = os.environ.get('PROCESSOR_ARCHITEW6432', os.environ.get('PROCESSOR_ARCHITECTURE', host_platform.machine()))
host_machine = {
'x86': 'x86',
@@ -352,6 +355,7 @@ def generate(env):
'_DARWIN_C_SOURCE',
'GLX_USE_APPLEGL',
'GLX_DIRECT_RENDERING',
'BUILDING_MESA',
]
else:
cppdefines += [

View File

@@ -30,6 +30,7 @@ Tool-specific initialization for LLVM
import os
import os.path
import re
import platform as host_platform
import sys
import distutils.version
@@ -100,8 +101,36 @@ def generate(env):
env.Prepend(CPPPATH = [os.path.join(llvm_dir, 'include')])
env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
# LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter irreader`
if llvm_version >= distutils.version.LooseVersion('5.0'):
# LLVM 5.0 and newer requires MinGW w/ pthreads due to use of std::thread and friends.
if llvm_version >= distutils.version.LooseVersion('5.0') and env['crosscompile']:
assert env['gcc']
env.AppendUnique(CXXFLAGS = ['-posix'])
# LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter irreader` for LLVM<=7.0
# and `llvm-config --libs engine irreader` for LLVM>=8.0
# LLVMAggressiveInstCombine library part of engine component can be safely omitted as it's not used.
if llvm_version >= distutils.version.LooseVersion('9.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
'LLVMDebugInfoCodeView', 'LLVMCodeGen',
'LLVMScalarOpts', 'LLVMInstCombine',
'LLVMTransformUtils',
'LLVMBitWriter', 'LLVMX86Desc',
'LLVMMCDisassembler', 'LLVMX86Info',
'LLVMX86Utils',
'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
'LLVMAnalysis', 'LLVMProfileData',
'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
'LLVMBitReader', 'LLVMMC', 'LLVMCore',
'LLVMSupport',
'LLVMIRReader', 'LLVMAsmParser',
'LLVMDemangle', 'LLVMGlobalISel', 'LLVMDebugInfoMSF',
'LLVMBinaryFormat',
'LLVMRemarks', 'LLVMBitstreamReader', 'LLVMDebugInfoDWARF',
])
elif llvm_version >= distutils.version.LooseVersion('5.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
@@ -120,10 +149,6 @@ def generate(env):
'LLVMDemangle', 'LLVMGlobalISel', 'LLVMDebugInfoMSF',
'LLVMBinaryFormat',
])
if env['platform'] == 'windows' and env['crosscompile']:
# LLVM 5.0 requires MinGW w/ pthreads due to use of std::thread and friends.
assert env['gcc']
env['CXX'] = env['CXX'] + '-posix'
elif llvm_version >= distutils.version.LooseVersion('4.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
@@ -217,6 +242,12 @@ def generate(env):
'uuid',
])
# Mingw-w64 zlib is required when building with LLVM support in MSYS2 environment
if host_platform.system().lower().startswith('mingw'):
env.Append(LIBS = [
'z',
])
if env['msvc']:
# Some of the LLVM C headers use the inline keyword without
# defining it.

View File

@@ -55,6 +55,7 @@ LOCAL_C_INCLUDES := \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
$(MESA_TOP)/src/gallium/include \
$(MESA_TOP)/src/gallium/auxiliary \
$(MESA_TOP)/src/mesa \
$(intermediates)/common
LOCAL_EXPORT_C_INCLUDE_DIRS := \

View File

@@ -3438,6 +3438,8 @@ ac_build_readlane(struct ac_llvm_context *ctx, LLVMValueRef src, LLVMValueRef la
LLVMConstInt(ctx->i32, i, 0), "");
}
}
if (LLVMGetTypeKind(src_type) == LLVMPointerTypeKind)
return LLVMBuildIntToPtr(ctx->builder, ret, src_type, "");
return LLVMBuildBitCast(ctx->builder, ret, src_type, "");
}
@@ -4016,7 +4018,7 @@ ac_build_wg_scan_bottom(struct ac_llvm_context *ctx, struct ac_wg_scan *ws)
/* ws->result_reduce is already the correct value */
if (ws->enable_inclusive)
ws->result_inclusive = ac_build_alu_op(ctx, ws->result_exclusive, ws->src, ws->op);
ws->result_inclusive = ac_build_alu_op(ctx, ws->result_inclusive, ws->src, ws->op);
if (ws->enable_exclusive)
ws->result_exclusive = ac_build_alu_op(ctx, ws->result_exclusive, ws->extra, ws->op);
}

View File

@@ -151,13 +151,14 @@ static LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family,
LLVMTargetRef target = ac_get_llvm_target(triple);
snprintf(features, sizeof(features),
"+DumpCode,-fp32-denormals,+fp64-denormals%s%s%s%s%s",
"+DumpCode,-fp32-denormals,+fp64-denormals%s%s%s%s%s%s",
HAVE_LLVM >= 0x0800 ? "" : ",+vgpr-spilling",
tm_options & AC_TM_SISCHED ? ",+si-scheduler" : "",
tm_options & AC_TM_FORCE_ENABLE_XNACK ? ",+xnack" : "",
tm_options & AC_TM_FORCE_DISABLE_XNACK ? ",-xnack" : "",
tm_options & AC_TM_PROMOTE_ALLOCA_TO_SCRATCH ? ",-promote-alloca" : "");
tm_options & AC_TM_PROMOTE_ALLOCA_TO_SCRATCH ? ",-promote-alloca" : "",
tm_options & AC_TM_NO_LOAD_STORE_OPT ? ",-load-store-opt" : "");
LLVMTargetMachineRef tm = LLVMCreateTargetMachine(
target,
triple,

View File

@@ -65,6 +65,7 @@ enum ac_target_machine_options {
AC_TM_CHECK_IR = (1 << 5),
AC_TM_ENABLE_GLOBAL_ISEL = (1 << 6),
AC_TM_CREATE_LOW_OPT = (1 << 7),
AC_TM_NO_LOAD_STORE_OPT = (1 << 8),
};
enum ac_float_mode {

View File

@@ -38,6 +38,7 @@ struct ac_nir_context {
struct ac_shader_abi *abi;
gl_shader_stage stage;
shader_info *info;
LLVMValueRef *ssa_defs;
@@ -1395,6 +1396,22 @@ static LLVMValueRef build_tex_intrinsic(struct ac_nir_context *ctx,
}
args->attributes = AC_FUNC_ATTR_READNONE;
bool cs_derivs = ctx->stage == MESA_SHADER_COMPUTE &&
ctx->info->cs.derivative_group != DERIVATIVE_GROUP_NONE;
if (ctx->stage == MESA_SHADER_FRAGMENT || cs_derivs) {
/* Prevent texture instructions with implicit derivatives from being
* sunk into branches. */
switch (instr->op) {
case nir_texop_tex:
case nir_texop_txb:
case nir_texop_lod:
args->attributes |= AC_FUNC_ATTR_CONVERGENT;
break;
default:
break;
}
}
return ac_build_image_opcode(&ctx->ac, args);
}
@@ -3730,7 +3747,7 @@ static void visit_tex(struct ac_nir_context *ctx, nir_tex_instr *instr)
goto write_result;
}
if (args.offset && instr->op != nir_texop_txf) {
if (args.offset && instr->op != nir_texop_txf && instr->op != nir_texop_txf_ms) {
LLVMValueRef offset[3], pack;
for (unsigned chan = 0; chan < 3; ++chan)
offset[chan] = ctx->ac.i32_0;
@@ -3864,7 +3881,7 @@ static void visit_tex(struct ac_nir_context *ctx, nir_tex_instr *instr)
args.coords[sample_chan], fmask_ptr);
}
if (args.offset && instr->op == nir_texop_txf) {
if (args.offset && (instr->op == nir_texop_txf || instr->op == nir_texop_txf_ms)) {
int num_offsets = instr->src[offset_src].src.ssa->num_components;
num_offsets = MIN2(num_offsets, instr->coord_components);
for (unsigned i = 0; i < num_offsets; ++i) {
@@ -4351,6 +4368,7 @@ void ac_nir_translate(struct ac_llvm_context *ac, struct ac_shader_abi *abi,
ctx.abi = abi;
ctx.stage = nir->info.stage;
ctx.info = &nir->info;
ctx.main_function = LLVMGetBasicBlockParent(LLVMGetInsertBlock(ctx.ac.builder));

View File

@@ -71,7 +71,8 @@ LOCAL_C_INCLUDES := \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_amd_common,,) \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_radv_common,,) \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_util,,)/util
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_util,,)/util \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_util,,)
LOCAL_WHOLE_STATIC_LIBRARIES := \
libmesa_vulkan_util \
@@ -165,5 +166,14 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
LOCAL_SHARED_LIBRARIES += $(RADV_SHARED_LIBRARIES) libz libsync liblog
# If Android version >=8, Mesa should link libexpat statically; otherwise it should link it dynamically
ifeq ($(shell test $(PLATFORM_SDK_VERSION) -ge 27; echo $$?), 0)
LOCAL_STATIC_LIBRARIES := \
libexpat
else
LOCAL_SHARED_LIBRARIES += \
libexpat
endif
include $(MESA_COMMON_MK)
include $(BUILD_SHARED_LIBRARY)

View File

@@ -129,21 +129,27 @@ if with_xlib_lease
radv_flags += '-DVK_USE_PLATFORM_XLIB_XRANDR_EXT'
endif
if with_platform_android
radv_flags += [
'-DVK_USE_PLATFORM_ANDROID_KHR'
]
libradv_files += files('radv_android.c')
endif
libvulkan_radeon = shared_library(
'vulkan_radeon',
[libradv_files, radv_entrypoints, radv_extensions_c, amd_vk_format_table_c, sha1_h, xmlpool_options_h],
include_directories : [
inc_common, inc_amd, inc_amd_common, inc_compiler, inc_util, inc_vulkan_util,
inc_vulkan_wsi,
inc_common, inc_amd, inc_amd_common, inc_compiler, inc_util, inc_vulkan_wsi,
],
link_with : [
libamd_common, libamdgpu_addrlib, libvulkan_util, libvulkan_wsi,
libamd_common, libamdgpu_addrlib, libvulkan_wsi,
libmesa_util, libxmlconfig
],
dependencies : [
dep_llvm, dep_libdrm_amdgpu, dep_thread, dep_elf, dep_dl, dep_m,
dep_valgrind, radv_deps,
idep_nir,
idep_nir, idep_vulkan_util,
],
c_args : [c_vis_args, no_override_init_args, radv_flags],
cpp_args : [cpp_vis_args, radv_flags],

View File

@@ -301,7 +301,6 @@ radv_cmd_buffer_destroy(struct radv_cmd_buffer *cmd_buffer)
static VkResult
radv_reset_cmd_buffer(struct radv_cmd_buffer *cmd_buffer)
{
cmd_buffer->device->ws->cs_reset(cmd_buffer->cs);
list_for_each_entry_safe(struct radv_cmd_buffer_upload, up,
@@ -326,6 +325,8 @@ radv_reset_cmd_buffer(struct radv_cmd_buffer *cmd_buffer)
cmd_buffer->record_result = VK_SUCCESS;
memset(cmd_buffer->vertex_bindings, 0, sizeof(cmd_buffer->vertex_bindings));
for (unsigned i = 0; i < VK_PIPELINE_BIND_POINT_RANGE_SIZE; i++) {
cmd_buffer->descriptors[i].dirty = 0;
cmd_buffer->descriptors[i].valid = 0;
@@ -565,8 +566,8 @@ radv_save_descriptors(struct radv_cmd_buffer *cmd_buffer,
for_each_bit(i, descriptors_state->valid) {
struct radv_descriptor_set *set = descriptors_state->sets[i];
data[i * 2] = (uintptr_t)set;
data[i * 2 + 1] = (uintptr_t)set >> 32;
data[i * 2] = (uint64_t)(uintptr_t)set;
data[i * 2 + 1] = (uint64_t)(uintptr_t)set >> 32;
}
radv_emit_write_data_packet(cmd_buffer, va, MAX_SETS * 2, data);
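
The cast added in this hunk can be sketched in isolation. This assumes the destination holds 32-bit words, which the hunk does not show; `split_pointer` is a hypothetical helper, not a function from the driver.

```c
#include <stdint.h>

/* Hypothetical helper: store a pointer as two 32-bit words, low word first. */
static void split_pointer(const void *ptr, uint32_t out[2])
{
   /* Widen to 64 bits before shifting. On a 32-bit target uintptr_t is
    * 32 bits wide, and shifting a 32-bit value by 32 is undefined. */
   uint64_t value = (uint64_t)(uintptr_t)ptr;

   out[0] = (uint32_t)value;         /* low 32 bits */
   out[1] = (uint32_t)(value >> 32); /* high 32 bits, zero on 32-bit targets */
}
```

On a 64-bit build both forms give the same result; the widening cast only matters where `uintptr_t` is narrower than 64 bits.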
@@ -4663,6 +4664,9 @@ static void radv_handle_image_transition(struct radv_cmd_buffer *cmd_buffer,
assert(src_family == cmd_buffer->queue_family_index ||
dst_family == cmd_buffer->queue_family_index);
if (src_family == VK_QUEUE_FAMILY_EXTERNAL)
return;
if (cmd_buffer->queue_family_index == RADV_QUEUE_TRANSFER)
return;
@@ -4824,7 +4828,7 @@ static void write_event(struct radv_cmd_buffer *cmd_buffer,
radv_cs_add_buffer(cmd_buffer->device->ws, cs, event->bo);
MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cs, 18);
MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cs, 21);
/* Flags that only require a top-of-pipe event. */
VkPipelineStageFlags top_of_pipe_flags =

View File

@@ -51,6 +51,7 @@ enum {
RADV_DEBUG_CHECKIR = 0x200000,
RADV_DEBUG_NOTHREADLLVM = 0x400000,
RADV_DEBUG_NOBINNING = 0x800000,
RADV_DEBUG_NO_LOAD_STORE_OPT = 0x1000000,
};
enum {

View File

@@ -200,7 +200,7 @@ VkResult radv_CreateDescriptorSetLayout(
break;
case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
/* main descriptor + fmask descriptor + sampler */
set_layout->binding[b].size = 32 + 32 * max_sampled_image_descriptors;
set_layout->binding[b].size = 96;
binding_buffer_count = 1;
alignment = 32;
break;
@@ -247,7 +247,8 @@ VkResult radv_CreateDescriptorSetLayout(
/* Don't reserve space for the samplers if they're not accessed. */
if (set_layout->binding[b].immutable_samplers_equal) {
if (binding->descriptorType == VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER)
if (binding->descriptorType == VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER &&
max_sampled_image_descriptors <= 2)
set_layout->binding[b].size -= 32;
else if (binding->descriptorType == VK_DESCRIPTOR_TYPE_SAMPLER)
set_layout->binding[b].size -= 16;
@@ -476,8 +477,17 @@ radv_descriptor_set_create(struct radv_device *device,
struct radv_descriptor_set **out_set)
{
struct radv_descriptor_set *set;
uint32_t buffer_count = layout->buffer_count;
if (variable_count) {
unsigned stride = 1;
if (layout->binding[layout->binding_count - 1].type == VK_DESCRIPTOR_TYPE_SAMPLER ||
layout->binding[layout->binding_count - 1].type == VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT)
stride = 0;
buffer_count = layout->binding[layout->binding_count - 1].buffer_offset +
*variable_count * stride;
}
unsigned range_offset = sizeof(struct radv_descriptor_set) +
sizeof(struct radeon_winsys_bo *) * layout->buffer_count;
sizeof(struct radeon_winsys_bo *) * buffer_count;
unsigned mem_size = range_offset +
sizeof(struct radv_descriptor_range) * layout->dynamic_offset_count;
@@ -502,7 +512,17 @@ radv_descriptor_set_create(struct radv_device *device,
}
set->layout = layout;
uint32_t layout_size = align_u32(layout->size, 32);
uint32_t layout_size = layout->size;
if (variable_count) {
assert(layout->has_variable_descriptors);
uint32_t stride = layout->binding[layout->binding_count - 1].size;
if (layout->binding[layout->binding_count - 1].type == VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT)
stride = 1;
layout_size = layout->binding[layout->binding_count - 1].offset +
*variable_count * stride;
}
layout_size = align_u32(layout_size, 32);
if (layout_size) {
set->size = layout_size;
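
A rough sketch of the size computation this hunk introduces, under the assumption that only the last binding can have a variable descriptor count. `align_pot` and `set_size_bytes` are hypothetical names; the parameters stand in for the layout fields used above.

```c
#include <stddef.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: bytes needed for one descriptor set.
 * variable_count is NULL when the set uses the full layout size. */
static uint32_t set_size_bytes(uint32_t full_layout_size,
                               uint32_t last_binding_offset,
                               uint32_t last_binding_stride,
                               const uint32_t *variable_count)
{
   uint32_t size = full_layout_size;

   /* With a variable count, the set ends right after the descriptors
    * that were actually requested for the last binding. */
   if (variable_count)
      size = last_binding_offset + *variable_count * last_binding_stride;

   return align_pot(size, 32);
}
```

For example, a last binding at offset 64 with a 32-byte stride and three requested descriptors needs 160 bytes, regardless of the maximum declared in the layout.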
@@ -776,9 +796,13 @@ VkResult radv_AllocateDescriptorSets(
pDescriptorSets[i] = radv_descriptor_set_to_handle(set);
}
if (result != VK_SUCCESS)
if (result != VK_SUCCESS) {
radv_FreeDescriptorSets(_device, pAllocateInfo->descriptorPool,
i, pDescriptorSets);
for (i = 0; i < pAllocateInfo->descriptorSetCount; i++) {
pDescriptorSets[i] = VK_NULL_HANDLE;
}
}
return result;
}
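
The failure path above follows a common pattern: release whatever was created, then clear every output handle so the caller never sees a stale value. A minimal sketch with plain heap allocations; `alloc_all` and its `fail_at` failure-injection parameter are hypothetical and exist only to make the pattern testable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: fill out[0..count) with allocations.
 * fail_at simulates an allocation failure at that index. */
static int alloc_all(void **out, size_t count, size_t fail_at)
{
   size_t i;
   int ok = 1;

   for (i = 0; i < count; i++) {
      out[i] = (i == fail_at) ? NULL : malloc(16);
      if (!out[i]) {
         ok = 0;
         break;
      }
   }

   if (!ok) {
      /* Free the i objects created before the failure... */
      for (size_t j = 0; j < i; j++)
         free(out[j]);
      /* ...and reset every output slot, including untouched ones. */
      for (size_t j = 0; j < count; j++)
         out[j] = NULL;
   }
   return ok;
}
```

Clearing all `count` slots, not just the first `i`, matters because the slots past the failure point were never written and would otherwise hold whatever the caller passed in.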

View File

@@ -104,7 +104,7 @@ radv_immutable_samplers(const struct radv_descriptor_set_layout *set,
static inline unsigned
radv_combined_image_descriptor_sampler_offset(const struct radv_descriptor_set_binding_layout *binding)
{
return binding->size - ((!binding->immutable_samplers_equal) ? 32 : 0);
return binding->size - ((!binding->immutable_samplers_equal) ? 16 : 0);
}
static inline const struct radv_sampler_ycbcr_conversion *

View File

@@ -464,6 +464,7 @@ static const struct debug_control radv_debug_options[] = {
{"checkir", RADV_DEBUG_CHECKIR},
{"nothreadllvm", RADV_DEBUG_NOTHREADLLVM},
{"nobinning", RADV_DEBUG_NOBINNING},
{"noloadstoreopt", RADV_DEBUG_NO_LOAD_STORE_OPT},
{NULL, 0}
};
@@ -510,6 +511,21 @@ radv_handle_per_app_options(struct radv_instance *instance,
} else if (!strcmp(name, "DOOM_VFR")) {
/* Work around a Doom VFR game bug */
instance->debug_flags |= RADV_DEBUG_NO_DYNAMIC_BOUNDS;
} else if (!strcmp(name, "MonsterHunterWorld.exe")) {
/* Workaround for a WaW hazard when LLVM moves/merges
* load/store memory operations.
* See https://reviews.llvm.org/D61313
*/
if (HAVE_LLVM < 0x900)
instance->debug_flags |= RADV_DEBUG_NO_LOAD_STORE_OPT;
} else if (!strcmp(name, "Fledge")) {
/*
* Zero VRAM for "The Surge 2"
*
* This avoids a hang when rendering any level. Likely
* uninitialized data in an indirect draw.
*/
instance->debug_flags |= RADV_DEBUG_ZERO_VRAM;
}
}
@@ -524,8 +540,9 @@ static int radv_get_instance_extension_index(const char *name)
static const char radv_dri_options_xml[] =
DRI_CONF_BEGIN
DRI_CONF_SECTION_QUALITY
DRI_CONF_SECTION_PERFORMANCE
DRI_CONF_ADAPTIVE_SYNC("true")
DRI_CONF_VK_X11_OVERRIDE_MIN_IMAGE_COUNT(0)
DRI_CONF_SECTION_END
DRI_CONF_END;
@@ -1477,40 +1494,46 @@ radv_get_memory_budget_properties(VkPhysicalDevice physicalDevice,
* Note that the application heap usages are not really accurate (eg.
* in presence of shared buffers).
*/
if (vram_size) {
heap_usage = device->ws->query_value(device->ws,
RADEON_ALLOCATED_VRAM);
for (int i = 0; i < device->memory_properties.memoryTypeCount; i++) {
uint32_t heap_index = device->memory_properties.memoryTypes[i].heapIndex;
heap_budget = vram_size -
device->ws->query_value(device->ws, RADEON_VRAM_USAGE) +
heap_usage;
switch (device->mem_type_indices[i]) {
case RADV_MEM_TYPE_VRAM:
heap_usage = device->ws->query_value(device->ws,
RADEON_ALLOCATED_VRAM);
memoryBudget->heapBudget[RADV_MEM_HEAP_VRAM] = heap_budget;
memoryBudget->heapUsage[RADV_MEM_HEAP_VRAM] = heap_usage;
}
heap_budget = vram_size -
device->ws->query_value(device->ws, RADEON_VRAM_USAGE) +
heap_usage;
if (visible_vram_size) {
heap_usage = device->ws->query_value(device->ws,
RADEON_ALLOCATED_VRAM_VIS);
memoryBudget->heapBudget[heap_index] = heap_budget;
memoryBudget->heapUsage[heap_index] = heap_usage;
break;
case RADV_MEM_TYPE_VRAM_CPU_ACCESS:
heap_usage = device->ws->query_value(device->ws,
RADEON_ALLOCATED_VRAM_VIS);
heap_budget = visible_vram_size -
device->ws->query_value(device->ws, RADEON_VRAM_VIS_USAGE) +
heap_usage;
heap_budget = visible_vram_size -
device->ws->query_value(device->ws, RADEON_VRAM_VIS_USAGE) +
heap_usage;
memoryBudget->heapBudget[RADV_MEM_HEAP_VRAM_CPU_ACCESS] = heap_budget;
memoryBudget->heapUsage[RADV_MEM_HEAP_VRAM_CPU_ACCESS] = heap_usage;
}
memoryBudget->heapBudget[heap_index] = heap_budget;
memoryBudget->heapUsage[heap_index] = heap_usage;
break;
case RADV_MEM_TYPE_GTT_WRITE_COMBINE:
heap_usage = device->ws->query_value(device->ws,
RADEON_ALLOCATED_GTT);
if (gtt_size) {
heap_usage = device->ws->query_value(device->ws,
RADEON_ALLOCATED_GTT);
heap_budget = gtt_size -
device->ws->query_value(device->ws, RADEON_GTT_USAGE) +
heap_usage;
heap_budget = gtt_size -
device->ws->query_value(device->ws, RADEON_GTT_USAGE) +
heap_usage;
memoryBudget->heapBudget[RADV_MEM_HEAP_GTT] = heap_budget;
memoryBudget->heapUsage[RADV_MEM_HEAP_GTT] = heap_usage;
memoryBudget->heapBudget[heap_index] = heap_budget;
memoryBudget->heapUsage[heap_index] = heap_usage;
break;
default:
break;
}
}
/* The heapBudget and heapUsage values must be zero for array elements
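
The budget arithmetic used in each case of the switch above can be written as a single expression. This is only a sketch; `heap_budget` and its parameter names are hypothetical and stand in for the winsys query results.

```c
#include <stdint.h>

/* Hypothetical helper: budget = heap size, minus what every process is
 * using, plus what this process has allocated itself. */
static uint64_t heap_budget(uint64_t heap_size, uint64_t global_usage,
                            uint64_t own_usage)
{
   return heap_size - global_usage + own_usage;
}
```

With an 8192 MiB heap, 3072 MiB in use system-wide and 1024 MiB of that owned by the application, the reported budget is 6144 MiB: the application's own allocations count as available to it.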

View File

@@ -127,8 +127,8 @@ EXTENSIONS = [
Extension('VK_EXT_ycbcr_image_arrays', 1, True),
Extension('VK_AMD_draw_indirect_count', 1, True),
Extension('VK_AMD_gcn_shader', 1, True),
Extension('VK_AMD_gpu_shader_half_float', 1, 'device->rad_info.chip_class >= VI && HAVE_LLVM >= 0x0800'),
Extension('VK_AMD_gpu_shader_int16', 1, 'device->rad_info.chip_class >= VI'),
Extension('VK_AMD_gpu_shader_half_float', 1, 'device->rad_info.chip_class >= GFX9 && HAVE_LLVM >= 0x0800'),
Extension('VK_AMD_gpu_shader_int16', 1, 'device->rad_info.chip_class >= GFX9'),
Extension('VK_AMD_rasterization_order', 1, 'device->has_out_of_order_rast'),
Extension('VK_AMD_shader_core_properties', 1, True),
Extension('VK_AMD_shader_info', 1, True),

View File

@@ -547,7 +547,7 @@ static bool radv_is_storage_image_format_supported(struct radv_physical_device *
}
}
static bool radv_is_buffer_format_supported(VkFormat format, bool *scaled)
bool radv_is_buffer_format_supported(VkFormat format, bool *scaled)
{
const struct vk_format_description *desc = vk_format_description(format);
unsigned data_format, num_format;
@@ -559,7 +559,8 @@ static bool radv_is_buffer_format_supported(VkFormat format, bool *scaled)
num_format = radv_translate_buffer_numformat(desc,
vk_format_get_first_non_void_channel(format));
*scaled = (num_format == V_008F0C_BUF_NUM_FORMAT_SSCALED) || (num_format == V_008F0C_BUF_NUM_FORMAT_USCALED);
if (scaled)
*scaled = (num_format == V_008F0C_BUF_NUM_FORMAT_SSCALED) || (num_format == V_008F0C_BUF_NUM_FORMAT_USCALED);
return data_format != V_008F0C_BUF_DATA_FORMAT_INVALID &&
num_format != ~0;
}
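
Once the function is no longer `static`, callers that do not care about scaled formats can pass NULL, which is why the write is now guarded. A self-contained sketch of that optional out-parameter pattern; the enum and `format_supported` are hypothetical stand-ins, not the driver's types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware number formats. */
enum num_format {
   NUM_FORMAT_UNORM,
   NUM_FORMAT_USCALED,
   NUM_FORMAT_SSCALED,
   NUM_FORMAT_INVALID,
};

/* Returns whether the format is usable; optionally reports whether it is
 * a scaled format. `scaled` may be NULL. */
static bool format_supported(enum num_format fmt, bool *scaled)
{
   if (scaled)
      *scaled = fmt == NUM_FORMAT_USCALED || fmt == NUM_FORMAT_SSCALED;

   return fmt != NUM_FORMAT_INVALID;
}
```

A caller that only needs the yes/no answer writes `format_supported(fmt, NULL)` and skips declaring a dummy variable.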
@@ -635,7 +636,8 @@ radv_physical_device_get_format_properties(struct radv_physical_device *physical
const struct vk_format_description *desc = vk_format_description(format);
bool blendable;
bool scaled = false;
if (!desc) {
/* TODO: implement some software emulation of SUBSAMPLED formats. */
if (!desc || desc->layout == VK_FORMAT_LAYOUT_SUBSAMPLED) {
out_properties->linearTilingFeatures = linear;
out_properties->optimalTilingFeatures = tiled;
out_properties->bufferFeatures = buffer;
@@ -655,6 +657,7 @@ radv_physical_device_get_format_properties(struct radv_physical_device *physical
uint32_t tiling = VK_FORMAT_FEATURE_TRANSFER_SRC_BIT |
VK_FORMAT_FEATURE_TRANSFER_DST_BIT |
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT |
VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT |
VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT;
/* The subsampled formats have no support for linear filters. */

View File

@@ -729,7 +729,8 @@ radv_query_opaque_metadata(struct radv_device *device,
for (i = 0; i <= image->info.levels - 1; i++)
md->metadata[10+i] = image->planes[0].surface.u.legacy.level[i].offset >> 8;
md->size_metadata = (11 + image->info.levels - 1) * 4;
}
} else
md->size_metadata = 10 * 4;
}
void
@@ -860,6 +861,11 @@ radv_image_alloc_cmask(struct radv_device *device,
uint32_t clear_value_size = 0;
radv_image_get_cmask_info(device, image, &image->cmask);
if (!image->cmask.size)
return;
assert(image->cmask.alignment);
image->cmask.offset = align64(image->size, image->cmask.alignment);
/* + 8 for storing the clear values */
if (!image->clear_value_offset) {

View File

@@ -81,7 +81,7 @@ radv_meta_save(struct radv_meta_saved_state *state,
if (state->flags & RADV_META_SAVE_DESCRIPTORS) {
state->old_descriptor_set0 = descriptors_state->sets[0];
if (!state->old_descriptor_set0)
if (!(descriptors_state->valid & 1) || !state->old_descriptor_set0)
state->flags &= ~RADV_META_SAVE_DESCRIPTORS;
}

View File

@@ -650,6 +650,7 @@ static bool depth_view_can_fast_clear(struct radv_cmd_buffer *cmd_buffer,
if (radv_image_has_htile(iview->image) &&
iview->base_mip == 0 &&
iview->base_layer == 0 &&
iview->layer_count == iview->image->info.array_size &&
radv_layout_is_htile_compressed(iview->image, layout, queue_mask) &&
radv_image_extent_compare(iview->image, &iview->extent))
return true;
@@ -1575,6 +1576,9 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer,
emit_color_clear(cmd_buffer, clear_att, clear_rect, view_mask);
}
} else {
if (!subpass->depth_stencil_attachment)
return;
const uint32_t pass_att = subpass->depth_stencil_attachment->attachment;
if (pass_att == VK_ATTACHMENT_UNUSED)
return;

View File

@@ -187,6 +187,24 @@ meta_copy_buffer_to_image(struct radv_cmd_buffer *cmd_buffer,
&pRegions[r].imageSubresource,
pRegions[r].imageSubresource.aspectMask);
if (!radv_is_buffer_format_supported(img_bsurf.format, NULL)) {
uint32_t queue_mask = radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
MAYBE_UNUSED bool compressed = radv_layout_dcc_compressed(image, layout, queue_mask);
if (compressed) {
radv_decompress_dcc(cmd_buffer, image, &(VkImageSubresourceRange) {
.aspectMask = pRegions[r].imageSubresource.aspectMask,
.baseMipLevel = pRegions[r].imageSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = pRegions[r].imageSubresource.baseArrayLayer,
.layerCount = pRegions[r].imageSubresource.layerCount,
});
}
img_bsurf.format = vk_format_for_size(vk_format_get_blocksize(img_bsurf.format));
img_bsurf.current_layout = VK_IMAGE_LAYOUT_GENERAL;
}
struct radv_meta_blit2d_buffer buf_bsurf = {
.bs = img_bsurf.bs,
.format = img_bsurf.format,
@@ -313,6 +331,24 @@ meta_copy_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
&pRegions[r].imageSubresource,
pRegions[r].imageSubresource.aspectMask);
if (!radv_is_buffer_format_supported(img_info.format, NULL)) {
uint32_t queue_mask = radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
MAYBE_UNUSED bool compressed = radv_layout_dcc_compressed(image, layout, queue_mask);
if (compressed) {
radv_decompress_dcc(cmd_buffer, image, &(VkImageSubresourceRange) {
.aspectMask = pRegions[r].imageSubresource.aspectMask,
.baseMipLevel = pRegions[r].imageSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = pRegions[r].imageSubresource.baseArrayLayer,
.layerCount = pRegions[r].imageSubresource.layerCount,
});
}
img_info.format = vk_format_for_size(vk_format_get_blocksize(img_info.format));
img_info.current_layout = VK_IMAGE_LAYOUT_GENERAL;
}
struct radv_meta_blit2d_buffer buf_info = {
.bs = img_info.bs,
.format = img_info.format,

View File

@@ -24,6 +24,7 @@
#include "radv_meta.h"
#include "radv_private.h"
#include "vk_format.h"
static nir_shader *
build_fmask_expand_compute_shader(struct radv_device *device, int samples)
@@ -132,7 +133,7 @@ radv_expand_fmask_image_inplace(struct radv_cmd_buffer *cmd_buffer,
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(image),
.viewType = radv_meta_get_view_type(image),
.format = image->vk_format,
.format = vk_format_no_srgb(image->vk_format),
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = 0,

View File

@@ -156,6 +156,73 @@ convert_ycbcr(struct ycbcr_state *state,
converted_channels[2], nir_imm_float(b, 1.0f));
}
static nir_ssa_def *
get_texture_size(struct ycbcr_state *state, nir_deref_instr *texture)
{
nir_builder *b = state->builder;
const struct glsl_type *type = texture->type;
nir_tex_instr *tex = nir_tex_instr_create(b->shader, 1);
tex->op = nir_texop_txs;
tex->sampler_dim = glsl_get_sampler_dim(type);
tex->is_array = glsl_sampler_type_is_array(type);
tex->is_shadow = glsl_sampler_type_is_shadow(type);
tex->dest_type = nir_type_int;
tex->src[0].src_type = nir_tex_src_texture_deref;
tex->src[0].src = nir_src_for_ssa(&texture->dest.ssa);
nir_ssa_dest_init(&tex->instr, &tex->dest,
nir_tex_instr_dest_size(tex), 32, NULL);
nir_builder_instr_insert(b, &tex->instr);
return nir_i2f32(b, &tex->dest.ssa);
}
static nir_ssa_def *
implicit_downsampled_coord(nir_builder *b,
nir_ssa_def *value,
nir_ssa_def *max_value,
int div_scale)
{
return nir_fadd(b,
value,
nir_fdiv(b,
nir_imm_float(b, 1.0f),
nir_fmul(b,
nir_imm_float(b, div_scale),
max_value)));
}
static nir_ssa_def *
implicit_downsampled_coords(struct ycbcr_state *state,
nir_ssa_def *old_coords)
{
nir_builder *b = state->builder;
const struct radv_sampler_ycbcr_conversion *conversion = state->conversion;
nir_ssa_def *image_size = NULL;
nir_ssa_def *comp[4] = { NULL, };
const struct vk_format_description *fmt_desc = vk_format_description(state->conversion->format);
const unsigned divisors[2] = {fmt_desc->width_divisor, fmt_desc->height_divisor};
for (int c = 0; c < old_coords->num_components; c++) {
if (c < ARRAY_SIZE(divisors) && divisors[c] > 1 &&
conversion->chroma_offsets[c] == VK_CHROMA_LOCATION_COSITED_EVEN) {
if (!image_size)
image_size = get_texture_size(state, state->tex_deref);
comp[c] = implicit_downsampled_coord(b,
nir_channel(b, old_coords, c),
nir_channel(b, image_size, c),
divisors[c]);
} else {
comp[c] = nir_channel(b, old_coords, c);
}
}
return nir_vec(b, comp, old_coords->num_components);
}
static nir_ssa_def *
create_plane_tex_instr_implicit(struct ycbcr_state *state,
uint32_t plane)
@@ -163,10 +230,23 @@ create_plane_tex_instr_implicit(struct ycbcr_state *state,
nir_builder *b = state->builder;
nir_tex_instr *old_tex = state->origin_tex;
nir_tex_instr *tex = nir_tex_instr_create(b->shader, old_tex->num_srcs+ 1);
for (uint32_t i = 0; i < old_tex->num_srcs; i++) {
tex->src[i].src_type = old_tex->src[i].src_type;
nir_src_copy(&tex->src[i].src, &old_tex->src[i].src, tex);
switch (old_tex->src[i].src_type) {
case nir_tex_src_coord:
if (plane && true/*state->conversion->chroma_reconstruction*/) {
assert(old_tex->src[i].src.is_ssa);
tex->src[i].src =
nir_src_for_ssa(implicit_downsampled_coords(state,
old_tex->src[i].src.ssa));
break;
}
/* fall through */
default:
nir_src_copy(&tex->src[i].src, &old_tex->src[i].src, tex);
break;
}
}
tex->src[tex->num_srcs - 1].src = nir_src_for_ssa(nir_imm_int(b, plane));
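
Note on the implicit_downsampled_coord() helper added above: the NIR it builds evaluates value + 1 / (div_scale * max_value). It is applied only to chroma planes whose divisor is greater than 1 and whose chroma offset is VK_CHROMA_LOCATION_COSITED_EVEN. The scalar sketch below mirrors that arithmetic on plain floats. The name example_downsampled_coord is hypothetical and does not exist in radv, and reading the inputs as a normalized coordinate plus a texture size in texels is an assumption based on get_texture_size().

```c
#include <assert.h>

/* Hypothetical scalar equivalent of the NIR emitted by
 * implicit_downsampled_coord(): shift the coordinate by
 * 1 / (div_scale * max_value). Assumes `value` is a normalized
 * coordinate and `max_value` is the texture size along that axis. */
static float
example_downsampled_coord(float value, float max_value, int div_scale)
{
    return value + 1.0f / ((float)div_scale * max_value);
}
```

For an 8-texel-wide image with a width divisor of 2, the shift is 1/16 of the normalized range, so a coordinate of 0.5 becomes 0.5625.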

View File

@@ -737,7 +737,7 @@ static void allocate_user_sgprs(struct radv_shader_context *ctx,
if (ctx->shader_info->info.loads_push_constants)
user_sgpr_count++;
if (ctx->streamout_buffers)
if (ctx->shader_info->info.so.num_outputs)
user_sgpr_count++;
uint32_t available_sgprs = ctx->options->chip_class >= GFX9 && stage != MESA_SHADER_COMPUTE ? 32 : 16;
@@ -2019,16 +2019,34 @@ static LLVMValueRef radv_get_sampler_desc(struct ac_shader_abi *abi,
assert(stride % type_size == 0);
if (!index)
index = ctx->ac.i32_0;
LLVMValueRef adjusted_index = index;
if (!adjusted_index)
adjusted_index = ctx->ac.i32_0;
index = LLVMBuildMul(builder, index, LLVMConstInt(ctx->ac.i32, stride / type_size, 0), "");
adjusted_index = LLVMBuildMul(builder, adjusted_index, LLVMConstInt(ctx->ac.i32, stride / type_size, 0), "");
list = ac_build_gep0(&ctx->ac, list, LLVMConstInt(ctx->ac.i32, offset, 0));
list = LLVMBuildPointerCast(builder, list,
ac_array_in_const32_addr_space(type), "");
return ac_build_load_to_sgpr(&ctx->ac, list, index);
LLVMValueRef descriptor = ac_build_load_to_sgpr(&ctx->ac, list, adjusted_index);
/* 3 plane formats always have same size and format for plane 1 & 2, so
* use the tail from plane 1 so that we can store only the first 16 bytes
* of the last plane. */
if (desc_type == AC_DESC_PLANE_2) {
LLVMValueRef descriptor2 = radv_get_sampler_desc(abi, descriptor_set, base_index, constant_index, index, AC_DESC_PLANE_1,image, write, bindless);
LLVMValueRef components[8];
for (unsigned i = 0; i < 4; ++i)
components[i] = ac_llvm_extract_elem(&ctx->ac, descriptor, i);
for (unsigned i = 4; i < 8; ++i)
components[i] = ac_llvm_extract_elem(&ctx->ac, descriptor2, i);
descriptor = ac_build_gather_values(&ctx->ac, components, 8);
}
return descriptor;
}
/* For 2_10_10_10 formats the alpha is handled as unsigned by pre-vega HW.
@@ -3592,9 +3610,10 @@ ac_setup_rings(struct radv_shader_context *ctx)
unsigned
radv_nir_get_max_workgroup_size(enum chip_class chip_class,
gl_shader_stage stage,
const struct nir_shader *nir)
{
switch (nir->info.stage) {
switch (stage) {
case MESA_SHADER_TESS_CTRL:
return chip_class >= CIK ? 128 : 64;
case MESA_SHADER_GEOMETRY:
@@ -3605,6 +3624,8 @@ radv_nir_get_max_workgroup_size(enum chip_class chip_class,
return 0;
}
if (!nir)
return chip_class >= GFX9 ? 128 : 64;
unsigned max_workgroup_size = nir->info.cs.local_size[0] *
nir->info.cs.local_size[1] *
nir->info.cs.local_size[2];
@@ -3671,7 +3692,8 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct ac_llvm_compiler *ac_llvm,
for (int i = 0; i < shader_count; ++i) {
ctx.max_workgroup_size = MAX2(ctx.max_workgroup_size,
radv_nir_get_max_workgroup_size(ctx.options->chip_class,
shaders[i]));
shaders[i]->info.stage,
shaders[i]));
}
create_function(&ctx, shaders[shader_count - 1]->info.stage, shader_count >= 2,
@@ -4044,7 +4066,7 @@ ac_gs_copy_shader_emit(struct radv_shader_context *ctx)
LLVMBasicBlockRef bb;
unsigned offset;
if (!num_components)
if (stream > 0 && !num_components)
continue;
if (stream > 0 && !ctx->shader_info->info.so.num_outputs)

View File

@@ -524,7 +524,7 @@ radv_pipeline_compute_spi_color_formats(struct radv_pipeline *pipeline,
col_format |= cf << (4 * i);
}
if (!col_format && blend->need_src_alpha & (1 << 0)) {
if (!(col_format & 0xf) && blend->need_src_alpha & (1 << 0)) {
/* When a subpass doesn't have any color attachments, write the
* alpha channel of MRT0 when alpha coverage is enabled because
* the depth attachment needs it.
@@ -542,10 +542,13 @@ radv_pipeline_compute_spi_color_formats(struct radv_pipeline *pipeline,
}
}
blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
/* The output for dual source blending should have the same format as
* the first output.
*/
if (blend->mrt0_is_dual_src)
col_format |= (col_format & 0xf) << 4;
blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
blend->spi_shader_col_format = col_format;
}
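
Note on the two col_format changes in this hunk and the previous one: the mask packs one 4-bit format per color target (cf << (4 * i)). The first fix tests only the MRT0 nibble rather than the whole word. The second copies the MRT0 nibble into the MRT1 slot for dual-source blending before the shader mask is derived. The sketch below isolates both bit operations. The function names are hypothetical, and the format value 0x4 is an arbitrary placeholder rather than a real SPI format enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when MRT0 has no export format, even if
 * other render targets do. Mirrors the `!(col_format & 0xf)` check. */
static bool
example_mrt0_is_unset(uint32_t col_format)
{
    return !(col_format & 0xf);
}

/* Hypothetical helper: replicate the MRT0 format into the MRT1 nibble,
 * as done when mrt0_is_dual_src is set. */
static uint32_t
example_dual_src_col_format(uint32_t col_format)
{
    return col_format | ((col_format & 0xf) << 4);
}
```

With a mask of 0x40 (only MRT1 set), the old `!col_format` test is false while the nibble test reports MRT0 as unset, which is the case the first fix targets.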
@@ -1417,11 +1420,13 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline *pipeline,
const VkPipelineDiscardRectangleStateCreateInfoEXT *discard_rectangle_info =
vk_find_struct_const(pCreateInfo->pNext, PIPELINE_DISCARD_RECTANGLE_STATE_CREATE_INFO_EXT);
if (states & RADV_DYNAMIC_DISCARD_RECTANGLE) {
if (needed_states & RADV_DYNAMIC_DISCARD_RECTANGLE) {
dynamic->discard_rectangle.count = discard_rectangle_info->discardRectangleCount;
typed_memcpy(dynamic->discard_rectangle.rectangles,
discard_rectangle_info->pDiscardRectangles,
discard_rectangle_info->discardRectangleCount);
if (states & RADV_DYNAMIC_DISCARD_RECTANGLE) {
typed_memcpy(dynamic->discard_rectangle.rectangles,
discard_rectangle_info->pDiscardRectangles,
discard_rectangle_info->discardRectangleCount);
}
}
pipeline->dynamic_state.mask = states;
@@ -2177,12 +2182,12 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
for (int i = 0; i < MESA_SHADER_STAGES; ++i) {
if (nir[i]) {
NIR_PASS_V(nir[i], nir_lower_bool_to_int32);
NIR_PASS_V(nir[i], nir_lower_non_uniform_access,
nir_lower_non_uniform_ubo_access |
nir_lower_non_uniform_ssbo_access |
nir_lower_non_uniform_texture_access |
nir_lower_non_uniform_image_access);
NIR_PASS_V(nir[i], nir_lower_bool_to_int32);
}
if (radv_can_dump_shader(device, modules[i], false))
@@ -2668,8 +2673,10 @@ radv_pipeline_generate_binning_state(struct radeon_cmdbuf *ctx_cs,
break;
case CHIP_RAVEN:
case CHIP_RAVEN2:
context_states_per_bin = 6;
persistent_states_per_bin = 32;
/* The context states are affected by the scissor bug. */
context_states_per_bin = pipeline->device->physical_device->has_scissor_bug ? 1 : 6;
/* 32 causes hangs for RAVEN. */
persistent_states_per_bin = 16;
fpovs_per_batch = 63;
break;
default:
@@ -2706,7 +2713,6 @@ radv_pipeline_generate_depth_stencil_state(struct radeon_cmdbuf *ctx_cs,
const VkPipelineDepthStencilStateCreateInfo *vkds = pCreateInfo->pDepthStencilState;
RADV_FROM_HANDLE(radv_render_pass, pass, pCreateInfo->renderPass);
struct radv_subpass *subpass = pass->subpasses + pCreateInfo->subpass;
struct radv_shader_variant *ps = pipeline->shaders[MESA_SHADER_FRAGMENT];
struct radv_render_pass_attachment *attachment = NULL;
uint32_t db_depth_control = 0, db_stencil_control = 0;
uint32_t db_render_control = 0, db_render_override2 = 0;
@@ -2755,8 +2761,7 @@ radv_pipeline_generate_depth_stencil_state(struct radeon_cmdbuf *ctx_cs,
db_render_override |= S_02800C_FORCE_HIS_ENABLE0(V_02800C_FORCE_DISABLE) |
S_02800C_FORCE_HIS_ENABLE1(V_02800C_FORCE_DISABLE);
if (!pCreateInfo->pRasterizationState->depthClampEnable &&
ps->info.info.ps.writes_z) {
if (!pCreateInfo->pRasterizationState->depthClampEnable) {
/* From VK_EXT_depth_range_unrestricted spec:
*
* "The behavior described in Primitive Clipping still applies.
@@ -2927,8 +2932,11 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf *ctx_cs,
struct radv_pipeline *pipeline)
{
const struct radv_vs_output_info *outinfo = get_vs_output_info(pipeline);
uint32_t vgt_primitiveid_en = false;
const struct radv_shader_variant *vs =
pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
pipeline->shaders[MESA_SHADER_TESS_EVAL] :
pipeline->shaders[MESA_SHADER_VERTEX];
uint32_t vgt_gs_mode = 0;
if (radv_pipeline_has_gs(pipeline)) {
@@ -2937,7 +2945,7 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf *ctx_cs,
vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
pipeline->device->physical_device->rad_info.chip_class);
} else if (outinfo->export_prim_id) {
} else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
vgt_primitiveid_en = true;
}

View File

@@ -1456,6 +1456,7 @@ uint32_t radv_translate_buffer_dataformat(const struct vk_format_description *de
int first_non_void);
uint32_t radv_translate_buffer_numformat(const struct vk_format_description *desc,
int first_non_void);
bool radv_is_buffer_format_supported(VkFormat format, bool *scaled);
uint32_t radv_translate_colorformat(VkFormat format);
uint32_t radv_translate_color_numformat(VkFormat format,
const struct vk_format_description *desc,
@@ -1993,6 +1994,7 @@ void radv_compile_nir_shader(struct ac_llvm_compiler *ac_llvm,
const struct radv_nir_compiler_options *options);
unsigned radv_nir_get_max_workgroup_size(enum chip_class chip_class,
gl_shader_stage stage,
const struct nir_shader *nir);
/* radv_shader_info.h */

View File

@@ -40,18 +40,6 @@
static const int pipelinestat_block_size = 11 * 8;
static const unsigned pipeline_statistics_indices[] = {7, 6, 3, 4, 5, 2, 1, 0, 8, 9, 10};
static unsigned get_max_db(struct radv_device *device)
{
unsigned num_db = device->physical_device->rad_info.num_render_backends;
MAYBE_UNUSED unsigned rb_mask = device->physical_device->rad_info.enabled_rb_mask;
/* Otherwise we need to change the query reset procedure */
assert(rb_mask == ((1ull << num_db) - 1));
return num_db;
}
static nir_ssa_def *nir_test_flag(nir_builder *b, nir_ssa_def *flags, uint32_t flag)
{
return nir_i2b(b, nir_iand(b, flags, nir_imm_int(b, flag)));
@@ -108,12 +96,14 @@ build_occlusion_query_shader(struct radv_device *device) {
* uint64_t dst_offset = dst_stride * global_id.x;
* bool available = true;
* for (int i = 0; i < db_count; ++i) {
* uint64_t start = src_buf[src_offset + 16 * i];
* uint64_t end = src_buf[src_offset + 16 * i + 8];
* if ((start & (1ull << 63)) && (end & (1ull << 63)))
* result += end - start;
* else
* available = false;
* if (enabled_rb_mask & (1 << i)) {
* uint64_t start = src_buf[src_offset + 16 * i];
* uint64_t end = src_buf[src_offset + 16 * i + 8];
* if ((start & (1ull << 63)) && (end & (1ull << 63)))
* result += end - start;
* else
* available = false;
* }
* }
* uint32_t elem_size = flags & VK_QUERY_RESULT_64_BIT ? 8 : 4;
* if ((flags & VK_QUERY_RESULT_PARTIAL_BIT) || available) {
@@ -139,7 +129,8 @@ build_occlusion_query_shader(struct radv_device *device) {
nir_variable *start = nir_local_variable_create(b.impl, glsl_uint64_t_type(), "start");
nir_variable *end = nir_local_variable_create(b.impl, glsl_uint64_t_type(), "end");
nir_variable *available = nir_local_variable_create(b.impl, glsl_bool_type(), "available");
unsigned db_count = get_max_db(device);
unsigned enabled_rb_mask = device->physical_device->rad_info.enabled_rb_mask;
unsigned db_count = device->physical_device->rad_info.num_render_backends;
nir_ssa_def *flags = radv_load_push_int(&b, 0, "flags");
@@ -187,6 +178,16 @@ build_occlusion_query_shader(struct radv_device *device) {
nir_ssa_def *current_outer_count = nir_load_var(&b, outer_counter);
radv_break_on_count(&b, outer_counter, nir_imm_int(&b, db_count));
nir_ssa_def *enabled_cond =
nir_iand(&b, nir_imm_int(&b, enabled_rb_mask),
nir_ishl(&b, nir_imm_int(&b, 1), current_outer_count));
nir_if *enabled_if = nir_if_create(b.shader);
enabled_if->condition = nir_src_for_ssa(nir_i2b(&b, enabled_cond));
nir_cf_node_insert(b.cursor, &enabled_if->cf_node);
b.cursor = nir_after_cf_list(&enabled_if->then_list);
nir_ssa_def *load_offset = nir_imul(&b, current_outer_count, nir_imm_int(&b, 16));
load_offset = nir_iadd(&b, input_base, load_offset);
@@ -1044,7 +1045,7 @@ VkResult radv_CreateQueryPool(
switch(pCreateInfo->queryType) {
case VK_QUERY_TYPE_OCCLUSION:
pool->stride = 16 * get_max_db(device);
pool->stride = 16 * device->physical_device->rad_info.num_render_backends;
break;
case VK_QUERY_TYPE_PIPELINE_STATISTICS:
pool->stride = pipelinestat_block_size * 2;
@@ -1128,17 +1129,18 @@ VkResult radv_GetQueryPoolResults(
if (flags & VK_QUERY_RESULT_WAIT_BIT)
while(!*(volatile uint32_t*)(pool->ptr + pool->availability_offset + 4 * query))
;
available = *(uint32_t*)(pool->ptr + pool->availability_offset + 4 * query);
available = *(volatile uint32_t*)(pool->ptr + pool->availability_offset + 4 * query);
}
switch (pool->type) {
case VK_QUERY_TYPE_TIMESTAMP: {
available = *(uint64_t *)src != TIMESTAMP_NOT_READY;
volatile uint64_t const *src64 = (volatile uint64_t const *)src;
available = *src64 != TIMESTAMP_NOT_READY;
if (flags & VK_QUERY_RESULT_WAIT_BIT) {
while (*(volatile uint64_t *)src == TIMESTAMP_NOT_READY)
while (*src64 == TIMESTAMP_NOT_READY)
;
available = *(uint64_t *)src != TIMESTAMP_NOT_READY;
available = true;
}
if (!available && !(flags & VK_QUERY_RESULT_PARTIAL_BIT))
@@ -1146,23 +1148,28 @@ VkResult radv_GetQueryPoolResults(
if (flags & VK_QUERY_RESULT_64_BIT) {
if (available || (flags & VK_QUERY_RESULT_PARTIAL_BIT))
*(uint64_t*)dest = *(uint64_t*)src;
*(uint64_t*)dest = *src64;
dest += 8;
} else {
if (available || (flags & VK_QUERY_RESULT_PARTIAL_BIT))
*(uint32_t*)dest = *(uint32_t*)src;
*(uint32_t*)dest = *(volatile uint32_t*)src;
dest += 4;
}
break;
}
case VK_QUERY_TYPE_OCCLUSION: {
volatile uint64_t const *src64 = (volatile uint64_t const *)src;
uint32_t db_count = device->physical_device->rad_info.num_render_backends;
uint32_t enabled_rb_mask = device->physical_device->rad_info.enabled_rb_mask;
uint64_t sample_count = 0;
int db_count = get_max_db(device);
available = 1;
for (int i = 0; i < db_count; ++i) {
uint64_t start, end;
if (!(enabled_rb_mask & (1 << i)))
continue;
do {
start = src64[2 * i];
end = src64[2 * i + 1];
@@ -1193,8 +1200,8 @@ VkResult radv_GetQueryPoolResults(
if (!available && !(flags & VK_QUERY_RESULT_PARTIAL_BIT))
result = VK_NOT_READY;
const uint64_t *start = (uint64_t*)src;
const uint64_t *stop = (uint64_t*)(src + pipelinestat_block_size);
const volatile uint64_t *start = (uint64_t*)src;
const volatile uint64_t *stop = (uint64_t*)(src + pipelinestat_block_size);
if (flags & VK_QUERY_RESULT_64_BIT) {
uint64_t *dst = (uint64_t*)dest;
dest += util_bitcount(pool->pipeline_stats_mask) * 8;
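
Note on the occlusion query changes in this file: the stride stays at 16 bytes per render backend, but both the compute shader and the CPU path now skip backends whose bit is clear in enabled_rb_mask. The minimal CPU-side sketch below follows the pseudocode comment in build_occlusion_query_shader(). The function name and the flat uint64_t slot layout are illustrative assumptions, not radv code, and it assumes fewer than 32 backends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of radv. Each backend owns two 64-bit
 * counters (begin, end); bit 63 of each marks the value as written.
 * Backends that are not in enabled_rb_mask are skipped entirely, so
 * their unwritten slots cannot make the query look unavailable.
 * Returns true when every enabled backend has reported both counters. */
static bool
example_sum_occlusion_samples(const uint64_t *slots, unsigned db_count,
                              uint32_t enabled_rb_mask, uint64_t *result)
{
    bool available = true;
    uint64_t total = 0;

    for (unsigned i = 0; i < db_count; ++i) {
        if (!(enabled_rb_mask & (1u << i)))
            continue;

        uint64_t start = slots[2 * i];
        uint64_t end = slots[2 * i + 1];

        if ((start & (1ull << 63)) && (end & (1ull << 63)))
            total += end - start;
        else
            available = false;
    }

    *result = total;
    return available;
}
```

Without the mask check, a part with a disabled backend would leave one slot permanently unwritten, and the removed assert in get_max_db() suggests such configurations were previously unsupported.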

View File

@@ -311,6 +311,8 @@ radv_shader_compile_to_nir(struct radv_device *device,
NIR_PASS_V(nir, nir_remove_dead_variables,
nir_var_shader_in | nir_var_shader_out | nir_var_system_value);
NIR_PASS_V(nir, nir_propagate_invariant);
NIR_PASS_V(nir, nir_lower_system_values);
NIR_PASS_V(nir, nir_lower_clip_cull_distance_arrays);
NIR_PASS_V(nir, radv_nir_lower_ycbcr_textures, layout);
@@ -624,6 +626,8 @@ shader_variant_create(struct radv_device *device,
tm_options |= AC_TM_SISCHED;
if (options->check_ir)
tm_options |= AC_TM_CHECK_IR;
if (device->instance->debug_flags & RADV_DEBUG_NO_LOAD_STORE_OPT)
tm_options |= AC_TM_NO_LOAD_STORE_OPT;
thread_compiler = !(device->instance->debug_flags & RADV_DEBUG_NOTHREADLLVM);
radv_init_llvm_once();
@@ -763,7 +767,7 @@ generate_shader_stats(struct radv_device *device,
lds_increment);
} else if (stage == MESA_SHADER_COMPUTE) {
unsigned max_workgroup_size =
radv_nir_get_max_workgroup_size(chip_class, variant->nir);
radv_nir_get_max_workgroup_size(chip_class, stage, variant->nir);
lds_per_wave = (conf->lds_size * lds_increment) /
DIV_ROUND_UP(max_workgroup_size, 64);
}

View File

@@ -102,7 +102,7 @@ vir_opt_redundant_flags_block(struct v3d_compile *c, struct qblock *block)
vir_for_each_inst(inst, block) {
if (inst->qpu.type != V3D_QPU_INSTR_TYPE_ALU ||
inst->qpu.flags.auf != V3D_QPU_UF_NONE ||
inst->qpu.flags.auf != V3D_QPU_UF_NONE) {
inst->qpu.flags.muf != V3D_QPU_UF_NONE) {
last_flags = NULL;
continue;
}

View File

@@ -244,6 +244,7 @@ NIR_FILES = \
nir/nir_lower_constant_initializers.c \
nir/nir_lower_double_ops.c \
nir/nir_lower_drawpixels.c \
nir/nir_lower_fb_read.c \
nir/nir_lower_fragcoord_wtrans.c \
nir/nir_lower_frexp.c \
nir/nir_lower_global_vars_to_local.c \

View File

@@ -1681,17 +1681,22 @@ __fround64(uint64_t __a)
if (unbiasedExp < 20) {
if (unbiasedExp < 0) {
if ((aHi & 0x80000000u) != 0u && aLo == 0u) {
return 0;
}
aHi &= 0x80000000u;
if (unbiasedExp == -1 && aLo != 0u)
aHi |= (1023u << 20);
if ((a.y & 0x000FFFFFu) == 0u && a.x == 0u) {
aLo = 0u;
return packUint2x32(uvec2(aLo, aHi));
}
aHi = mix(aHi, (aHi | 0x3FF00000u), unbiasedExp == -1);
aLo = 0u;
} else {
uint maskExp = 0x000FFFFFu >> unbiasedExp;
/* a is an integral value */
if (((aHi & maskExp) == 0u) && (aLo == 0u))
return __a;
uint lastBit = maskExp + 1;
aHi += 0x00080000u >> unbiasedExp;
if ((aHi & maskExp) == 0u)
aHi &= ~lastBit;
aHi &= ~maskExp;
aLo = 0u;
}
@@ -1708,9 +1713,7 @@ __fround64(uint64_t __a)
aLo &= ~maskExp;
}
a.x = aLo;
a.y = aHi;
return packUint2x32(a);
return packUint2x32(uvec2(aLo, aHi));
}
uint64_t

View File

@@ -443,7 +443,8 @@ nir_link_uniform(struct gl_context *ctx,
state->num_shader_uniform_components += values;
state->num_values += values;
if (state->max_uniform_location < uniform->remap_location + entries)
if (uniform->remap_location != UNMAPPED_UNIFORM_LOC &&
state->max_uniform_location < uniform->remap_location + entries)
state->max_uniform_location = uniform->remap_location + entries;
return MAX2(uniform->array_elements, 1);

View File

@@ -106,7 +106,7 @@ bitcast_i642d(int64_t i)
return d;
}
static double
static uint64_t
bitcast_d2u64(double d)
{
assert(sizeof(double) == sizeof(uint64_t));
@@ -115,7 +115,7 @@ bitcast_d2u64(double d)
return u;
}
static double
static int64_t
bitcast_d2i64(double d)
{
assert(sizeof(double) == sizeof(int64_t));

View File

@@ -180,6 +180,11 @@ loop_unroll_visitor::simple_unroll(ir_loop *ir, int iterations)
void *const mem_ctx = ralloc_parent(ir);
loop_variable_state *const ls = this->state->get(ir);
/* If there are no terminators, then the loop iteration count must be 1.
* This is the 'do { } while (false);' case.
*/
assert(!ls->terminators.is_empty() || iterations == 1);
ir_instruction *first_ir =
(ir_instruction *) ir->body_instructions.get_head();
@@ -221,7 +226,8 @@ loop_unroll_visitor::simple_unroll(ir_loop *ir, int iterations)
* the loop, or if the exit branch contains instructions. This ensures we
* execute any instructions before the terminator or in its exit branch.
*/
if (limit_if != first_ir->as_if() || exit_branch_has_instructions)
if (!ls->terminators.is_empty() &&
(limit_if != first_ir->as_if() || exit_branch_has_instructions))
iterations++;
for (int i = 0; i < iterations; i++) {

View File

@@ -507,6 +507,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
if (is_vec_zero(op_const[1]))
return ir->operands[0];
/* Replace (x + (-x)) with constant 0 */
for (int i = 0; i < 2; i++) {
if (op_expr[i]) {
if (op_expr[i]->operation == ir_unop_neg) {
ir_rvalue *other = ir->operands[(i + 1) % 2];
if (other && op_expr[i]->operands[0]->equals(other)) {
return ir_constant::zero(ir, ir->type);
}
}
}
}
/* Reassociate addition of constants so that we can do constant
* folding.
*/

View File

@@ -165,9 +165,8 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
prog->FragDataIndexBindings->iterate(create_binding_str, &buf);
ralloc_asprintf_append(&buf, "tf: %d ", prog->TransformFeedback.BufferMode);
for (unsigned int i = 0; i < prog->TransformFeedback.NumVarying; i++) {
ralloc_asprintf_append(&buf, "%s:%d ",
prog->TransformFeedback.VaryingNames[i],
prog->TransformFeedback.BufferStride[i]);
ralloc_asprintf_append(&buf, "%s ",
prog->TransformFeedback.VaryingNames[i]);
}
/* SSO has an effect on the linked program so include this when generating

View File

@@ -50,7 +50,7 @@ glsl_type::glsl_type(GLenum gl_type,
gl_type(gl_type),
base_type(base_type), sampled_type(GLSL_TYPE_VOID),
sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
interface_packing(0), interface_row_major(row_major),
interface_packing(0), interface_row_major(row_major), packed(0),
vector_elements(vector_elements), matrix_columns(matrix_columns),
length(0), explicit_stride(explicit_stride)
{
@@ -85,7 +85,7 @@ glsl_type::glsl_type(GLenum gl_type, glsl_base_type base_type,
base_type(base_type), sampled_type(type),
sampler_dimensionality(dim), sampler_shadow(shadow),
sampler_array(array), interface_packing(0),
interface_row_major(0),
interface_row_major(0), packed(0),
length(0), explicit_stride(0)
{
this->mem_ctx = ralloc_context(NULL);
@@ -134,7 +134,7 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
base_type(GLSL_TYPE_INTERFACE), sampled_type(GLSL_TYPE_VOID),
sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
interface_packing((unsigned) packing),
interface_row_major((unsigned) row_major),
interface_row_major((unsigned) row_major), packed(0),
vector_elements(0), matrix_columns(0),
length(num_fields), explicit_stride(0)
{
@@ -159,7 +159,7 @@ glsl_type::glsl_type(const glsl_type *return_type,
gl_type(0),
base_type(GLSL_TYPE_FUNCTION), sampled_type(GLSL_TYPE_VOID),
sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
interface_packing(0), interface_row_major(0),
interface_packing(0), interface_row_major(0), packed(0),
vector_elements(0), matrix_columns(0),
length(num_params), explicit_stride(0)
{
@@ -188,7 +188,7 @@ glsl_type::glsl_type(const char *subroutine_name) :
gl_type(0),
base_type(GLSL_TYPE_SUBROUTINE), sampled_type(GLSL_TYPE_VOID),
sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
interface_packing(0), interface_row_major(0),
interface_packing(0), interface_row_major(0), packed(0),
vector_elements(1), matrix_columns(1),
length(0), explicit_stride(0)
{
@@ -534,7 +534,7 @@ glsl_type::glsl_type(const glsl_type *array, unsigned length,
unsigned explicit_stride) :
base_type(GLSL_TYPE_ARRAY), sampled_type(GLSL_TYPE_VOID),
sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
interface_packing(0), interface_row_major(0),
interface_packing(0), interface_row_major(0), packed(0),
vector_elements(0), matrix_columns(0),
length(length), name(NULL), explicit_stride(explicit_stride)
{
@@ -1311,9 +1311,7 @@ glsl_type::get_function_instance(const glsl_type *return_type,
const glsl_type *
glsl_type::get_mul_type(const glsl_type *type_a, const glsl_type *type_b)
{
if (type_a == type_b) {
return type_a;
} else if (type_a->is_matrix() && type_b->is_matrix()) {
if (type_a->is_matrix() && type_b->is_matrix()) {
/* Matrix multiply. The columns of A must match the rows of B. Given
* the other previously tested constraints, this means the vector type
* of a row from A must be the same as the vector type of a column from
@@ -1333,6 +1331,8 @@ glsl_type::get_mul_type(const glsl_type *type_a, const glsl_type *type_b)
return type;
}
} else if (type_a == type_b) {
return type_a;
} else if (type_a->is_matrix()) {
/* A is a matrix and B is a column vector. Columns of A must match
* rows of B. Given the other previously tested constraints, this

View File

@@ -299,4 +299,16 @@ if with_tests
link_with : libmesa_util,
)
)
test(
'comparison_pre',
executable(
'comparison_pre',
files('tests/comparison_pre_tests.cpp'),
c_args : [c_vis_args, c_msvc_compat_args, no_override_init_args],
include_directories : [inc_common],
dependencies : [dep_thread, idep_gtest, idep_nir],
link_with : libmesa_util,
)
)
endif

View File

@@ -1204,6 +1204,41 @@ nir_foreach_src(nir_instr *instr, nir_foreach_src_cb cb, void *state)
return nir_foreach_dest(instr, visit_dest_indirect, &dest_state);
}
nir_const_value
nir_const_value_for_float(double f, unsigned bit_size)
{
nir_const_value v;
memset(&v, 0, sizeof(v));
switch (bit_size) {
case 16:
v.u16 = _mesa_float_to_half(f);
break;
case 32:
v.f32 = f;
break;
case 64:
v.f64 = f;
break;
default:
unreachable("Invalid bit size");
}
return v;
}
double
nir_const_value_as_float(nir_const_value value, unsigned bit_size)
{
switch (bit_size) {
case 16: return _mesa_half_to_float(value.u16);
case 32: return value.f32;
case 64: return value.f64;
default:
unreachable("Invalid bit size");
}
}
int64_t
nir_src_comp_as_int(nir_src src, unsigned comp)
{
@@ -1997,6 +2032,8 @@ void
nir_rewrite_image_intrinsic(nir_intrinsic_instr *intrin, nir_ssa_def *src,
bool bindless)
{
enum gl_access_qualifier access = nir_intrinsic_access(intrin);
switch (intrin->intrinsic) {
#define CASE(op) \
case nir_intrinsic_image_deref_##op: \
@@ -2028,7 +2065,7 @@ nir_rewrite_image_intrinsic(nir_intrinsic_instr *intrin, nir_ssa_def *src,
nir_intrinsic_set_image_dim(intrin, glsl_get_sampler_dim(deref->type));
nir_intrinsic_set_image_array(intrin, glsl_sampler_type_is_array(deref->type));
nir_intrinsic_set_access(intrin, var->data.image.access);
nir_intrinsic_set_access(intrin, access | var->data.image.access);
nir_intrinsic_set_format(intrin, var->data.image.format);
nir_instr_rewrite_src(&intrin->instr, &intrin->src[0],

View File

@@ -140,6 +140,106 @@ typedef union {
arr[i] = c[i].m; \
} while (false)
static inline nir_const_value
nir_const_value_for_raw_uint(uint64_t x, unsigned bit_size)
{
nir_const_value v;
memset(&v, 0, sizeof(v));
switch (bit_size) {
case 1: v.b = x; break;
case 8: v.u8 = x; break;
case 16: v.u16 = x; break;
case 32: v.u32 = x; break;
case 64: v.u64 = x; break;
default:
unreachable("Invalid bit size");
}
return v;
}
static inline nir_const_value
nir_const_value_for_int(int64_t i, unsigned bit_size)
{
nir_const_value v;
memset(&v, 0, sizeof(v));
assert(bit_size <= 64);
if (bit_size < 64) {
assert(i >= (-(1ll << (bit_size - 1))));
assert(i < (1ll << (bit_size - 1)));
}
return nir_const_value_for_raw_uint(i, bit_size);
}
static inline nir_const_value
nir_const_value_for_uint(uint64_t u, unsigned bit_size)
{
nir_const_value v;
memset(&v, 0, sizeof(v));
assert(bit_size <= 64);
if (bit_size < 64)
assert(u < (1ull << bit_size));
return nir_const_value_for_raw_uint(u, bit_size);
}
static inline nir_const_value
nir_const_value_for_bool(bool b, unsigned bit_size)
{
/* Booleans use a 0/-1 convention */
return nir_const_value_for_int(-(int)b, bit_size);
}
/* This one isn't inline because it requires half-float conversion */
nir_const_value nir_const_value_for_float(double b, unsigned bit_size);
static inline int64_t
nir_const_value_as_int(nir_const_value value, unsigned bit_size)
{
switch (bit_size) {
/* int1_t uses 0/-1 convention */
case 1: return -(int)value.b;
case 8: return value.i8;
case 16: return value.i16;
case 32: return value.i32;
case 64: return value.i64;
default:
unreachable("Invalid bit size");
}
}
static inline uint64_t
nir_const_value_as_uint(nir_const_value value, unsigned bit_size)
{
switch (bit_size) {
case 1: return value.b;
case 8: return value.u8;
case 16: return value.u16;
case 32: return value.u32;
case 64: return value.u64;
default:
unreachable("Invalid bit size");
}
}
static inline bool
nir_const_value_as_bool(nir_const_value value, unsigned bit_size)
{
int64_t i = nir_const_value_as_int(value, bit_size);
/* Booleans of any size use 0/-1 convention */
assert(i == 0 || i == -1);
return i;
}
/* This one isn't inline because it requires half-float conversion */
double nir_const_value_as_float(nir_const_value value, unsigned bit_size);
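The helpers above store booleans with a 0/-1 convention at every bit size. What follows is a minimal standalone sketch of that convention, assuming a hypothetical `toy_const` union and `toy_*` functions that only imitate `nir_const_value`; none of these names exist in Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for nir_const_value; not the Mesa type. */
typedef union {
   bool b;
   int8_t i8;   uint8_t u8;
   int16_t i16; uint16_t u16;
   int32_t i32; uint32_t u32;
   int64_t i64; uint64_t u64;
} toy_const;

/* Store the low bit_size bits of x and zero the rest of the union. */
static toy_const
toy_const_for_raw_uint(uint64_t x, unsigned bit_size)
{
   toy_const v;
   memset(&v, 0, sizeof(v));
   switch (bit_size) {
   case 1:  v.b = x != 0;       break;
   case 8:  v.u8 = (uint8_t)x;  break;
   case 16: v.u16 = (uint16_t)x; break;
   case 32: v.u32 = (uint32_t)x; break;
   case 64: v.u64 = x;          break;
   default: assert(!"Invalid bit size");
   }
   return v;
}

/* Read the value back as a sign-extended integer. */
static int64_t
toy_const_as_int(toy_const v, unsigned bit_size)
{
   switch (bit_size) {
   case 1:  return -(int)v.b;   /* 1-bit values also use 0/-1 */
   case 8:  return v.i8;
   case 16: return v.i16;
   case 32: return v.i32;
   case 64: return v.i64;
   default: assert(!"Invalid bit size"); return 0;
   }
}

/* true becomes -1, i.e. all ones in the low bit_size bits. */
static toy_const
toy_const_for_bool(bool b, unsigned bit_size)
{
   return toy_const_for_raw_uint((uint64_t)(int64_t)-(int)b, bit_size);
}

/* Anything other than 0 or -1 is not a valid boolean. */
static bool
toy_const_as_bool(toy_const v, unsigned bit_size)
{
   int64_t i = toy_const_as_int(v, bit_size);
   assert(i == 0 || i == -1);
   return i != 0;
}
```

With this layout a 32-bit `true` reads back as `0xffffffff` when viewed as unsigned and as `-1` when viewed as signed, which is why the boolean reader can assert on exactly two values.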
typedef struct nir_constant {
/**
* Value of the constant.
@@ -1281,6 +1381,10 @@ typedef enum {
*/
NIR_INTRINSIC_DESC_TYPE = 19,
/* Separate source/dest access flags for copies */
NIR_INTRINSIC_SRC_ACCESS,
NIR_INTRINSIC_DST_ACCESS,
NIR_INTRINSIC_NUM_INDEX_FLAGS,
} nir_intrinsic_index_flag;
@@ -1381,6 +1485,8 @@ INTRINSIC_IDX_ACCESSORS(param_idx, PARAM_IDX, unsigned)
INTRINSIC_IDX_ACCESSORS(image_dim, IMAGE_DIM, enum glsl_sampler_dim)
INTRINSIC_IDX_ACCESSORS(image_array, IMAGE_ARRAY, bool)
INTRINSIC_IDX_ACCESSORS(access, ACCESS, enum gl_access_qualifier)
INTRINSIC_IDX_ACCESSORS(src_access, SRC_ACCESS, enum gl_access_qualifier)
INTRINSIC_IDX_ACCESSORS(dst_access, DST_ACCESS, enum gl_access_qualifier)
INTRINSIC_IDX_ACCESSORS(format, FORMAT, unsigned)
INTRINSIC_IDX_ACCESSORS(align_mul, ALIGN_MUL, unsigned)
INTRINSIC_IDX_ACCESSORS(align_offset, ALIGN_OFFSET, unsigned)
@@ -1416,6 +1522,16 @@ nir_intrinsic_align(const nir_intrinsic_instr *intrin)
void nir_rewrite_image_intrinsic(nir_intrinsic_instr *instr,
nir_ssa_def *handle, bool bindless);
/* Determine if an intrinsic can be arbitrarily reordered and eliminated. */
static inline bool
nir_intrinsic_can_reorder(nir_intrinsic_instr *instr)
{
const nir_intrinsic_info *info =
&nir_intrinsic_infos[instr->intrinsic];
return (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) &&
(info->flags & NIR_INTRINSIC_CAN_REORDER);
}
/**
* \group texture information
*
@@ -1815,6 +1931,85 @@ NIR_DEFINE_CAST(nir_instr_as_parallel_copy, nir_instr,
nir_parallel_copy_instr, instr,
type, nir_instr_type_parallel_copy)
typedef struct {
nir_ssa_def *def;
unsigned comp;
} nir_ssa_scalar;
static inline bool
nir_ssa_scalar_is_const(nir_ssa_scalar s)
{
return s.def->parent_instr->type == nir_instr_type_load_const;
}
static inline nir_const_value
nir_ssa_scalar_as_const_value(nir_ssa_scalar s)
{
assert(s.comp < s.def->num_components);
nir_load_const_instr *load = nir_instr_as_load_const(s.def->parent_instr);
return load->value[s.comp];
}
#define NIR_DEFINE_SCALAR_AS_CONST(type, suffix) \
static inline type \
nir_ssa_scalar_as_##suffix(nir_ssa_scalar s) \
{ \
return nir_const_value_as_##suffix( \
nir_ssa_scalar_as_const_value(s), s.def->bit_size); \
}
NIR_DEFINE_SCALAR_AS_CONST(int64_t, int)
NIR_DEFINE_SCALAR_AS_CONST(uint64_t, uint)
NIR_DEFINE_SCALAR_AS_CONST(bool, bool)
NIR_DEFINE_SCALAR_AS_CONST(double, float)
#undef NIR_DEFINE_SCALAR_AS_CONST
static inline bool
nir_ssa_scalar_is_alu(nir_ssa_scalar s)
{
return s.def->parent_instr->type == nir_instr_type_alu;
}
static inline nir_op
nir_ssa_scalar_alu_op(nir_ssa_scalar s)
{
return nir_instr_as_alu(s.def->parent_instr)->op;
}
static inline nir_ssa_scalar
nir_ssa_scalar_chase_alu_src(nir_ssa_scalar s, unsigned alu_src_idx)
{
nir_ssa_scalar out = { NULL, 0 };
nir_alu_instr *alu = nir_instr_as_alu(s.def->parent_instr);
assert(alu_src_idx < nir_op_infos[alu->op].num_inputs);
/* Our component must be written */
assert(s.comp < s.def->num_components);
assert(alu->dest.write_mask & (1u << s.comp));
assert(alu->src[alu_src_idx].src.is_ssa);
out.def = alu->src[alu_src_idx].src.ssa;
if (nir_op_infos[alu->op].input_sizes[alu_src_idx] == 0) {
/* The ALU src is unsized so the source component follows the
* destination component.
*/
out.comp = alu->src[alu_src_idx].swizzle[s.comp];
} else {
/* This is a sized source so all source components work together to
* produce all the destination components. Since we need to return a
* scalar, this only works if the source is a scalar.
*/
assert(nir_op_infos[alu->op].input_sizes[alu_src_idx] == 1);
out.comp = alu->src[alu_src_idx].swizzle[0];
}
assert(out.comp < out.def->num_components);
return out;
}
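The component selection in `nir_ssa_scalar_chase_alu_src` can be illustrated in isolation. The sketch below assumes a hypothetical `toy_alu_src` struct that keeps only the input size and the swizzle; it is an illustration of the rule stated in the comments above, not the NIR data structure.

```c
#include <assert.h>

#define TOY_MAX_COMPS 4

/* Hypothetical, simplified ALU source: just a size and a swizzle. */
typedef struct {
   unsigned input_size;                  /* 0 means unsized (per-component) */
   unsigned char swizzle[TOY_MAX_COMPS];
} toy_alu_src;

/* Return the source component that feeds destination component dest_comp. */
static unsigned
toy_chase_src_comp(const toy_alu_src *src, unsigned dest_comp)
{
   assert(dest_comp < TOY_MAX_COMPS);
   if (src->input_size == 0) {
      /* Unsized source: the source component follows the destination. */
      return src->swizzle[dest_comp];
   }
   /* Sized source: only a scalar source can be chased to one component. */
   assert(src->input_size == 1);
   return src->swizzle[0];
}

/* Example sources: an unsized src.wzyx and a sized scalar src.z. */
static const toy_alu_src toy_unsized = { 0, { 3, 2, 1, 0 } };
static const toy_alu_src toy_scalar  = { 1, { 2, 0, 0, 0 } };
```

For the unsized source, destination component 1 maps to source component 2; for the scalar source, every destination component maps to source component 2.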
/*
* Control flow
*
@@ -2196,6 +2391,7 @@ typedef enum {
nir_lower_minmax64 = (1 << 10),
nir_lower_shift64 = (1 << 11),
nir_lower_imul_2x32_64 = (1 << 12),
nir_lower_extract64 = (1 << 13),
} nir_lower_int64_options;
typedef enum {
@@ -2785,6 +2981,7 @@ NIR_SRC_AS_(deref, nir_deref_instr, nir_instr_type_deref, nir_instr_as_deref)
bool nir_src_is_dynamically_uniform(nir_src src);
bool nir_srcs_equal(nir_src src1, nir_src src2);
bool nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2);
void nir_instr_rewrite_src(nir_instr *instr, nir_src *src, nir_src new_src);
void nir_instr_move_src(nir_instr *dest_instr, nir_src *dest, nir_src *src);
void nir_if_rewrite_condition(nir_if *if_stmt, nir_src new_src);
@@ -2994,6 +3191,7 @@ void nir_calc_dominance(nir_shader *shader);
nir_block *nir_dominance_lca(nir_block *b1, nir_block *b2);
bool nir_block_dominates(nir_block *parent, nir_block *child);
bool nir_block_is_unreachable(nir_block *block);
void nir_dump_dom_tree_impl(nir_function_impl *impl, FILE *fp);
void nir_dump_dom_tree(nir_shader *shader, FILE *fp);
@@ -3487,6 +3685,9 @@ bool nir_lower_phis_to_regs_block(nir_block *block);
bool nir_lower_ssa_defs_to_regs_block(nir_block *block);
bool nir_rematerialize_derefs_in_use_blocks_impl(nir_function_impl *impl);
/* This is here for unit tests. */
bool nir_opt_comparison_pre_impl(nir_function_impl *impl);
bool nir_opt_comparison_pre(nir_shader *shader);
bool nir_opt_algebraic(nir_shader *shader);
@@ -3535,6 +3736,7 @@ bool nir_opt_peephole_select(nir_shader *shader, unsigned limit,
bool indirect_load_ok, bool expensive_alu_ok);
bool nir_opt_remove_phis(nir_shader *shader);
bool nir_opt_remove_phis_block(nir_block *block);
bool nir_opt_shrink_load(nir_shader *shader);


@@ -1124,15 +1124,28 @@ nir_store_deref(nir_builder *build, nir_deref_instr *deref,
}
static inline void
nir_copy_deref(nir_builder *build, nir_deref_instr *dest, nir_deref_instr *src)
nir_copy_deref_with_access(nir_builder *build, nir_deref_instr *dest,
nir_deref_instr *src,
enum gl_access_qualifier dest_access,
enum gl_access_qualifier src_access)
{
nir_intrinsic_instr *copy =
nir_intrinsic_instr_create(build->shader, nir_intrinsic_copy_deref);
copy->src[0] = nir_src_for_ssa(&dest->dest.ssa);
copy->src[1] = nir_src_for_ssa(&src->dest.ssa);
nir_intrinsic_set_dst_access(copy, dest_access);
nir_intrinsic_set_src_access(copy, src_access);
nir_builder_instr_insert(build, &copy->instr);
}
static inline void
nir_copy_deref(nir_builder *build, nir_deref_instr *dest, nir_deref_instr *src)
{
nir_copy_deref_with_access(build, dest, src,
(enum gl_access_qualifier) 0,
(enum gl_access_qualifier) 0);
}
static inline nir_ssa_def *
nir_load_var(nir_builder *build, nir_variable *var)
{


@@ -151,9 +151,11 @@ nir_variable_clone(const nir_variable *var, nir_shader *shader)
nvar->name = ralloc_strdup(nvar, var->name);
nvar->data = var->data;
nvar->num_state_slots = var->num_state_slots;
nvar->state_slots = ralloc_array(nvar, nir_state_slot, var->num_state_slots);
memcpy(nvar->state_slots, var->state_slots,
var->num_state_slots * sizeof(nir_state_slot));
if (var->num_state_slots) {
nvar->state_slots = ralloc_array(nvar, nir_state_slot, var->num_state_slots);
memcpy(nvar->state_slots, var->state_slots,
var->num_state_slots * sizeof(nir_state_slot));
}
if (var->constant_initializer) {
nvar->constant_initializer =
nir_constant_clone(var->constant_initializer, nvar);


@@ -414,7 +414,8 @@ nir_eval_const_opcode(nir_op op, nir_const_value *dest,
switch (op) {
% for name in sorted(opcodes.keys()):
case nir_op_${name}:
return evaluate_${name}(dest, num_components, bit_width, src);
evaluate_${name}(dest, num_components, bit_width, src);
return;
% endfor
default:
unreachable("shouldn't get here");


@@ -124,17 +124,15 @@ nir_deref_instr_has_indirect(nir_deref_instr *instr)
unsigned
nir_deref_instr_ptr_as_array_stride(nir_deref_instr *deref)
{
assert(deref->deref_type == nir_deref_type_ptr_as_array);
nir_deref_instr *parent = nir_deref_instr_parent(deref);
switch (parent->deref_type) {
switch (deref->deref_type) {
case nir_deref_type_array:
return glsl_get_explicit_stride(nir_deref_instr_parent(parent)->type);
return glsl_get_explicit_stride(nir_deref_instr_parent(deref)->type);
case nir_deref_type_ptr_as_array:
return nir_deref_instr_ptr_as_array_stride(parent);
return nir_deref_instr_ptr_as_array_stride(nir_deref_instr_parent(deref));
case nir_deref_type_cast:
return parent->cast.ptr_stride;
return deref->cast.ptr_stride;
default:
unreachable("Invalid parent for ptr_as_array deref");
return 0;
}
}


@@ -239,6 +239,20 @@ nir_block_dominates(nir_block *parent, nir_block *child)
child->dom_post_index <= parent->dom_post_index;
}
bool
nir_block_is_unreachable(nir_block *block)
{
assert(nir_cf_node_get_function(&block->cf_node)->valid_metadata &
nir_metadata_dominance);
assert(nir_cf_node_get_function(&block->cf_node)->valid_metadata &
nir_metadata_block_index);
/* Unreachable blocks have no dominator. The only reachable block with no
* dominator is the start block which has index 0.
*/
return block->index > 0 && block->imm_dom == NULL;
}
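The reachability test above relies on one rule: every reachable block except the start block has an immediate dominator. A minimal sketch of that rule, assuming a hypothetical `toy_block` struct with only the two fields the check reads (the metadata validity asserts of the real function are left out):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified block: only what the check needs. */
typedef struct toy_block {
   unsigned index;              /* 0 is the start block */
   struct toy_block *imm_dom;   /* immediate dominator, or NULL */
} toy_block;

static bool
toy_block_is_unreachable(const toy_block *block)
{
   assert(block != NULL);
   /* The start block has no dominator but is reachable by definition. */
   return block->index > 0 && block->imm_dom == NULL;
}

/* Example CFG: start -> body, plus an orphan that nothing branches to. */
static toy_block toy_start  = { 0, NULL };
static toy_block toy_body   = { 1, &toy_start };
static toy_block toy_orphan = { 2, NULL };
```

Only `toy_orphan` is reported as unreachable; `toy_start` is exempt because of its index, and `toy_body` because it has a dominator.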
void
nir_dump_dom_tree_impl(nir_function_impl *impl, FILE *fp)
{


@@ -827,7 +827,7 @@ nir_convert_from_ssa(nir_shader *shader, bool phi_webs_only)
static void
place_phi_read(nir_shader *shader, nir_register *reg,
nir_ssa_def *def, nir_block *block)
nir_ssa_def *def, nir_block *block, unsigned depth)
{
if (block != def->parent_instr->block) {
/* Try to go up the single-successor tree */
@@ -840,14 +840,24 @@ place_phi_read(nir_shader *shader, nir_register *reg,
}
}
if (all_single_successors) {
if (all_single_successors && depth < 32) {
/* All predecessors of this block have exactly one successor and it
* is this block so they must eventually lead here without
* intersecting each other. Place the reads in the predecessors
* instead of this block.
*
* We only let this function recurse 32 times because it can recurse
* indefinitely in the presence of infinite loops. Because we're
* crawling a single-successor chain, it doesn't matter where we
* place it so it's ok to stop at an arbitrary distance.
*
* TODO: One day, we could detect back edges and avoid the recursion
* that way.
*/
set_foreach(block->predecessors, entry)
place_phi_read(shader, reg, def, (nir_block *)entry->key);
set_foreach(block->predecessors, entry) {
place_phi_read(shader, reg, def, (nir_block *)entry->key,
depth + 1);
}
return;
}
}
@@ -904,7 +914,7 @@ nir_lower_phis_to_regs_block(nir_block *block)
assert(src->src.is_ssa);
/* We don't want derefs ending up in phi sources */
assert(!nir_src_as_deref(src->src));
place_phi_read(shader, reg, src->src.ssa, src->pred);
place_phi_read(shader, reg, src->src.ssa, src->pred, 0);
}
nir_instr_remove(&phi->instr);
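The depth cap added to `place_phi_read` can be shown on a much smaller structure. The sketch below is a simplification under stated assumptions: each hypothetical `toy_node` has at most one predecessor (the real code walks a set of predecessors), and `TOY_MAX_DEPTH` mirrors the constant 32 from the diff.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 32

/* Hypothetical block with at most one predecessor. */
typedef struct toy_node {
   struct toy_node *pred;   /* NULL at the top of the chain */
   int has_read;            /* set where the read ends up */
} toy_node;

/* Walk up the predecessor chain and place the read at the top, or at
 * whatever node is reached once the depth cap is hit. Returns the depth
 * at which the read was placed.
 */
static unsigned
toy_place_read(toy_node *node, unsigned depth)
{
   if (node->pred != NULL && depth < TOY_MAX_DEPTH)
      return toy_place_read(node->pred, depth + 1);

   node->has_read = 1;
   return depth;
}

/* A finite chain leaf -> mid -> top, and a node that is its own
 * predecessor, standing in for an infinite loop.
 */
static toy_node toy_top  = { NULL, 0 };
static toy_node toy_mid  = { &toy_top, 0 };
static toy_node toy_leaf = { &toy_mid, 0 };
static toy_node toy_loop = { &toy_loop, 0 };
```

On the finite chain the read lands on `toy_top` at depth 2. On the self-loop the walk would never end without the cap; with it, the recursion stops at depth 32 and places the read there, which is acceptable because any point on a single-successor chain is equally valid.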


@@ -25,6 +25,64 @@
#include "nir_vla.h"
#include "util/half_float.h"
static bool
src_is_ssa(nir_src *src, void *data)
{
(void) data;
return src->is_ssa;
}
static bool
dest_is_ssa(nir_dest *dest, void *data)
{
(void) data;
return dest->is_ssa;
}
static inline bool
instr_each_src_and_dest_is_ssa(const nir_instr *instr)
{
if (!nir_foreach_dest((nir_instr *)instr, dest_is_ssa, NULL) ||
!nir_foreach_src((nir_instr *)instr, src_is_ssa, NULL))
return false;
return true;
}
/* This function determines if uses of an instruction can safely be rewritten
* to use another identical instruction instead. Note that this function must
* be kept in sync with hash_instr() and nir_instrs_equal() -- only
* instructions that pass this test will be handed on to those functions, and
* conversely they must handle everything that this function returns true for.
*/
static bool
instr_can_rewrite(const nir_instr *instr)
{
/* We only handle SSA. */
assert(instr_each_src_and_dest_is_ssa(instr));
switch (instr->type) {
case nir_instr_type_alu:
case nir_instr_type_deref:
case nir_instr_type_tex:
case nir_instr_type_load_const:
case nir_instr_type_phi:
return true;
case nir_instr_type_intrinsic:
return nir_intrinsic_can_reorder(nir_instr_as_intrinsic(instr));
case nir_instr_type_call:
case nir_instr_type_jump:
case nir_instr_type_ssa_undef:
return false;
case nir_instr_type_parallel_copy:
default:
unreachable("Invalid instruction type");
}
return false;
}
#define HASH(hash, data) _mesa_fnv32_1a_accumulate((hash), (data))
static uint32_t
@@ -430,12 +488,16 @@ nir_alu_srcs_negative_equal(const nir_alu_instr *alu1,
if (const2 == NULL)
return false;
if (nir_src_bit_size(alu1->src[src1].src) !=
nir_src_bit_size(alu2->src[src2].src))
return false;
/* FINISHME: Apply the swizzle? */
return nir_const_value_negative_equal(const1,
const2,
nir_ssa_alu_instr_src_components(alu1, src1),
nir_op_infos[alu1->op].input_types[src1],
alu1->dest.dest.ssa.bit_size);
nir_src_bit_size(alu1->src[src1].src));
}
uint8_t alu1_swizzle[4] = {0};
@@ -503,9 +565,11 @@ nir_alu_srcs_equal(const nir_alu_instr *alu1, const nir_alu_instr *alu2,
* the same hash for (ignoring collisions, of course).
*/
static bool
bool
nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2)
{
assert(instr_can_rewrite(instr1) && instr_can_rewrite(instr2));
if (instr1->type != instr2->type)
return false;
@@ -701,68 +765,6 @@ nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2)
unreachable("All cases in the above switch should return");
}
static bool
src_is_ssa(nir_src *src, void *data)
{
(void) data;
return src->is_ssa;
}
static bool
dest_is_ssa(nir_dest *dest, void *data)
{
(void) data;
return dest->is_ssa;
}
static inline bool
instr_each_src_and_dest_is_ssa(nir_instr *instr)
{
if (!nir_foreach_dest(instr, dest_is_ssa, NULL) ||
!nir_foreach_src(instr, src_is_ssa, NULL))
return false;
return true;
}
/* This function determines if uses of an instruction can safely be rewritten
* to use another identical instruction instead. Note that this function must
* be kept in sync with hash_instr() and nir_instrs_equal() -- only
* instructions that pass this test will be handed on to those functions, and
* conversely they must handle everything that this function returns true for.
*/
static bool
instr_can_rewrite(nir_instr *instr)
{
/* We only handle SSA. */
assert(instr_each_src_and_dest_is_ssa(instr));
switch (instr->type) {
case nir_instr_type_alu:
case nir_instr_type_deref:
case nir_instr_type_tex:
case nir_instr_type_load_const:
case nir_instr_type_phi:
return true;
case nir_instr_type_intrinsic: {
const nir_intrinsic_info *info =
&nir_intrinsic_infos[nir_instr_as_intrinsic(instr)->intrinsic];
return (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) &&
(info->flags & NIR_INTRINSIC_CAN_REORDER);
}
case nir_instr_type_call:
case nir_instr_type_jump:
case nir_instr_type_ssa_undef:
return false;
case nir_instr_type_parallel_copy:
default:
unreachable("Invalid instruction type");
}
return false;
}
static nir_ssa_def *
nir_instr_get_dest_ssa_def(nir_instr *instr)
{


@@ -111,6 +111,8 @@ IMAGE_DIM = "NIR_INTRINSIC_IMAGE_DIM"
IMAGE_ARRAY = "NIR_INTRINSIC_IMAGE_ARRAY"
# Access qualifiers for image and memory access intrinsics
ACCESS = "NIR_INTRINSIC_ACCESS"
DST_ACCESS = "NIR_INTRINSIC_DST_ACCESS"
SRC_ACCESS = "NIR_INTRINSIC_SRC_ACCESS"
# Image format for image intrinsics
FORMAT = "NIR_INTRINSIC_FORMAT"
# Offset or address alignment
@@ -152,7 +154,7 @@ intrinsic("load_param", dest_comp=0, indices=[PARAM_IDX], flags=[CAN_ELIMINATE])
intrinsic("load_deref", dest_comp=0, src_comp=[-1],
indices=[ACCESS], flags=[CAN_ELIMINATE])
intrinsic("store_deref", src_comp=[-1, 0], indices=[WRMASK, ACCESS])
intrinsic("copy_deref", src_comp=[-1, -1])
intrinsic("copy_deref", src_comp=[-1, -1], indices=[DST_ACCESS, SRC_ACCESS])
# Interpolation of input. The interp_deref_at* intrinsics are similar to the
# load_var intrinsic acting on a shader input except that they interpolate the
@@ -333,7 +335,8 @@ atomic3("atomic_counter_comp_swap")
# either one or two additional scalar arguments with the same meaning as in
# the ARB_shader_image_load_store specification.
def image(name, src_comp=[], **kwargs):
intrinsic("image_deref_" + name, src_comp=[1] + src_comp, **kwargs)
intrinsic("image_deref_" + name, src_comp=[1] + src_comp,
indices=[ACCESS], **kwargs)
intrinsic("image_" + name, src_comp=[1] + src_comp,
indices=[IMAGE_DIM, IMAGE_ARRAY, FORMAT, ACCESS], **kwargs)
intrinsic("bindless_image_" + name, src_comp=[1] + src_comp,


@@ -32,7 +32,10 @@ typedef enum {
basic_induction
} nir_loop_variable_type;
struct nir_basic_induction_var;
typedef struct nir_basic_induction_var {
nir_alu_instr *alu; /* The def of the alu-operation */
nir_ssa_def *def_outside_loop; /* The phi-src outside the loop */
} nir_basic_induction_var;
typedef struct {
/* A link for the work list */
@@ -57,13 +60,6 @@ typedef struct {
} nir_loop_variable;
typedef struct nir_basic_induction_var {
nir_op alu_op; /* The type of alu-operation */
nir_loop_variable *alu_def; /* The def of the alu-operation */
nir_loop_variable *invariant; /* The invariant alu-operand */
nir_loop_variable *def_outside_loop; /* The phi-src outside the loop */
} nir_basic_induction_var;
typedef struct {
/* The loop we store information for */
nir_loop *loop;
@@ -274,6 +270,44 @@ compute_invariance_information(loop_info_state *state)
}
}
/* If all of the instruction sources point to identical ALU instructions (as
* per nir_instrs_equal), return one of the ALU instructions. Otherwise,
* return NULL.
*/
static nir_alu_instr *
phi_instr_as_alu(nir_phi_instr *phi)
{
nir_alu_instr *first = NULL;
nir_foreach_phi_src(src, phi) {
assert(src->src.is_ssa);
if (src->src.ssa->parent_instr->type != nir_instr_type_alu)
return NULL;
nir_alu_instr *alu = nir_instr_as_alu(src->src.ssa->parent_instr);
if (first == NULL) {
first = alu;
} else {
if (!nir_instrs_equal(&first->instr, &alu->instr))
return NULL;
}
}
return first;
}
static bool
alu_src_has_identity_swizzle(nir_alu_instr *alu, unsigned src_idx)
{
assert(nir_op_infos[alu->op].input_sizes[src_idx] == 0);
assert(alu->dest.dest.is_ssa);
for (unsigned i = 0; i < alu->dest.dest.ssa.num_components; i++) {
if (alu->src[src_idx].swizzle[i] != i)
return false;
}
return true;
}
static bool
compute_induction_information(loop_info_state *state)
{
@@ -298,6 +332,7 @@ compute_induction_information(loop_info_state *state)
nir_phi_instr *phi = nir_instr_as_phi(var->def->parent_instr);
nir_basic_induction_var *biv = rzalloc(state, nir_basic_induction_var);
nir_loop_variable *alu_src_var = NULL;
nir_foreach_phi_src(src, phi) {
nir_loop_variable *src_var = get_loop_var(src->src.ssa, state);
@@ -313,60 +348,44 @@ compute_induction_information(loop_info_state *state)
if (is_var_phi(src_var)) {
nir_phi_instr *src_phi =
nir_instr_as_phi(src_var->def->parent_instr);
nir_op alu_op = nir_num_opcodes; /* avoid uninitialized warning */
nir_ssa_def *alu_srcs[2] = {0};
nir_foreach_phi_src(src2, src_phi) {
nir_loop_variable *src_var2 =
get_loop_var(src2->src.ssa, state);
if (!src_var2->in_if_branch || !is_var_alu(src_var2))
nir_alu_instr *src_phi_alu = phi_instr_as_alu(src_phi);
if (src_phi_alu) {
src_var = get_loop_var(&src_phi_alu->dest.dest.ssa, state);
if (!src_var->in_if_branch)
break;
nir_alu_instr *alu =
nir_instr_as_alu(src_var2->def->parent_instr);
if (nir_op_infos[alu->op].num_inputs != 2)
break;
if (alu->src[0].src.ssa == alu_srcs[0] &&
alu->src[1].src.ssa == alu_srcs[1] &&
alu->op == alu_op) {
/* Both branches perform the same calculation so we can use
* one of them to find the induction variable.
*/
src_var = src_var2;
} else {
alu_srcs[0] = alu->src[0].src.ssa;
alu_srcs[1] = alu->src[1].src.ssa;
alu_op = alu->op;
}
}
}
if (!src_var->in_loop) {
biv->def_outside_loop = src_var;
} else if (is_var_alu(src_var)) {
if (!src_var->in_loop && !biv->def_outside_loop) {
biv->def_outside_loop = src_var->def;
} else if (is_var_alu(src_var) && !biv->alu) {
alu_src_var = src_var;
nir_alu_instr *alu = nir_instr_as_alu(src_var->def->parent_instr);
if (nir_op_infos[alu->op].num_inputs == 2) {
biv->alu_def = src_var;
biv->alu_op = alu->op;
for (unsigned i = 0; i < 2; i++) {
/* Is one of the operands const, and the other the phi */
if (alu->src[i].src.ssa->parent_instr->type == nir_instr_type_load_const &&
alu->src[1-i].src.ssa == &phi->dest.ssa)
biv->invariant = get_loop_var(alu->src[i].src.ssa, state);
/* Is one of the operands const, and the other the phi. The
* phi source can't be swizzled in any way.
*/
if (nir_src_is_const(alu->src[i].src) &&
alu->src[1-i].src.ssa == &phi->dest.ssa &&
alu_src_has_identity_swizzle(alu, 1 - i))
biv->alu = alu;
}
}
if (!biv->alu)
break;
} else {
biv->alu = NULL;
break;
}
}
if (biv->alu_def && biv->def_outside_loop && biv->invariant &&
is_var_constant(biv->def_outside_loop)) {
assert(is_var_constant(biv->invariant));
biv->alu_def->type = basic_induction;
biv->alu_def->ind = biv;
if (biv->alu && biv->def_outside_loop &&
biv->def_outside_loop->parent_instr->type == nir_instr_type_load_const) {
alu_src_var->type = basic_induction;
alu_src_var->ind = biv;
var->type = basic_induction;
var->ind = biv;
@@ -493,7 +512,7 @@ find_array_access_via_induction(loop_info_state *state,
static bool
guess_loop_limit(loop_info_state *state, nir_const_value *limit_val,
nir_loop_variable *basic_ind)
nir_ssa_scalar basic_ind)
{
unsigned min_array_size = 0;
@@ -514,8 +533,10 @@ guess_loop_limit(loop_info_state *state, nir_const_value *limit_val,
find_array_access_via_induction(state,
nir_src_as_deref(intrin->src[0]),
&array_idx);
if (basic_ind == array_idx &&
if (array_idx && basic_ind.def == array_idx->def &&
(min_array_size == 0 || min_array_size > array_size)) {
/* Array indices are scalars */
assert(basic_ind.def->num_components == 1);
min_array_size = array_size;
}
@@ -526,8 +547,10 @@ guess_loop_limit(loop_info_state *state, nir_const_value *limit_val,
find_array_access_via_induction(state,
nir_src_as_deref(intrin->src[1]),
&array_idx);
if (basic_ind == array_idx &&
if (array_idx && basic_ind.def == array_idx->def &&
(min_array_size == 0 || min_array_size > array_size)) {
/* Array indices are scalars */
assert(basic_ind.def->num_components == 1);
min_array_size = array_size;
}
}
@@ -535,7 +558,8 @@ guess_loop_limit(loop_info_state *state, nir_const_value *limit_val,
}
if (min_array_size) {
limit_val->i32 = min_array_size;
*limit_val = nir_const_value_for_uint(min_array_size,
basic_ind.def->bit_size);
return true;
}
@@ -543,71 +567,84 @@ guess_loop_limit(loop_info_state *state, nir_const_value *limit_val,
}
static bool
try_find_limit_of_alu(nir_loop_variable *limit, nir_const_value *limit_val,
try_find_limit_of_alu(nir_ssa_scalar limit, nir_const_value *limit_val,
nir_loop_terminator *terminator, loop_info_state *state)
{
if(!is_var_alu(limit))
if (!nir_ssa_scalar_is_alu(limit))
return false;
nir_alu_instr *limit_alu = nir_instr_as_alu(limit->def->parent_instr);
if (limit_alu->op == nir_op_imin ||
limit_alu->op == nir_op_fmin) {
limit = get_loop_var(limit_alu->src[0].src.ssa, state);
if (!is_var_constant(limit))
limit = get_loop_var(limit_alu->src[1].src.ssa, state);
if (!is_var_constant(limit))
return false;
*limit_val = nir_instr_as_load_const(limit->def->parent_instr)->value[0];
terminator->exact_trip_count_unknown = true;
return true;
nir_op limit_op = nir_ssa_scalar_alu_op(limit);
if (limit_op == nir_op_imin || limit_op == nir_op_fmin) {
for (unsigned i = 0; i < 2; i++) {
nir_ssa_scalar src = nir_ssa_scalar_chase_alu_src(limit, i);
if (nir_ssa_scalar_is_const(src)) {
*limit_val = nir_ssa_scalar_as_const_value(src);
terminator->exact_trip_count_unknown = true;
return true;
}
}
}
return false;
}
static int32_t
get_iteration(nir_op cond_op, nir_const_value *initial, nir_const_value *step,
nir_const_value *limit)
static nir_const_value
eval_const_unop(nir_op op, unsigned bit_size, nir_const_value src0)
{
int32_t iter;
assert(nir_op_infos[op].num_inputs == 1);
nir_const_value dest;
nir_const_value *src[1] = { &src0 };
nir_eval_const_opcode(op, &dest, 1, bit_size, src);
return dest;
}
static nir_const_value
eval_const_binop(nir_op op, unsigned bit_size,
nir_const_value src0, nir_const_value src1)
{
assert(nir_op_infos[op].num_inputs == 2);
nir_const_value dest;
nir_const_value *src[2] = { &src0, &src1 };
nir_eval_const_opcode(op, &dest, 1, bit_size, src);
return dest;
}
static int32_t
get_iteration(nir_op cond_op, nir_const_value initial, nir_const_value step,
nir_const_value limit, unsigned bit_size)
{
nir_const_value span, iter;
switch (cond_op) {
case nir_op_ige:
case nir_op_ilt:
case nir_op_ieq:
case nir_op_ine: {
int32_t initial_val = initial->i32;
int32_t span = limit->i32 - initial_val;
iter = span / step->i32;
case nir_op_ine:
span = eval_const_binop(nir_op_isub, bit_size, limit, initial);
iter = eval_const_binop(nir_op_idiv, bit_size, span, step);
break;
}
case nir_op_uge:
case nir_op_ult: {
uint32_t initial_val = initial->u32;
uint32_t span = limit->u32 - initial_val;
iter = span / step->u32;
case nir_op_ult:
span = eval_const_binop(nir_op_isub, bit_size, limit, initial);
iter = eval_const_binop(nir_op_udiv, bit_size, span, step);
break;
}
case nir_op_fge:
case nir_op_flt:
case nir_op_feq:
case nir_op_fne: {
float initial_val = initial->f32;
float span = limit->f32 - initial_val;
iter = span / step->f32;
case nir_op_fne:
span = eval_const_binop(nir_op_fsub, bit_size, limit, initial);
iter = eval_const_binop(nir_op_fdiv, bit_size, span, step);
iter = eval_const_unop(nir_op_f2i64, bit_size, iter);
break;
}
default:
return -1;
}
return iter;
uint64_t iter_u64 = nir_const_value_as_uint(iter, bit_size);
return iter_u64 > INT_MAX ? -1 : (int)iter_u64;
}
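The unsigned branch of `get_iteration` reduces to a wrapping subtraction, an unsigned division, and a range check on the result. A minimal sketch for the 32-bit case only, using a hypothetical `toy_get_iteration_u32` and plain C arithmetic in place of `nir_eval_const_opcode`:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Estimate the trip count of an unsigned 32-bit induction variable.
 * Returns -1 when the count does not fit in a non-negative int.
 */
static int
toy_get_iteration_u32(uint32_t initial, uint32_t step, uint32_t limit)
{
   assert(step != 0);

   /* Unsigned subtraction wraps modulo 2^32, so a limit below the
    * initial value yields a huge span rather than a negative one.
    */
   uint32_t span = limit - initial;
   uint64_t iter = span / step;

   return iter > INT_MAX ? -1 : (int)iter;
}
```

Counting from 0 to 10 in steps of 1 gives 10 iterations. Counting from 10 towards 0 wraps to a span near 2^32, which exceeds `INT_MAX` and is reported as -1, so the caller can treat it as an unknown trip count.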
static bool
@@ -618,18 +655,18 @@ test_iterations(int32_t iter_int, nir_const_value *step,
{
assert(nir_op_infos[cond_op].num_inputs == 2);
nir_const_value iter_src = {0, };
nir_const_value iter_src;
nir_op mul_op;
nir_op add_op;
switch (induction_base_type) {
case nir_type_float:
iter_src.f32 = (float) iter_int;
iter_src = nir_const_value_for_float(iter_int, bit_size);
mul_op = nir_op_fmul;
add_op = nir_op_fadd;
break;
case nir_type_int:
case nir_type_uint:
iter_src.i32 = iter_int;
iter_src = nir_const_value_for_int(iter_int, bit_size);
mul_op = nir_op_imul;
add_op = nir_op_iadd;
break;
@@ -662,14 +699,12 @@ test_iterations(int32_t iter_int, nir_const_value *step,
static int
calculate_iterations(nir_const_value *initial, nir_const_value *step,
nir_const_value *limit, nir_loop_variable *alu_def,
nir_alu_instr *cond_alu, nir_op alu_op, bool limit_rhs,
nir_const_value *limit, nir_alu_instr *alu,
nir_ssa_scalar cond, nir_op alu_op, bool limit_rhs,
bool invert_cond)
{
assert(initial != NULL && step != NULL && limit != NULL);
nir_alu_instr *alu = nir_instr_as_alu(alu_def->def->parent_instr);
/* nir_op_isub should have been lowered away by this point */
assert(alu->op != nir_op_isub);
@@ -701,12 +736,16 @@ calculate_iterations(nir_const_value *initial, nir_const_value *step,
* condition and if so we assume we need to step the initial value.
*/
unsigned trip_offset = 0;
if (cond_alu->src[0].src.ssa == alu_def->def ||
cond_alu->src[1].src.ssa == alu_def->def) {
nir_alu_instr *cond_alu = nir_instr_as_alu(cond.def->parent_instr);
if (cond_alu->src[0].src.ssa == &alu->dest.dest.ssa ||
cond_alu->src[1].src.ssa == &alu->dest.dest.ssa) {
trip_offset = 1;
}
int iter_int = get_iteration(alu_op, initial, step, limit);
assert(nir_src_bit_size(alu->src[0].src) ==
nir_src_bit_size(alu->src[1].src));
unsigned bit_size = nir_src_bit_size(alu->src[0].src);
int iter_int = get_iteration(alu_op, *initial, *step, *limit, bit_size);
/* If iter_int is negative the loop is ill-formed or the conditional is
* unsigned with a huge iteration count so don't bother going any further.
@@ -723,9 +762,6 @@ calculate_iterations(nir_const_value *initial, nir_const_value *step,
*
* for (float x = 0.0; x != 0.9; x += 0.2);
*/
assert(nir_src_bit_size(alu->src[0].src) ==
nir_src_bit_size(alu->src[1].src));
unsigned bit_size = nir_src_bit_size(alu->src[0].src);
for (int bias = -1; bias <= 1; bias++) {
const int iter_bias = iter_int + bias;
@@ -740,9 +776,9 @@ calculate_iterations(nir_const_value *initial, nir_const_value *step,
}
static nir_op
inverse_comparison(nir_alu_instr *alu)
inverse_comparison(nir_op alu_op)
{
switch (alu->op) {
switch (alu_op) {
case nir_op_fge:
return nir_op_flt;
case nir_op_ige:
@@ -769,95 +805,97 @@ inverse_comparison(nir_alu_instr *alu)
}
static bool
is_supported_terminator_condition(nir_alu_instr *alu)
is_supported_terminator_condition(nir_ssa_scalar cond)
{
if (!nir_ssa_scalar_is_alu(cond))
return false;
nir_alu_instr *alu = nir_instr_as_alu(cond.def->parent_instr);
return nir_alu_instr_is_comparison(alu) &&
nir_op_infos[alu->op].num_inputs == 2;
}
static bool
get_induction_and_limit_vars(nir_alu_instr *alu, nir_loop_variable **ind,
nir_loop_variable **limit,
get_induction_and_limit_vars(nir_ssa_scalar cond,
nir_ssa_scalar *ind,
nir_ssa_scalar *limit,
bool *limit_rhs,
loop_info_state *state)
{
bool limit_rhs = true;
nir_ssa_scalar rhs, lhs;
lhs = nir_ssa_scalar_chase_alu_src(cond, 0);
rhs = nir_ssa_scalar_chase_alu_src(cond, 1);
/* We assume that the limit is the "right" operand */
*ind = get_loop_var(alu->src[0].src.ssa, state);
*limit = get_loop_var(alu->src[1].src.ssa, state);
if ((*ind)->type != basic_induction) {
/* We had it the wrong way, flip things around */
*ind = get_loop_var(alu->src[1].src.ssa, state);
*limit = get_loop_var(alu->src[0].src.ssa, state);
limit_rhs = false;
if (get_loop_var(lhs.def, state)->type == basic_induction) {
*ind = lhs;
*limit = rhs;
*limit_rhs = true;
return true;
} else if (get_loop_var(rhs.def, state)->type == basic_induction) {
*ind = rhs;
*limit = lhs;
*limit_rhs = false;
return true;
} else {
return false;
}
return limit_rhs;
}
static void
try_find_trip_count_vars_in_iand(nir_alu_instr **alu,
nir_loop_variable **ind,
nir_loop_variable **limit,
static bool
try_find_trip_count_vars_in_iand(nir_ssa_scalar *cond,
nir_ssa_scalar *ind,
nir_ssa_scalar *limit,
bool *limit_rhs,
loop_info_state *state)
{
assert((*alu)->op == nir_op_ieq || (*alu)->op == nir_op_inot);
const nir_op alu_op = nir_ssa_scalar_alu_op(*cond);
assert(alu_op == nir_op_ieq || alu_op == nir_op_inot);
nir_ssa_def *iand_def = (*alu)->src[0].src.ssa;
nir_ssa_scalar iand = nir_ssa_scalar_chase_alu_src(*cond, 0);
if ((*alu)->op == nir_op_ieq) {
nir_ssa_def *zero_def = (*alu)->src[1].src.ssa;
if (iand_def->parent_instr->type != nir_instr_type_alu ||
zero_def->parent_instr->type != nir_instr_type_load_const) {
if (alu_op == nir_op_ieq) {
nir_ssa_scalar zero = nir_ssa_scalar_chase_alu_src(*cond, 1);
if (!nir_ssa_scalar_is_alu(iand) || !nir_ssa_scalar_is_const(zero)) {
/* Maybe we had it the wrong way, flip things around */
iand_def = (*alu)->src[1].src.ssa;
zero_def = (*alu)->src[0].src.ssa;
nir_ssa_scalar tmp = zero;
zero = iand;
iand = tmp;
/* If we still didn't find what we need then return */
if (zero_def->parent_instr->type != nir_instr_type_load_const)
return;
if (!nir_ssa_scalar_is_const(zero))
return false;
}
/* If the loop is not breaking on (x && y) == 0 then return */
nir_const_value *zero =
nir_instr_as_load_const(zero_def->parent_instr)->value;
if (zero[0].i32 != 0)
return;
if (nir_ssa_scalar_as_uint(zero) != 0)
return false;
}
if (iand_def->parent_instr->type != nir_instr_type_alu)
return;
if (!nir_ssa_scalar_is_alu(iand))
return false;
nir_alu_instr *iand = nir_instr_as_alu(iand_def->parent_instr);
if (iand->op != nir_op_iand)
return;
if (nir_ssa_scalar_alu_op(iand) != nir_op_iand)
return false;
/* Check if iand src is a terminator condition and try get induction var
* and trip limit var.
*/
nir_ssa_def *src = iand->src[0].src.ssa;
if (src->parent_instr->type == nir_instr_type_alu) {
*alu = nir_instr_as_alu(src->parent_instr);
if (is_supported_terminator_condition(*alu))
*limit_rhs = get_induction_and_limit_vars(*alu, ind, limit, state);
}
bool found_induction_var = false;
for (unsigned i = 0; i < 2; i++) {
nir_ssa_scalar src = nir_ssa_scalar_chase_alu_src(iand, i);
if (is_supported_terminator_condition(src) &&
get_induction_and_limit_vars(src, ind, limit, limit_rhs, state)) {
*cond = src;
found_induction_var = true;
/* Try the other iand src if needed */
if (*ind == NULL || (*ind && (*ind)->type != basic_induction) ||
!is_var_constant(*limit)) {
src = iand->src[1].src.ssa;
if (src->parent_instr->type == nir_instr_type_alu) {
nir_alu_instr *tmp_alu = nir_instr_as_alu(src->parent_instr);
if (is_supported_terminator_condition(tmp_alu)) {
*alu = tmp_alu;
*limit_rhs = get_induction_and_limit_vars(*alu, ind, limit, state);
}
/* If we've found one with a constant limit, stop. */
if (nir_ssa_scalar_is_const(*limit))
return true;
}
}
return found_induction_var;
}
/* Run through each of the terminators of the loop and try to infer a possible
@@ -877,8 +915,10 @@ find_trip_count(loop_info_state *state)
list_for_each_entry(nir_loop_terminator, terminator,
&state->loop->info->loop_terminator_list,
loop_terminator_link) {
assert(terminator->nif->condition.is_ssa);
nir_ssa_scalar cond = { terminator->nif->condition.ssa, 0 };
if (terminator->conditional_instr->type != nir_instr_type_alu) {
if (!nir_ssa_scalar_is_alu(cond)) {
/* If we get here the loop is dead and will get cleaned up by the
* nir_opt_dead_cf pass.
*/
@@ -886,43 +926,35 @@ find_trip_count(loop_info_state *state)
continue;
}
nir_alu_instr *alu = nir_instr_as_alu(terminator->conditional_instr);
nir_op alu_op = alu->op;
nir_op alu_op = nir_ssa_scalar_alu_op(cond);
bool limit_rhs;
nir_loop_variable *basic_ind = NULL;
nir_loop_variable *limit;
if (alu->op == nir_op_inot || alu->op == nir_op_ieq) {
nir_alu_instr *new_alu = alu;
try_find_trip_count_vars_in_iand(&new_alu, &basic_ind, &limit,
&limit_rhs, state);
nir_ssa_scalar basic_ind = { NULL, 0 };
nir_ssa_scalar limit;
if ((alu_op == nir_op_inot || alu_op == nir_op_ieq) &&
try_find_trip_count_vars_in_iand(&cond, &basic_ind, &limit,
&limit_rhs, state)) {
/* The loop is exiting on (x && y) == 0 so we need to get the
* inverse of x or y (i.e. whichever contained the induction var) in
* order to compute the trip count.
*/
if (basic_ind && basic_ind->type == basic_induction) {
alu = new_alu;
alu_op = inverse_comparison(alu);
trip_count_known = false;
terminator->exact_trip_count_unknown = true;
}
alu_op = inverse_comparison(nir_ssa_scalar_alu_op(cond));
trip_count_known = false;
terminator->exact_trip_count_unknown = true;
}
if (!basic_ind) {
if (!is_supported_terminator_condition(alu)) {
trip_count_known = false;
continue;
if (!basic_ind.def) {
if (is_supported_terminator_condition(cond)) {
get_induction_and_limit_vars(cond, &basic_ind,
&limit, &limit_rhs, state);
}
limit_rhs = get_induction_and_limit_vars(alu, &basic_ind, &limit,
state);
}
/* The comparison has to have a basic induction variable for us to be
* able to find trip counts.
*/
if (basic_ind->type != basic_induction) {
if (!basic_ind.def) {
trip_count_known = false;
continue;
}
@@ -931,9 +963,8 @@ find_trip_count(loop_info_state *state)
/* Attempt to find a constant limit for the loop */
nir_const_value limit_val;
if (is_var_constant(limit)) {
limit_val =
nir_instr_as_load_const(limit->def->parent_instr)->value[0];
if (nir_ssa_scalar_is_const(limit)) {
limit_val = nir_ssa_scalar_as_const_value(limit);
} else {
trip_count_known = false;
@@ -955,17 +986,38 @@ find_trip_count(loop_info_state *state)
* That's all that's needed to calculate the trip-count
*/
nir_const_value *initial_val =
nir_instr_as_load_const(basic_ind->ind->def_outside_loop->
def->parent_instr)->value;
nir_basic_induction_var *ind_var =
get_loop_var(basic_ind.def, state)->ind;
nir_const_value *step_val =
nir_instr_as_load_const(basic_ind->ind->invariant->def->
parent_instr)->value;
/* The basic induction var might be a vector but, because we guarantee
* earlier that the phi source has a scalar swizzle, we can take the
* component from basic_ind.
*/
nir_ssa_scalar initial_s = { ind_var->def_outside_loop, basic_ind.comp };
nir_ssa_scalar alu_s = { &ind_var->alu->dest.dest.ssa, basic_ind.comp };
int iterations = calculate_iterations(initial_val, step_val,
nir_const_value initial_val = nir_ssa_scalar_as_const_value(initial_s);
/* We are guaranteed by earlier code that at least one of these sources
* is a constant but we don't know which.
*/
nir_const_value step_val;
memset(&step_val, 0, sizeof(step_val));
UNUSED bool found_step_value = false;
assert(nir_op_infos[ind_var->alu->op].num_inputs == 2);
for (unsigned i = 0; i < 2; i++) {
nir_ssa_scalar alu_src = nir_ssa_scalar_chase_alu_src(alu_s, i);
if (nir_ssa_scalar_is_const(alu_src)) {
found_step_value = true;
step_val = nir_ssa_scalar_as_const_value(alu_src);
break;
}
}
assert(found_step_value);
int iterations = calculate_iterations(&initial_val, &step_val,
&limit_val,
basic_ind->ind->alu_def, alu,
ind_var->alu, cond,
alu_op, limit_rhs,
terminator->continue_from_then);
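
The find_trip_count hunks above mark the trip count as inexact when the loop exits on (x && y) == 0, because only one side of the iand carries the induction variable. The following is a minimal standalone sketch of that reasoning, not Mesa code: count_iterations, other_cond and cutoff are hypothetical names introduced for illustration. The comparison on the induction variable gives an upper bound, and the other operand may end the loop earlier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical second operand of the "and": true until i reaches cutoff. */
static bool other_cond(int i, int cutoff)
{
   return i < cutoff;
}

/* Counts iterations of a loop that breaks when ((i < limit) && y) == 0. */
static int count_iterations(int limit, int cutoff)
{
   int trips = 0;
   for (int i = 0; ; i++) {
      bool x = i < limit;             /* comparison on the induction var */
      bool y = other_cond(i, cutoff); /* unrelated condition */
      if ((x && y) == 0)
         break;
      trips++;
   }
   return trips;
}

/* The limit alone only bounds the trip count from above. */
static void check_trip_count_bound(void)
{
   assert(count_iterations(10, 100) == 10); /* limit decides */
   assert(count_iterations(10, 4) == 4);    /* other operand exits early */
   assert(count_iterations(10, 4) <= 10);
}
```

This is why an analysis that only looks at i < limit can report a maximum trip count for such a loop but has to treat the exact count as unknown.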


@@ -629,6 +629,34 @@ lower_irem64(nir_builder *b, nir_ssa_def *n, nir_ssa_def *d)
return nir_bcsel(b, n_is_neg, nir_ineg(b, r), r);
}
static nir_ssa_def *
lower_extract(nir_builder *b, nir_op op, nir_ssa_def *x, nir_ssa_def *c)
{
assert(op == nir_op_extract_u8 || op == nir_op_extract_i8 ||
op == nir_op_extract_u16 || op == nir_op_extract_i16);
const int chunk = nir_src_as_uint(nir_src_for_ssa(c));
const int chunk_bits =
(op == nir_op_extract_u8 || op == nir_op_extract_i8) ? 8 : 16;
const int num_chunks_in_32 = 32 / chunk_bits;
nir_ssa_def *extract32;
if (chunk < num_chunks_in_32) {
extract32 = nir_build_alu(b, op, nir_unpack_64_2x32_split_x(b, x),
nir_imm_int(b, chunk),
NULL, NULL);
} else {
extract32 = nir_build_alu(b, op, nir_unpack_64_2x32_split_y(b, x),
nir_imm_int(b, chunk - num_chunks_in_32),
NULL, NULL);
}
if (op == nir_op_extract_i8 || op == nir_op_extract_i16)
return lower_i2i64(b, extract32);
else
return lower_u2u64(b, extract32);
}
nir_lower_int64_options
nir_lower_int64_op_to_options_mask(nir_op opcode)
{
@@ -685,6 +713,11 @@ nir_lower_int64_op_to_options_mask(nir_op opcode)
case nir_op_ishr:
case nir_op_ushr:
return nir_lower_shift64;
case nir_op_extract_u8:
case nir_op_extract_i8:
case nir_op_extract_u16:
case nir_op_extract_i16:
return nir_lower_extract64;
default:
return 0;
}
@@ -779,6 +812,11 @@ lower_int64_alu_instr(nir_builder *b, nir_alu_instr *alu)
return lower_ishr64(b, src[0], src[1]);
case nir_op_ushr:
return lower_ushr64(b, src[0], src[1]);
case nir_op_extract_u8:
case nir_op_extract_i8:
case nir_op_extract_u16:
case nir_op_extract_i16:
return lower_extract(b, alu->op, src[0], src[1]);
default:
unreachable("Invalid ALU opcode to lower");
}
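
The lower_extract helper added in the hunks above picks the low or high 32-bit half of the 64-bit source depending on the chunk index, extracts from that half, and then widens the result. A rough plain-C model of the same selection is sketched below. The function names are hypothetical, and the sign extension is done directly in 64 bits here rather than through a 32-bit extract followed by a conversion, so treat it as an illustration of the arithmetic only.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned extract of chunk number 'chunk' (8 or 16 bits wide) from 32 bits. */
static uint32_t extract_u32(uint32_t word, unsigned chunk, unsigned chunk_bits)
{
   const uint32_t mask = (chunk_bits == 8) ? 0xffu : 0xffffu;
   return (word >> (chunk * chunk_bits)) & mask;
}

/* Low half for small chunk indices, high half (with rebased index) otherwise. */
static uint64_t extract_u64(uint64_t x, unsigned chunk, unsigned chunk_bits)
{
   const unsigned num_chunks_in_32 = 32 / chunk_bits;
   const uint32_t lo = (uint32_t)x;
   const uint32_t hi = (uint32_t)(x >> 32);

   if (chunk < num_chunks_in_32)
      return extract_u32(lo, chunk, chunk_bits);
   else
      return extract_u32(hi, chunk - num_chunks_in_32, chunk_bits);
}

/* Signed variant: sign-extend the extracted chunk to 64 bits. */
static int64_t extract_i64(uint64_t x, unsigned chunk, unsigned chunk_bits)
{
   const uint64_t u = extract_u64(x, chunk, chunk_bits);
   const uint64_t sign = UINT64_C(1) << (chunk_bits - 1);
   return (int64_t)((u ^ sign) - sign);
}

static void check_extract(void)
{
   const uint64_t x = UINT64_C(0x1122334455667788);
   assert(extract_u64(x, 0, 8) == 0x88);    /* lowest byte, low half */
   assert(extract_u64(x, 5, 8) == 0x33);    /* byte 1 of the high half */
   assert(extract_u64(x, 3, 16) == 0x1122); /* top 16-bit word */
   assert(extract_i64(x, 0, 8) == -120);    /* 0x88 as a signed byte */
}
```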


@@ -34,6 +34,7 @@ read_first_invocation(nir_builder *b, nir_ssa_def *x)
first->src[0] = nir_src_for_ssa(x);
nir_ssa_dest_init(&first->instr, &first->dest,
x->num_components, x->bit_size, NULL);
nir_builder_instr_insert(b, &first->instr);
return &first->dest.ssa;
}
@@ -128,8 +129,8 @@ nir_lower_non_uniform_access_impl(nir_function_impl *impl,
nir_builder b;
nir_builder_init(&b, impl);
nir_foreach_block(block, impl) {
nir_foreach_instr(instr, block) {
nir_foreach_block_safe(block, impl) {
nir_foreach_instr_safe(instr, block) {
switch (instr->type) {
case nir_instr_type_tex: {
nir_tex_instr *tex = nir_instr_as_tex(instr);
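
The switch from nir_foreach_block/nir_foreach_instr to the _safe variants in the hunk above matters when the loop body modifies the list it is walking. The sketch below shows the general idea on a hypothetical singly linked list; foreach_node_safe, struct node and the helpers are invented for illustration and are not the NIR macros. The iterator caches the next element before the body runs, so the body may unlink or free the current one.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
   int value;
   struct node *next;
};

/* Hypothetical "safe" iterator: 'tmp' holds the successor before the body. */
#define foreach_node_safe(n, tmp, head)                        \
   for (struct node *n = (head), *tmp = n ? n->next : NULL;    \
        n != NULL;                                             \
        n = tmp, tmp = n ? n->next : NULL)

static struct node *push(struct node *head, int value)
{
   struct node *n = malloc(sizeof(*n));
   if (n == NULL)
      abort();
   n->value = value;
   n->next = head;
   return n;
}

/* Frees every odd-valued node while iterating over the same list. */
static struct node *remove_odd(struct node *head)
{
   struct node **link = &head;
   foreach_node_safe(n, next, head) {
      if (n->value & 1) {
         *link = n->next;
         free(n); /* n is gone; the loop continues from the cached next */
      } else {
         link = &n->next;
      }
   }
   return head;
}

static int list_sum(const struct node *head)
{
   int sum = 0;
   for (; head != NULL; head = head->next)
      sum += head->value;
   return sum;
}

static void free_list(struct node *head)
{
   foreach_node_safe(n, next, head)
      free(n);
}

static void check_remove_odd(void)
{
   struct node *head = NULL;
   for (int i = 1; i <= 4; i++)
      head = push(head, i);
   head = remove_odd(head);
   assert(list_sum(head) == 6); /* only 4 and 2 remain */
   free_list(head);
}
```

With a non-safe iterator, the step expression would read n->next after n had been freed.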


@@ -251,9 +251,17 @@ nir_lower_regs_to_ssa_impl(nir_function_impl *impl)
nir_foreach_block(block, impl) {
nir_foreach_instr(instr, block) {
if (instr->type == nir_instr_type_alu) {
switch (instr->type) {
case nir_instr_type_alu:
rewrite_alu_instr(nir_instr_as_alu(instr), &state);
} else {
break;
case nir_instr_type_phi:
/* We rewrite sources as a separate pass */
nir_foreach_dest(instr, rewrite_dest, &state);
break;
default:
nir_foreach_src(instr, rewrite_src, &state);
nir_foreach_dest(instr, rewrite_dest, &state);
}
@@ -262,6 +270,28 @@ nir_lower_regs_to_ssa_impl(nir_function_impl *impl)
nir_if *following_if = nir_block_get_following_if(block);
if (following_if)
rewrite_if_condition(following_if, &state);
/* Handle phi sources that source from this block. We have to do this
* as a separate pass because the phi builder assumes that uses and
* defs are processed in an order that respects dominance. When we have
* loops, a phi source may be a back-edge so we have to handle it as if
* it were one of the last instructions in the predecessor block.
*/
for (unsigned i = 0; i < ARRAY_SIZE(block->successors); i++) {
if (block->successors[i] == NULL)
continue;
nir_foreach_instr(instr, block->successors[i]) {
if (instr->type != nir_instr_type_phi)
break;
nir_phi_instr *phi = nir_instr_as_phi(instr);
nir_foreach_phi_src(phi_src, phi) {
if (phi_src->pred == block)
rewrite_src(&phi_src->src, &state);
}
}
}
}
nir_phi_builder_finish(phi_build);


@@ -56,7 +56,9 @@ emit_deref_copy_load_store(nir_builder *b,
nir_deref_instr *dst_deref,
nir_deref_instr **dst_deref_arr,
nir_deref_instr *src_deref,
nir_deref_instr **src_deref_arr)
nir_deref_instr **src_deref_arr,
enum gl_access_qualifier dst_access,
enum gl_access_qualifier src_access)
{
if (dst_deref_arr || src_deref_arr) {
assert(dst_deref_arr && src_deref_arr);
@@ -79,14 +81,16 @@ emit_deref_copy_load_store(nir_builder *b,
nir_build_deref_array_imm(b, dst_deref, i),
dst_deref_arr + 1,
nir_build_deref_array_imm(b, src_deref, i),
src_deref_arr + 1);
src_deref_arr + 1, dst_access, src_access);
}
} else {
assert(glsl_get_bare_type(dst_deref->type) ==
glsl_get_bare_type(src_deref->type));
assert(glsl_type_is_vector_or_scalar(dst_deref->type));
nir_store_deref(b, dst_deref, nir_load_deref(b, src_deref), ~0);
nir_store_deref_with_access(b, dst_deref,
nir_load_deref_with_access(b, src_deref, src_access),
~0, src_access);
}
}
@@ -106,7 +110,9 @@ nir_lower_deref_copy_instr(nir_builder *b, nir_intrinsic_instr *copy)
b->cursor = nir_before_instr(&copy->instr);
emit_deref_copy_load_store(b, dst_path.path[0], &dst_path.path[1],
src_path.path[0], &src_path.path[1]);
src_path.path[0], &src_path.path[1],
nir_intrinsic_dst_access(copy),
nir_intrinsic_src_access(copy));
nir_deref_path_finish(&dst_path);
nir_deref_path_finish(&src_path);


@@ -985,7 +985,7 @@ def bitfield_reverse(u):
return step5
optimizations += [(bitfield_reverse('x@32'), ('bitfield_reverse', 'x'))]
optimizations += [(bitfield_reverse('x@32'), ('bitfield_reverse', 'x'), '!options->lower_bitfield_reverse')]
# For any float comparison operation, "cmp", if you have "a == a && a cmp b"
# then the "a == a" is redundant because it's equivalent to "a is not NaN"
@@ -1086,9 +1086,6 @@ late_optimizations = [
(('fdot4', a, b), ('fdot_replicated4', a, b), 'options->fdot_replicates'),
(('fdph', a, b), ('fdph_replicated', a, b), 'options->fdot_replicates'),
(('b2f(is_used_more_than_once)', ('inot', 'a@1')), ('bcsel', a, 0.0, 1.0)),
(('fneg(is_used_more_than_once)', ('b2f', ('inot', 'a@1'))), ('bcsel', a, -0.0, -1.0)),
# we do these late so that we don't get in the way of creating ffmas
(('fmin', ('fadd(is_used_once)', '#c', a), ('fadd(is_used_once)', '#c', b)), ('fadd', c, ('fmin', a, b))),
(('fmax', ('fadd(is_used_once)', '#c', a), ('fadd(is_used_once)', '#c', b)), ('fadd', c, ('fmax', a, b))),


@@ -107,8 +107,10 @@ push_block(struct block_queue *bq)
if (!u_vector_init(&bi->instructions,
sizeof(nir_alu_instr *),
8 * sizeof(nir_alu_instr *)))
8 * sizeof(nir_alu_instr *))) {
free(bi);
return NULL;
}
exec_list_push_tail(&bq->blocks, &bi->node);
@@ -346,7 +348,7 @@ comparison_pre_block(nir_block *block, struct block_queue *bq, nir_builder *bld)
return progress;
}
static bool
bool
nir_opt_comparison_pre_impl(nir_function_impl *impl)
{
struct block_queue bq;
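
The push_block hunk above adds a free(bi) on the path where initialising the instruction vector fails, so the freshly allocated block no longer leaks. The pattern can be sketched in isolation as follows. All names here (block_instructions, init_items, push_block_sketch, the live_allocations counter) are hypothetical stand-ins, and the forced-failure flag exists only to exercise the error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct block_instructions {
   int *items;
   size_t capacity;
};

/* Crude counter of outstanding blocks, used only to make leaks visible. */
static int live_allocations;

/* Stand-in for the vector init; 'fail' forces the error path. */
static bool init_items(struct block_instructions *bi, size_t capacity, bool fail)
{
   if (fail)
      return false;
   bi->items = calloc(capacity, sizeof(*bi->items));
   bi->capacity = capacity;
   return bi->items != NULL;
}

static struct block_instructions *push_block_sketch(bool fail_init)
{
   struct block_instructions *bi = calloc(1, sizeof(*bi));
   if (bi == NULL)
      return NULL;
   live_allocations++;

   if (!init_items(bi, 8, fail_init)) {
      /* Without this free the block would be unreachable and leaked. */
      free(bi);
      live_allocations--;
      return NULL;
   }
   return bi;
}

static void destroy_block(struct block_instructions *bi)
{
   if (bi == NULL)
      return;
   free(bi->items);
   free(bi);
   live_allocations--;
}

static void check_no_leak_on_failure(void)
{
   assert(push_block_sketch(true) == NULL);
   assert(live_allocations == 0);
}
```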

View File

@@ -216,7 +216,7 @@ node_is_dead(nir_cf_node *node)
nir_foreach_instr(instr, block) {
if (instr->type == nir_instr_type_call)
return true;
return false;
/* Return instructions can cause us to skip over other side-effecting
* instructions after the loop, so consider them to have side effects
@@ -355,11 +355,22 @@ opt_dead_cf_impl(nir_function_impl *impl)
if (progress) {
nir_metadata_preserve(impl, nir_metadata_none);
} else {
/* The CF manipulation code called by this pass is smart enough to keep
* from breaking any SSA use/def chains by replacing any uses of removed
* instructions with SSA undefs. However, it's not quite smart enough
* to always preserve the dominance properties. In particular, if you
* remove the one break from a loop, stuff in the loop may still be used
* outside the loop even though there's no path between the two. We can
* easily fix these issues by calling nir_repair_ssa which will ensure
* that the dominance properties hold.
*/
nir_repair_ssa_impl(impl);
} else {
#ifndef NDEBUG
impl->valid_metadata &= ~nir_metadata_not_properly_reset;
#endif
}
}
return progress;
}


@@ -152,11 +152,7 @@ gcm_pin_instructions_block(nir_block *block, struct gcm_state *state)
break;
case nir_instr_type_intrinsic: {
const nir_intrinsic_info *info =
&nir_intrinsic_infos[nir_instr_as_intrinsic(instr)->intrinsic];
if ((info->flags & NIR_INTRINSIC_CAN_ELIMINATE) &&
(info->flags & NIR_INTRINSIC_CAN_REORDER)) {
if (nir_intrinsic_can_reorder(nir_instr_as_intrinsic(instr))) {
instr->pass_flags = 0;
} else {
instr->pass_flags = GCM_INSTR_PINNED;

View File

@@ -65,15 +65,17 @@ build_umod(nir_builder *b, nir_ssa_def *n, uint64_t d)
static nir_ssa_def *
build_idiv(nir_builder *b, nir_ssa_def *n, int64_t d)
{
uint64_t abs_d = d < 0 ? -d : d;
if (d == 0) {
return nir_imm_intN_t(b, 0, n->bit_size);
} else if (d == 1) {
return n;
} else if (d == -1) {
return nir_ineg(b, n);
} else if (util_is_power_of_two_or_zero64(d)) {
uint64_t abs_d = d < 0 ? -d : d;
nir_ssa_def *uq = nir_ishr(b, n, nir_imm_int(b, util_logbase2_64(abs_d)));
} else if (util_is_power_of_two_or_zero64(abs_d)) {
nir_ssa_def *uq = nir_ushr(b, nir_iabs(b, n),
nir_imm_int(b, util_logbase2_64(abs_d)));
nir_ssa_def *n_neg = nir_ilt(b, n, nir_imm_intN_t(b, 0, n->bit_size));
nir_ssa_def *neg = d < 0 ? nir_inot(b, n_neg) : n_neg;
return nir_bcsel(b, neg, nir_ineg(b, uq), uq);
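
The build_idiv hunk above replaces a shift of the signed numerator with an unsigned shift of its absolute value followed by a sign fix-up, and tests abs_d rather than d for being a power of two. A plain-C model of the new shape is sketched below; idiv_pow2 and log2_u64 are hypothetical names, and the sketch assumes d is plus or minus a power of two, as in the branch being modelled. The point is that signed division truncates toward zero, whereas an arithmetic right shift of a negative value typically rounds toward negative infinity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Floor of log2 for a non-zero value. */
static unsigned log2_u64(uint64_t v)
{
   unsigned r = 0;
   while (v >>= 1)
      r++;
   return r;
}

/* Signed division by +/- 2^k: shift |n|, then restore the sign. */
static int64_t idiv_pow2(int64_t n, int64_t d)
{
   const uint64_t abs_d = d < 0 ? 0 - (uint64_t)d : (uint64_t)d;
   const uint64_t abs_n = n < 0 ? 0 - (uint64_t)n : (uint64_t)n;
   const uint64_t uq = abs_n >> log2_u64(abs_d);

   const bool n_neg = n < 0;
   const bool neg = d < 0 ? !n_neg : n_neg;

   /* Negate in unsigned arithmetic to avoid signed overflow. */
   return neg ? (int64_t)(0 - uq) : (int64_t)uq;
}

static void check_idiv_pow2(void)
{
   assert(idiv_pow2(-7, 4) == -7 / 4);  /* -1, not -2 */
   assert(idiv_pow2(7, -4) == 7 / -4);  /* -1 */
   assert(idiv_pow2(-8, -2) == 4);
   assert(idiv_pow2(9, 8) == 1);
}
```

For comparison, -7 >> 2 on the usual two's-complement arithmetic shift gives -2, which is the kind of mismatch the unsigned shift avoids.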


@@ -1040,6 +1040,13 @@ opt_if_loop_terminator(nir_if *nif)
if (!nir_is_trivial_loop_if(nif, break_blk))
return false;
/* Even though this if statement has a jump on one side, we may still have
* phis afterwards. Single-source phis can be produced by loop unrolling
* or dead control-flow passes and are perfectly legal. Run a quick phi
* removal on the block after the if to clean up any such phis.
*/
nir_opt_remove_phis_block(nir_cf_node_as_block(nir_cf_node_next(&nif->cf_node)));
/* Finally, move the continue from branch after the if-statement. */
nir_cf_list tmp;
nir_cf_extract(&tmp, nir_before_block(first_continue_from_blk),


@@ -560,31 +560,7 @@ wrapper_unroll(nir_loop *loop)
nir_after_block(nir_if_last_else_block(terminator->nif));
}
} else {
nir_block *blk_after_loop =
nir_cursor_current_block(nir_after_cf_node(&loop->cf_node));
/* There may still be some single src phis following the loop that
* have not yet been cleaned up by another pass. Tidy those up
* before unrolling the loop.
*/
nir_foreach_instr_safe(instr, blk_after_loop) {
if (instr->type != nir_instr_type_phi)
break;
nir_phi_instr *phi = nir_instr_as_phi(instr);
assert(exec_list_length(&phi->srcs) == 1);
nir_phi_src *phi_src =
exec_node_data(nir_phi_src, exec_list_get_head(&phi->srcs), node);
nir_ssa_def_rewrite_uses(&phi->dest.ssa, phi_src->src);
nir_instr_remove(instr);
}
/* Remove break at end of the loop */
nir_block *last_loop_blk = nir_loop_last_block(loop);
nir_instr *break_instr = nir_block_last_instr(last_loop_blk);
nir_instr_remove(break_instr);
loop_prepare_for_unroll(loop);
}
/* Pluck out the loop body. */


@@ -91,7 +91,7 @@ move_load_ubo(nir_block *block)
}
}
return false;
return progress;
}
bool


@@ -109,12 +109,13 @@ remove_phis_block(nir_block *block, nir_builder *b)
if (!srcs_same)
continue;
/* We must have found at least one definition, since there must be at
* least one forward edge.
*/
assert(def != NULL);
if (!def) {
/* In this case, the phi had no sources. So turn it into an undef. */
if (mov) {
b->cursor = nir_after_phis(block);
def = nir_ssa_undef(b, phi->dest.ssa.num_components,
phi->dest.ssa.bit_size);
} else if (mov) {
/* If the sources were all movs from the same source with the same
* swizzle, then we can't just pick a random move because it may not
* dominate the phi node. Instead, we need to emit our own move after
@@ -139,6 +140,14 @@ remove_phis_block(nir_block *block, nir_builder *b)
return progress;
}
bool
nir_opt_remove_phis_block(nir_block *block)
{
nir_builder b;
nir_builder_init(&b, nir_cf_node_get_function(&block->cf_node));
return remove_phis_block(block, &b);
}
static bool
nir_opt_remove_phis_impl(nir_function_impl *impl)
{


@@ -771,6 +771,8 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, print_state *state)
[NIR_INTRINSIC_IMAGE_DIM] = "image_dim",
[NIR_INTRINSIC_IMAGE_ARRAY] = "image_array",
[NIR_INTRINSIC_ACCESS] = "access",
[NIR_INTRINSIC_SRC_ACCESS] = "src-access",
[NIR_INTRINSIC_DST_ACCESS] = "dst-access",
[NIR_INTRINSIC_FORMAT] = "format",
[NIR_INTRINSIC_ALIGN_MUL] = "align_mul",
[NIR_INTRINSIC_ALIGN_OFFSET] = "align_offset",


@@ -65,12 +65,21 @@ add_cf_node(nir_cf_node *cf, struct set *invariants)
static void
add_var(nir_variable *var, struct set *invariants)
{
_mesa_set_add(invariants, var);
/* Because we pass the result of nir_intrinsic_get_var directly to this
* function, it's possible for var to be NULL if, for instance, there's a
* cast somewhere in the chain.
*/
if (var != NULL)
_mesa_set_add(invariants, var);
}
static bool
var_is_invariant(nir_variable *var, struct set * invariants)
{
/* Because we pass the result of nir_intrinsic_get_var directly to this
* function, it's possible for var to be NULL if, for instance, there's a
* cast somewhere in the chain.
*/
return var && (var->data.invariant || _mesa_set_search(invariants, var));
}


@@ -71,7 +71,8 @@ repair_ssa_def(nir_ssa_def *def, void *void_state)
bool is_valid = true;
nir_foreach_use(src, def) {
if (!nir_block_dominates(def->parent_instr->block, get_src_block(src))) {
if (nir_block_is_unreachable(get_src_block(src)) ||
!nir_block_dominates(def->parent_instr->block, get_src_block(src))) {
is_valid = false;
break;
}
@@ -80,7 +81,8 @@ repair_ssa_def(nir_ssa_def *def, void *void_state)
nir_foreach_if_use(src, def) {
nir_block *block_before_if =
nir_cf_node_as_block(nir_cf_node_prev(&src->parent_if->cf_node));
if (!nir_block_dominates(def->parent_instr->block, block_before_if)) {
if (nir_block_is_unreachable(block_before_if) ||
!nir_block_dominates(def->parent_instr->block, block_before_if)) {
is_valid = false;
break;
}
@@ -101,19 +103,57 @@ repair_ssa_def(nir_ssa_def *def, void *void_state)
nir_foreach_use_safe(src, def) {
nir_block *src_block = get_src_block(src);
if (!nir_block_dominates(def->parent_instr->block, src_block)) {
nir_instr_rewrite_src(src->parent_instr, src, nir_src_for_ssa(
nir_phi_builder_value_get_block_def(val, src_block)));
if (src_block == def->parent_instr->block) {
assert(nir_phi_builder_value_get_block_def(val, src_block) == def);
continue;
}
nir_ssa_def *block_def =
nir_phi_builder_value_get_block_def(val, src_block);
if (block_def == def)
continue;
/* If def was a deref and the use we're looking at is a deref that
* isn't a cast, we need to wrap it in a cast so we don't lose any
* deref information.
*/
if (def->parent_instr->type == nir_instr_type_deref &&
src->parent_instr->type == nir_instr_type_deref &&
nir_instr_as_deref(src->parent_instr)->deref_type != nir_deref_type_cast) {
nir_deref_instr *cast =
nir_deref_instr_create(state->impl->function->shader,
nir_deref_type_cast);
nir_deref_instr *deref = nir_instr_as_deref(def->parent_instr);
cast->mode = deref->mode;
cast->type = deref->type;
cast->parent = nir_src_for_ssa(block_def);
cast->cast.ptr_stride = nir_deref_instr_ptr_as_array_stride(deref);
nir_ssa_dest_init(&cast->instr, &cast->dest,
def->num_components, def->bit_size, NULL);
nir_instr_insert(nir_before_instr(src->parent_instr),
&cast->instr);
block_def = &cast->dest.ssa;
}
nir_instr_rewrite_src(src->parent_instr, src, nir_src_for_ssa(block_def));
}
nir_foreach_if_use_safe(src, def) {
nir_block *block_before_if =
nir_cf_node_as_block(nir_cf_node_prev(&src->parent_if->cf_node));
if (!nir_block_dominates(def->parent_instr->block, block_before_if)) {
nir_if_rewrite_condition(src->parent_if, nir_src_for_ssa(
nir_phi_builder_value_get_block_def(val, block_before_if)));
if (block_before_if == def->parent_instr->block) {
assert(nir_phi_builder_value_get_block_def(val, block_before_if) == def);
continue;
}
nir_ssa_def *block_def =
nir_phi_builder_value_get_block_def(val, block_before_if);
if (block_def == def)
continue;
nir_if_rewrite_condition(src->parent_if, nir_src_for_ssa(block_def));
}
return true;


@@ -143,22 +143,6 @@ is_not_const(nir_alu_instr *instr, unsigned src, UNUSED unsigned num_components,
return !nir_src_is_const(instr->src[src].src);
}
static inline bool
is_used_more_than_once(nir_alu_instr *instr)
{
bool zero_if_use = list_empty(&instr->dest.dest.ssa.if_uses);
bool zero_use = list_empty(&instr->dest.dest.ssa.uses);
if (zero_use && zero_if_use)
return false;
else if (zero_use && list_is_singular(&instr->dest.dest.ssa.if_uses))
return false;
else if (zero_if_use && list_is_singular(&instr->dest.dest.ssa.uses))
return false;
return true;
}
static inline bool
is_used_once(nir_alu_instr *instr)
{


@@ -64,21 +64,25 @@
static void
split_deref_copy_instr(nir_builder *b,
nir_deref_instr *dst, nir_deref_instr *src)
nir_deref_instr *dst, nir_deref_instr *src,
enum gl_access_qualifier dst_access,
enum gl_access_qualifier src_access)
{
assert(glsl_get_bare_type(dst->type) ==
glsl_get_bare_type(src->type));
if (glsl_type_is_vector_or_scalar(src->type)) {
nir_copy_deref(b, dst, src);
nir_copy_deref_with_access(b, dst, src, dst_access, src_access);
} else if (glsl_type_is_struct_or_ifc(src->type)) {
for (unsigned i = 0; i < glsl_get_length(src->type); i++) {
split_deref_copy_instr(b, nir_build_deref_struct(b, dst, i),
nir_build_deref_struct(b, src, i));
nir_build_deref_struct(b, src, i),
dst_access, src_access);
}
} else {
assert(glsl_type_is_matrix(src->type) || glsl_type_is_array(src->type));
split_deref_copy_instr(b, nir_build_deref_array_wildcard(b, dst),
nir_build_deref_array_wildcard(b, src));
nir_build_deref_array_wildcard(b, src),
dst_access, src_access);
}
}
@@ -105,7 +109,9 @@ split_var_copies_impl(nir_function_impl *impl)
nir_instr_as_deref(copy->src[0].ssa->parent_instr);
nir_deref_instr *src =
nir_instr_as_deref(copy->src[1].ssa->parent_instr);
split_deref_copy_instr(&b, dst, src);
split_deref_copy_instr(&b, dst, src,
nir_intrinsic_dst_access(copy),
nir_intrinsic_src_access(copy));
progress = true;
}


@@ -111,9 +111,6 @@ convert_loop_exit_for_ssa(nir_ssa_def *def, void *void_state)
if (all_uses_inside_loop)
return true;
/* We don't want derefs ending up in phi sources */
assert(def->parent_instr->type != nir_instr_type_deref);
/* Initialize a phi-instruction */
nir_phi_instr *phi = nir_phi_instr_create(state->shader);
nir_ssa_dest_init(&phi->instr, &phi->dest,
@@ -131,6 +128,25 @@ convert_loop_exit_for_ssa(nir_ssa_def *def, void *void_state)
}
nir_instr_insert_before_block(block_after_loop, &phi->instr);
nir_ssa_def *dest = &phi->dest.ssa;
/* deref instructions need a cast after the phi */
if (def->parent_instr->type == nir_instr_type_deref) {
nir_deref_instr *cast =
nir_deref_instr_create(state->shader, nir_deref_type_cast);
nir_deref_instr *instr = nir_instr_as_deref(def->parent_instr);
cast->mode = instr->mode;
cast->type = instr->type;
cast->parent = nir_src_for_ssa(&phi->dest.ssa);
cast->cast.ptr_stride = nir_deref_instr_ptr_as_array_stride(instr);
nir_ssa_dest_init(&cast->instr, &cast->dest,
phi->dest.ssa.num_components,
phi->dest.ssa.bit_size, NULL);
nir_instr_insert(nir_after_phis(block_after_loop), &cast->instr);
dest = &cast->dest.ssa;
}
/* Run through all uses and rewrite those outside the loop to point to
* the phi instead of pointing to the ssa-def.
@@ -142,15 +158,13 @@ convert_loop_exit_for_ssa(nir_ssa_def *def, void *void_state)
}
if (!is_use_inside_loop(use, state->loop)) {
nir_instr_rewrite_src(use->parent_instr, use,
nir_src_for_ssa(&phi->dest.ssa));
nir_instr_rewrite_src(use->parent_instr, use, nir_src_for_ssa(dest));
}
}
nir_foreach_if_use_safe(use, def) {
if (!is_if_use_inside_loop(use, state->loop)) {
nir_if_rewrite_condition(use->parent_if,
nir_src_for_ssa(&phi->dest.ssa));
nir_if_rewrite_condition(use->parent_if, nir_src_for_ssa(dest));
}
}


@@ -0,0 +1,531 @@
/*
* Copyright © 2019 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#include <gtest/gtest.h>
#include "nir.h"
#include "nir_builder.h"
class comparison_pre_test : public ::testing::Test {
protected:
comparison_pre_test()
{
static const nir_shader_compiler_options options = { };
nir_builder_init_simple_shader(&bld, NULL, MESA_SHADER_VERTEX, &options);
v1 = nir_imm_vec4(&bld, -2.0, -1.0, 1.0, 2.0);
v2 = nir_imm_vec4(&bld, 2.0, 1.0, -1.0, -2.0);
v3 = nir_imm_vec4(&bld, 3.0, 4.0, 5.0, 6.0);
}
~comparison_pre_test()
{
ralloc_free(bld.shader);
}
struct nir_builder bld;
nir_ssa_def *v1;
nir_ssa_def *v2;
nir_ssa_def *v3;
const uint8_t xxxx[4] = { 0, 0, 0, 0 };
const uint8_t wwww[4] = { 3, 3, 3, 3 };
};
TEST_F(comparison_pre_test, a_lt_b_vs_neg_a_plus_b)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 1 ssa_6 = flt ssa_5, ssa_3
*
* if ssa_6 {
* vec1 32 ssa_7 = fneg ssa_5
* vec1 32 ssa_8 = fadd ssa_7, ssa_3
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 32 ssa_9 = fneg ssa_5
* vec1 32 ssa_10 = fadd ssa_3, ssa_9
* vec1 32 ssa_11 = load_const (0.0)
* vec1 1 ssa_12 = flt ssa_11, ssa_10
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* vec1 32 ssa_7 = fneg ssa_5
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *a = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, a, one);
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, nir_fneg(&bld, a), one);
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
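
The Before/After listing in the test above shows the comparison a < b being rewritten as 0 < (b + -a), so that the subtraction is computed once and shared between the condition and the branch body. A scalar sketch of the two shapes follows. original_shape and rewritten_shape are hypothetical helper names, and the checks use only exactly representable values, so they say nothing about rounding or NaN behaviour in general.

```c
#include <assert.h>
#include <stdbool.h>

/* Original shape: compare first, compute the difference inside the branch. */
static bool original_shape(float a, float b, float *diff)
{
   const bool cond = a < b;
   if (cond)
      *diff = -a + b;
   return cond;
}

/* Rewritten shape: compute the difference once, compare it against zero. */
static bool rewritten_shape(float a, float b, float *diff)
{
   const float t = b + -a;
   const bool cond = 0.0f < t;
   if (cond)
      *diff = t;
   return cond;
}

static void check_shapes_agree(void)
{
   const float inputs[][2] = {
      { 0.5f, 1.0f }, { 2.0f, 1.0f }, { 1.0f, 1.0f }, { -3.0f, 1.0f },
   };
   for (unsigned i = 0; i < sizeof(inputs) / sizeof(inputs[0]); i++) {
      float d0 = 0.0f, d1 = 0.0f;
      const bool c0 = original_shape(inputs[i][0], inputs[i][1], &d0);
      const bool c1 = rewritten_shape(inputs[i][0], inputs[i][1], &d1);
      assert(c0 == c1);
      assert(d0 == d1);
   }
}
```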
TEST_F(comparison_pre_test, a_lt_b_vs_a_minus_b)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 1 ssa_6 = flt ssa_3, ssa_5
*
* if ssa_6 {
* vec1 32 ssa_7 = fneg ssa_5
* vec1 32 ssa_8 = fadd ssa_3, ssa_7
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 32 ssa_9 = fneg ssa_5
* vec1 32 ssa_10 = fadd ssa_3, ssa_9
* vec1 32 ssa_11 = load_const (0.0)
* vec1 1 ssa_12 = flt ssa_10, ssa_11
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* vec1 32 ssa_7 = fneg ssa_5
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *b = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, one, b);
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, one, nir_fneg(&bld, b));
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
TEST_F(comparison_pre_test, neg_a_lt_b_vs_a_plus_b)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 32 ssa_6 = fneg ssa_5
* vec1 1 ssa_7 = flt ssa_6, ssa_3
*
* if ssa_7 {
* vec1 32 ssa_8 = fadd ssa_5, ssa_3
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 32 ssa_9 = fneg ssa_5
* vec1 32 ssa_9 = fneg ssa_6
* vec1 32 ssa_10 = fadd ssa_3, ssa_9
* vec1 32 ssa_11 = load_const ( 0.0)
* vec1 1 ssa_12 = flt ssa_11, ssa_10
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *a = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, nir_fneg(&bld, a), one);
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, a, one);
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
TEST_F(comparison_pre_test, a_lt_neg_b_vs_a_plus_b)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 32 ssa_6 = fneg ssa_5
* vec1 1 ssa_7 = flt ssa_3, ssa_6
*
* if ssa_7 {
* vec1 32 ssa_8 = fadd ssa_3, ssa_5
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec4 32 ssa_4 = fadd ssa_0, ssa_2
* vec1 32 ssa_5 = mov ssa_4.x
* vec1 32 ssa_6 = fneg ssa_5
* vec1 32 ssa_9 = fneg ssa_6
* vec1 32 ssa_10 = fadd ssa_3, ssa_9
* vec1 32 ssa_11 = load_const ( 0.0)
* vec1 1 ssa_12 = flt ssa_10, ssa_11
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *b = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, one, nir_fneg(&bld, b));
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, one, b);
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
TEST_F(comparison_pre_test, imm_lt_b_vs_neg_imm_plus_b)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 1 ssa_7 = flt ssa_3, ssa_6
*
* if ssa_7 {
* vec1 32 ssa_8 = fadd ssa_4, ssa_6
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 32 ssa_9 = fneg ssa_3
* vec1 32 ssa_10 = fadd ssa_6, ssa_9
* vec1 32 ssa_11 = load_const ( 0.0)
* vec1 1 ssa_12 = flt ssa_11, ssa_10
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *neg_one = nir_imm_float(&bld, -1.0f);
nir_ssa_def *a = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, one, a);
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, neg_one, a);
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
TEST_F(comparison_pre_test, a_lt_imm_vs_a_minus_imm)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 1 ssa_7 = flt ssa_6, ssa_3
*
* if ssa_7 {
* vec1 32 ssa_8 = fadd ssa_6, ssa_4
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 32 ssa_9 = fneg ssa_3
* vec1 32 ssa_10 = fadd ssa_6, ssa_9
* vec1 32 ssa_11 = load_const ( 0.0)
* vec1 1 ssa_12 = flt ssa_10, ssa_11
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *neg_one = nir_imm_float(&bld, -1.0f);
nir_ssa_def *a = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, a, one);
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, a, neg_one);
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
TEST_F(comparison_pre_test, neg_imm_lt_a_vs_a_plus_imm)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 1 ssa_7 = flt ssa_4, ssa_6
*
* if ssa_7 {
* vec1 32 ssa_8 = fadd ssa_6, ssa_3
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 32 ssa_9 = fneg ssa_4
* vec1 32 ssa_10 = fadd ssa_6, ssa_9
* vec1 32 ssa_11 = load_const ( 0.0)
* vec1 1 ssa_12 = flt ssa_11, ssa_10
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *neg_one = nir_imm_float(&bld, -1.0f);
nir_ssa_def *a = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, neg_one, a);
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, a, one);
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
TEST_F(comparison_pre_test, a_lt_neg_imm_vs_a_plus_imm)
{
/* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 1 ssa_7 = flt ssa_6, ssa_4
*
* if ssa_7 {
* vec1 32 ssa_8 = fadd ssa_6, ssa_3
* } else {
* }
*
* After:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec1 32 ssa_3 = load_const ( 1.0)
* vec1 32 ssa_4 = load_const (-1.0)
* vec4 32 ssa_5 = fadd ssa_0, ssa_2
* vec1 32 ssa_6 = mov ssa_5.x
* vec1 32 ssa_9 = fneg ssa_4
* vec1 32 ssa_10 = fadd ssa_6, ssa_9
* vec1 32 ssa_11 = load_const ( 0.0)
* vec1 1 ssa_12 = flt ssa_10, ssa_11
* vec1 32 ssa_13 = mov ssa_10
* vec1 1 ssa_14 = mov ssa_12
*
* if ssa_14 {
* } else {
* }
*/
nir_ssa_def *one = nir_imm_float(&bld, 1.0f);
nir_ssa_def *neg_one = nir_imm_float(&bld, -1.0f);
nir_ssa_def *a = nir_channel(&bld, nir_fadd(&bld, v1, v3), 0);
nir_ssa_def *flt = nir_flt(&bld, a, neg_one);
nir_if *nif = nir_push_if(&bld, flt);
nir_fadd(&bld, a, one);
nir_pop_if(&bld, nif);
EXPECT_TRUE(nir_opt_comparison_pre_impl(bld.impl));
}
TEST_F(comparison_pre_test, non_scalar_add_result)
{
/* The optimization pass should not do anything because the result of the
* fadd is not a scalar.
*
* Before:
*
* vec4 32 ssa_0 = load_const (-2.0, -1.0, 1.0, 2.0)
* vec4 32 ssa_1 = load_const ( 2.0, 1.0, -1.0, -2.0)
* vec4 32 ssa_2 = load_const ( 3.0, 4.0, 5.0, 6.0)
* vec4 32 ssa_3 = fadd ssa_0, ssa_2
* vec1 1 ssa_4 = flt ssa_0.x, ssa_3.x
*
* if ssa_4 {
* vec2 32 ssa_5 = fadd ssa_1.xx, ssa_3.xx
* } else {
* }
*
* After:
*
* No change.
*/
nir_ssa_def *a = nir_fadd(&bld, v1, v3);
nir_alu_instr *flt = nir_alu_instr_create(bld.shader, nir_op_flt);
flt->src[0].src = nir_src_for_ssa(v1);
flt->src[1].src = nir_src_for_ssa(a);
memcpy(&flt->src[0].swizzle, xxxx, sizeof(xxxx));
memcpy(&flt->src[1].swizzle, xxxx, sizeof(xxxx));
nir_builder_alu_instr_finish_and_insert(&bld, flt);
flt->dest.dest.ssa.num_components = 1;
flt->dest.write_mask = 1;
nir_if *nif = nir_push_if(&bld, &flt->dest.dest.ssa);
nir_alu_instr *fadd = nir_alu_instr_create(bld.shader, nir_op_fadd);
fadd->src[0].src = nir_src_for_ssa(v2);
fadd->src[1].src = nir_src_for_ssa(a);
memcpy(&fadd->src[0].swizzle, xxxx, sizeof(xxxx));
memcpy(&fadd->src[1].swizzle, xxxx, sizeof(xxxx));
nir_builder_alu_instr_finish_and_insert(&bld, fadd);
fadd->dest.dest.ssa.num_components = 2;
fadd->dest.write_mask = 3;
nir_pop_if(&bld, nif);
EXPECT_FALSE(nir_opt_comparison_pre_impl(bld.impl));
}
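
The tests above all exercise the same rewrite: a comparison such as `a < imm` that guards an `fadd` of the same operands is re-expressed as a comparison of the shared sum against zero. A minimal sketch of that identity in plain C follows. It is not NIR code, and `cmp_pre_result`, `before_rewrite`, `after_rewrite`, and `check_rewrite` are hypothetical names used only for illustration. It assumes ordinary IEEE 754 float arithmetic without flush-to-zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result pair: the branch condition and the value of the add. */
struct cmp_pre_result {
   bool cond;
   float sum;
};

/* Shape of the "Before" IR: the compare and the add are independent. */
static struct cmp_pre_result
before_rewrite(float a, float imm)
{
   struct cmp_pre_result r;
   r.cond = a < imm;      /* flt a, imm */
   r.sum = a + (-imm);    /* fadd a, -imm (only consumed when cond holds) */
   return r;
}

/* Shape of the "After" IR: the add is hoisted and the compare reuses it. */
static struct cmp_pre_result
after_rewrite(float a, float imm)
{
   struct cmp_pre_result r;
   r.sum = a + (-imm);    /* fadd a, fneg(imm) */
   r.cond = r.sum < 0.0f; /* flt sum, 0.0 */
   return r;
}

/* Asserts that both forms agree for one pair of finite inputs. */
static void
check_rewrite(float a, float imm)
{
   struct cmp_pre_result b = before_rewrite(a, imm);
   struct cmp_pre_result n = after_rewrite(a, imm);
   assert(b.cond == n.cond);
   assert(b.sum == n.sum);
}
```

With the constants used in the tests, `a` is `-2.0 + 3.0 = 1.0`, so `check_rewrite(1.0f, 1.0f)` compares both forms at the boundary where the condition is false.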


@@ -1422,15 +1422,17 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
default:
break;
}
-}
-if (storage_class == SpvStorageClassWorkgroup &&
-b->options->lower_workgroup_access_to_offsets) {
+} else if (storage_class == SpvStorageClassWorkgroup &&
+b->options->lower_workgroup_access_to_offsets) {
/* Workgroup is laid out by the implementation. */
uint32_t size, align;
val->type->deref = vtn_type_layout_std430(b, val->type->deref,
&size, &align);
val->type->length = size;
val->type->align = align;
+/* Override any ArrayStride previously set. */
+val->type->stride = vtn_align_u32(size, align);
}
}
break;
@@ -2089,19 +2091,17 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
vtn_value(b, w[4], vtn_value_type_pointer)->pointer;
return;
} else if (opcode == SpvOpImage) {
-struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_pointer);
struct vtn_value *src_val = vtn_untyped_value(b, w[3]);
if (src_val->value_type == vtn_value_type_sampled_image) {
-val->pointer = src_val->sampled_image->image;
+vtn_push_value_pointer(b, w[2], src_val->sampled_image->image);
} else {
vtn_assert(src_val->value_type == vtn_value_type_pointer);
-val->pointer = src_val->pointer;
+vtn_push_value_pointer(b, w[2], src_val->pointer);
}
return;
}
struct vtn_type *ret_type = vtn_value(b, w[1], vtn_value_type_type)->type;
-struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_ssa);
struct vtn_sampled_image sampled;
struct vtn_value *sampled_val = vtn_untyped_value(b, w[3]);
@@ -2415,8 +2415,9 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
}
}
-val->ssa = vtn_create_ssa_value(b, ret_type->type);
-val->ssa->def = &instr->dest.ssa;
+struct vtn_ssa_value *ssa = vtn_create_ssa_value(b, ret_type->type);
+ssa->def = &instr->dest.ssa;
+vtn_push_ssa(b, w[2], ret_type, ssa);
nir_builder_instr_insert(&b->nb, &instr->instr);
}
@@ -2606,6 +2607,8 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
intrin->src[2] = nir_src_for_ssa(image.sample);
}
+nir_intrinsic_set_access(intrin, image.image->access);
switch (opcode) {
case SpvOpAtomicLoad:
case SpvOpImageQuerySize:
@@ -2644,7 +2647,6 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
}
if (opcode != SpvOpImageWrite && opcode != SpvOpAtomicStore) {
-struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_ssa);
struct vtn_type *type = vtn_value(b, w[1], vtn_value_type_type)->type;
unsigned dest_components = glsl_get_vector_elements(type->type);
@@ -2661,7 +2663,8 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
if (intrin->num_components != dest_components)
result = nir_channels(&b->nb, result, (1 << dest_components) - 1);
-val->ssa = vtn_create_ssa_value(b, type->type);
+struct vtn_value *val =
+vtn_push_ssa(b, w[2], type, vtn_create_ssa_value(b, type->type));
val->ssa->def = result;
} else {
nir_builder_instr_insert(&b->nb, &intrin->instr);
@@ -2972,10 +2975,10 @@ vtn_handle_atomics(struct vtn_builder *b, SpvOp opcode,
glsl_get_vector_elements(type->type),
glsl_get_bit_size(type->type), NULL);
-struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_ssa);
-val->ssa = rzalloc(b, struct vtn_ssa_value);
-val->ssa->def = &atomic->dest.ssa;
-val->ssa->type = type->type;
+struct vtn_ssa_value *ssa = rzalloc(b, struct vtn_ssa_value);
+ssa->def = &atomic->dest.ssa;
+ssa->type = type->type;
+vtn_push_ssa(b, w[2], type, ssa);
}
nir_builder_instr_insert(&b->nb, &atomic->instr);
@@ -3215,65 +3218,65 @@ static void
vtn_handle_composite(struct vtn_builder *b, SpvOp opcode,
const uint32_t *w, unsigned count)
{
-struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_ssa);
-const struct glsl_type *type =
-vtn_value(b, w[1], vtn_value_type_type)->type->type;
-val->ssa = vtn_create_ssa_value(b, type);
+struct vtn_type *type = vtn_value(b, w[1], vtn_value_type_type)->type;
+struct vtn_ssa_value *ssa = vtn_create_ssa_value(b, type->type);
switch (opcode) {
case SpvOpVectorExtractDynamic:
-val->ssa->def = vtn_vector_extract_dynamic(b, vtn_ssa_value(b, w[3])->def,
-vtn_ssa_value(b, w[4])->def);
+ssa->def = vtn_vector_extract_dynamic(b, vtn_ssa_value(b, w[3])->def,
+vtn_ssa_value(b, w[4])->def);
break;
case SpvOpVectorInsertDynamic:
-val->ssa->def = vtn_vector_insert_dynamic(b, vtn_ssa_value(b, w[3])->def,
-vtn_ssa_value(b, w[4])->def,
-vtn_ssa_value(b, w[5])->def);
+ssa->def = vtn_vector_insert_dynamic(b, vtn_ssa_value(b, w[3])->def,
+vtn_ssa_value(b, w[4])->def,
+vtn_ssa_value(b, w[5])->def);
break;
case SpvOpVectorShuffle:
-val->ssa->def = vtn_vector_shuffle(b, glsl_get_vector_elements(type),
-vtn_ssa_value(b, w[3])->def,
-vtn_ssa_value(b, w[4])->def,
-w + 5);
+ssa->def = vtn_vector_shuffle(b, glsl_get_vector_elements(type->type),
+vtn_ssa_value(b, w[3])->def,
+vtn_ssa_value(b, w[4])->def,
+w + 5);
break;
case SpvOpCompositeConstruct: {
unsigned elems = count - 3;
assume(elems >= 1);
-if (glsl_type_is_vector_or_scalar(type)) {
+if (glsl_type_is_vector_or_scalar(type->type)) {
nir_ssa_def *srcs[NIR_MAX_VEC_COMPONENTS];
for (unsigned i = 0; i < elems; i++)
srcs[i] = vtn_ssa_value(b, w[3 + i])->def;
-val->ssa->def =
-vtn_vector_construct(b, glsl_get_vector_elements(type),
+ssa->def =
+vtn_vector_construct(b, glsl_get_vector_elements(type->type),
elems, srcs);
} else {
-val->ssa->elems = ralloc_array(b, struct vtn_ssa_value *, elems);
+ssa->elems = ralloc_array(b, struct vtn_ssa_value *, elems);
for (unsigned i = 0; i < elems; i++)
-val->ssa->elems[i] = vtn_ssa_value(b, w[3 + i]);
+ssa->elems[i] = vtn_ssa_value(b, w[3 + i]);
}
break;
}
case SpvOpCompositeExtract:
-val->ssa = vtn_composite_extract(b, vtn_ssa_value(b, w[3]),
-w + 4, count - 4);
+ssa = vtn_composite_extract(b, vtn_ssa_value(b, w[3]),
+w + 4, count - 4);
break;
case SpvOpCompositeInsert:
-val->ssa = vtn_composite_insert(b, vtn_ssa_value(b, w[4]),
-vtn_ssa_value(b, w[3]),
-w + 5, count - 5);
+ssa = vtn_composite_insert(b, vtn_ssa_value(b, w[4]),
+vtn_ssa_value(b, w[3]),
+w + 5, count - 5);
break;
case SpvOpCopyObject:
-val->ssa = vtn_composite_copy(b, vtn_ssa_value(b, w[3]));
+ssa = vtn_composite_copy(b, vtn_ssa_value(b, w[3]));
break;
default:
vtn_fail_with_opcode("unknown composite operation", opcode);
}
+vtn_push_ssa(b, w[2], type, ssa);
}
static void
@@ -3389,13 +3392,13 @@ vtn_handle_barrier(struct vtn_builder *b, SpvOp opcode,
}
case SpvOpControlBarrier: {
-SpvScope execution_scope = vtn_constant_uint(b, w[1]);
-if (execution_scope == SpvScopeWorkgroup)
-vtn_emit_barrier(b, nir_intrinsic_barrier);
SpvScope memory_scope = vtn_constant_uint(b, w[2]);
SpvMemorySemanticsMask memory_semantics = vtn_constant_uint(b, w[3]);
vtn_emit_memory_barrier(b, memory_scope, memory_semantics);
+SpvScope execution_scope = vtn_constant_uint(b, w[1]);
+if (execution_scope == SpvScopeWorkgroup)
+vtn_emit_barrier(b, nir_intrinsic_barrier);
break;
}


@@ -328,17 +328,12 @@ vtn_cfg_handle_prepass_instruction(struct vtn_builder *b, SpvOp opcode,
} else if (type->base_type == vtn_base_type_pointer &&
type->type != NULL) {
/* This is a pointer with an actual storage type */
-struct vtn_value *val =
-vtn_push_value(b, w[2], vtn_value_type_pointer);
nir_ssa_def *ssa_ptr = nir_load_param(&b->nb, b->func_param_idx++);
-val->pointer = vtn_pointer_from_ssa(b, ssa_ptr, type);
+vtn_push_value_pointer(b, w[2], vtn_pointer_from_ssa(b, ssa_ptr, type));
} else if (type->base_type == vtn_base_type_pointer ||
type->base_type == vtn_base_type_image ||
type->base_type == vtn_base_type_sampler) {
-struct vtn_value *val =
-vtn_push_value(b, w[2], vtn_value_type_pointer);
-val->pointer =
-vtn_load_param_pointer(b, type, b->func_param_idx++);
+vtn_push_value_pointer(b, w[2], vtn_load_param_pointer(b, type, b->func_param_idx++));
} else {
/* We're a regular SSA value. */
struct vtn_ssa_value *value = vtn_create_ssa_value(b, type->type);


@@ -269,6 +269,9 @@ struct vtn_ssa_value {
struct vtn_ssa_value *transposed;
const struct glsl_type *type;
+/* Access qualifiers */
+enum gl_access_qualifier access;
};
enum vtn_base_type {
@@ -416,6 +419,9 @@ struct vtn_access_chain {
*/
bool ptr_as_array;
+/* Access qualifiers */
+enum gl_access_qualifier access;
/** Struct elements and array offsets.
*
* This is an array of 1 so that it can conveniently be created on the
@@ -645,6 +651,10 @@ vtn_untyped_value(struct vtn_builder *b, uint32_t value_id)
return &b->values[value_id];
}
+/* Consider not using this function directly and instead use
+* vtn_push_ssa/vtn_push_value_pointer so that appropriate applying of
+* decorations is handled by common code.
+*/
static inline struct vtn_value *
vtn_push_value(struct vtn_builder *b, uint32_t value_id,
enum vtn_value_type value_type)
@@ -656,22 +666,8 @@ vtn_push_value(struct vtn_builder *b, uint32_t value_id,
value_id);
val->value_type = value_type;
-return &b->values[value_id];
-}
-static inline struct vtn_value *
-vtn_push_ssa(struct vtn_builder *b, uint32_t value_id,
-struct vtn_type *type, struct vtn_ssa_value *ssa)
-{
-struct vtn_value *val;
-if (type->base_type == vtn_base_type_pointer) {
-val = vtn_push_value(b, value_id, vtn_value_type_pointer);
-val->pointer = vtn_pointer_from_ssa(b, ssa->def, type);
-} else {
-val = vtn_push_value(b, value_id, vtn_value_type_ssa);
-val->ssa = ssa;
-}
-return val;
+return &b->values[value_id];
}
static inline struct vtn_value *
@@ -706,8 +702,43 @@ vtn_constant_uint(struct vtn_builder *b, uint32_t value_id)
}
}
+static inline enum gl_access_qualifier vtn_value_access(struct vtn_value *value)
+{
+switch (value->value_type) {
+case vtn_value_type_invalid:
+case vtn_value_type_undef:
+case vtn_value_type_string:
+case vtn_value_type_decoration_group:
+case vtn_value_type_constant:
+case vtn_value_type_function:
+case vtn_value_type_block:
+case vtn_value_type_extension:
+return 0;
+case vtn_value_type_type:
+return value->type->access;
+case vtn_value_type_pointer:
+return value->pointer->access;
+case vtn_value_type_ssa:
+return value->ssa->access;
+case vtn_value_type_image_pointer:
+return value->image->image->access;
+case vtn_value_type_sampled_image:
+return value->sampled_image->image->access |
+value->sampled_image->sampler->access;
+}
+unreachable("invalid type");
+}
struct vtn_ssa_value *vtn_ssa_value(struct vtn_builder *b, uint32_t value_id);
+struct vtn_value *vtn_push_value_pointer(struct vtn_builder *b,
+uint32_t value_id,
+struct vtn_pointer *ptr);
+struct vtn_value *vtn_push_ssa(struct vtn_builder *b, uint32_t value_id,
+struct vtn_type *type, struct vtn_ssa_value *ssa);
struct vtn_ssa_value *vtn_create_ssa_value(struct vtn_builder *b,
const struct glsl_type *type);


@@ -30,6 +30,52 @@
#include "nir_deref.h"
#include <vulkan/vulkan_core.h>
+static void ptr_decoration_cb(struct vtn_builder *b,
+struct vtn_value *val, int member,
+const struct vtn_decoration *dec,
+void *void_ptr);
+struct vtn_value *
+vtn_push_value_pointer(struct vtn_builder *b, uint32_t value_id,
+struct vtn_pointer *ptr)
+{
+struct vtn_value *val = vtn_push_value(b, value_id, vtn_value_type_pointer);
+val->pointer = ptr;
+vtn_foreach_decoration(b, val, ptr_decoration_cb, ptr);
+return val;
+}
+static void
+ssa_decoration_cb(struct vtn_builder *b, struct vtn_value *val, int member,
+const struct vtn_decoration *dec, void *void_ssa)
+{
+struct vtn_ssa_value *ssa = void_ssa;
+switch (dec->decoration) {
+case SpvDecorationNonUniformEXT:
+ssa->access |= ACCESS_NON_UNIFORM;
+break;
+default:
+break;
+}
+}
+struct vtn_value *
+vtn_push_ssa(struct vtn_builder *b, uint32_t value_id,
+struct vtn_type *type, struct vtn_ssa_value *ssa)
+{
+struct vtn_value *val;
+if (type->base_type == vtn_base_type_pointer) {
+val = vtn_push_value_pointer(b, value_id, vtn_pointer_from_ssa(b, ssa->def, type));
+} else {
+val = vtn_push_value(b, value_id, vtn_value_type_ssa);
+val->ssa = ssa;
+vtn_foreach_decoration(b, val, ssa_decoration_cb, val->ssa);
+}
+return val;
+}
static struct vtn_access_chain *
vtn_access_chain_create(struct vtn_builder *b, unsigned length)
{
@@ -189,7 +235,7 @@ vtn_nir_deref_pointer_dereference(struct vtn_builder *b,
struct vtn_access_chain *deref_chain)
{
struct vtn_type *type = base->type;
-enum gl_access_qualifier access = base->access;
+enum gl_access_qualifier access = base->access | deref_chain->access;
unsigned idx = 0;
nir_deref_instr *tail;
@@ -2349,6 +2395,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
case SpvOpInBoundsAccessChain:
case SpvOpInBoundsPtrAccessChain: {
struct vtn_access_chain *chain = vtn_access_chain_create(b, count - 4);
+enum gl_access_qualifier access = 0;
chain->ptr_as_array = (opcode == SpvOpPtrAccessChain || opcode == SpvOpInBoundsPtrAccessChain);
unsigned idx = 0;
@@ -2376,8 +2423,8 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
} else {
chain->link[idx].mode = vtn_access_mode_id;
chain->link[idx].id = w[i];
}
+access |= vtn_value_access(link_val);
idx++;
}
@@ -2404,11 +2451,11 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
val->sampled_image->sampler);
} else {
vtn_assert(base_val->value_type == vtn_value_type_pointer);
-struct vtn_value *val =
-vtn_push_value(b, w[2], vtn_value_type_pointer);
-val->pointer = vtn_pointer_dereference(b, base_val->pointer, chain);
-val->pointer->ptr_type = ptr_type;
-vtn_foreach_decoration(b, val, ptr_decoration_cb, val->pointer);
+struct vtn_pointer *ptr =
+vtn_pointer_dereference(b, base_val->pointer, chain);
+ptr->ptr_type = ptr_type;
+ptr->access |= access;
+vtn_push_value_pointer(b, w[2], ptr);
}
break;
}
@@ -2433,7 +2480,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
if (glsl_type_is_image(res_type->type) ||
glsl_type_is_sampler(res_type->type)) {
-vtn_push_value(b, w[2], vtn_value_type_pointer)->pointer = src;
+vtn_push_value_pointer(b, w[2], src);
return;
}
@@ -2545,10 +2592,11 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
"scalar type");
/* The pointer will be converted to an SSA value automatically */
-nir_ssa_def *ptr_ssa = vtn_ssa_value(b, w[3])->def;
+struct vtn_ssa_value *ptr_ssa = vtn_ssa_value(b, w[3]);
u_val->ssa = vtn_create_ssa_value(b, u_val->type->type);
-u_val->ssa->def = nir_sloppy_bitcast(&b->nb, ptr_ssa, u_val->type->type);
+u_val->ssa->def = nir_sloppy_bitcast(&b->nb, ptr_ssa->def, u_val->type->type);
+u_val->ssa->access |= ptr_ssa->access;
break;
}
@@ -2568,6 +2616,8 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
nir_ssa_def *ptr_ssa = nir_sloppy_bitcast(&b->nb, u_val->ssa->def,
ptr_val->type->type);
ptr_val->pointer = vtn_pointer_from_ssa(b, ptr_ssa, ptr_val->type);
+vtn_foreach_decoration(b, ptr_val, ptr_decoration_cb, ptr_val->pointer);
+ptr_val->pointer->access |= u_val->ssa->access;
break;
}
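
The hunks above thread access qualifiers through access chains by OR-ing flags together: each link contributes its access bits, and the resulting pointer carries the union of the base pointer's bits and the chain's bits. A minimal sketch of that accumulation follows, assuming a simplified flag set. `sketch_access`, its values, and `sketch_chain_access` are hypothetical and do not mirror the real `gl_access_qualifier` definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the real enum gl_access_qualifier may differ. */
enum sketch_access {
   SKETCH_ACCESS_COHERENT    = 1 << 0,
   SKETCH_ACCESS_NON_UNIFORM = 1 << 1,
};

/* Union of the base pointer's access bits and every link's access bits,
 * in the spirit of `access |= vtn_value_access(link_val)` followed by
 * `ptr->access |= access`.
 */
static unsigned
sketch_chain_access(unsigned base_access,
                    const unsigned *link_access, size_t num_links)
{
   assert(link_access != NULL || num_links == 0);

   unsigned access = base_access;
   for (size_t i = 0; i < num_links; i++)
      access |= link_access[i];
   return access;
}
```

Because the flags are only ever OR-ed in, a single non-uniform index anywhere in the chain is enough to mark the final pointer as non-uniform.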


@@ -1424,6 +1424,37 @@ dri2_surf_update_fence_fd(_EGLContext *ctx,
dri2_surface_set_out_fence_fd(surf, fence_fd);
}
+EGLBoolean
+dri2_create_drawable(struct dri2_egl_display *dri2_dpy,
+const __DRIconfig *config,
+struct dri2_egl_surface *dri2_surf)
+{
+__DRIcreateNewDrawableFunc createNewDrawable;
+void *loaderPrivate = dri2_surf;
+if (dri2_dpy->image_driver)
+createNewDrawable = dri2_dpy->image_driver->createNewDrawable;
+else if (dri2_dpy->dri2)
+createNewDrawable = dri2_dpy->dri2->createNewDrawable;
+else if (dri2_dpy->swrast)
+createNewDrawable = dri2_dpy->swrast->createNewDrawable;
+else
+return _eglError(EGL_BAD_ALLOC, "no createNewDrawable");
+/* As always gbm is a bit special.. */
+#ifdef HAVE_DRM_PLATFORM
+if (dri2_surf->gbm_surf)
+loaderPrivate = dri2_surf->gbm_surf;
+#endif
+dri2_surf->dri_drawable = (*createNewDrawable)(dri2_dpy->dri_screen,
+config, loaderPrivate);
+if (dri2_surf->dri_drawable == NULL)
+return _eglError(EGL_BAD_ALLOC, "createNewDrawable");
+return EGL_TRUE;
+}
/**
* Called via eglMakeCurrent(), drv->API.MakeCurrent().
*/
@@ -2627,21 +2658,39 @@ dri2_export_dma_buf_image_query_mesa(_EGLDriver *drv, _EGLDisplay *disp,
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_image *dri2_img = dri2_egl_image(img);
+int num_planes;
(void) drv;
if (!dri2_can_export_dma_buf_image(disp, img))
return EGL_FALSE;
+dri2_dpy->image->queryImage(dri2_img->dri_image,
+__DRI_IMAGE_ATTRIB_NUM_PLANES, &num_planes);
if (nplanes)
-dri2_dpy->image->queryImage(dri2_img->dri_image,
-__DRI_IMAGE_ATTRIB_NUM_PLANES, nplanes);
+*nplanes = num_planes;
if (fourcc)
dri2_dpy->image->queryImage(dri2_img->dri_image,
__DRI_IMAGE_ATTRIB_FOURCC, fourcc);
-if (modifiers)
-*modifiers = 0;
+if (modifiers) {
+int mod_hi, mod_lo;
+uint64_t modifier = DRM_FORMAT_MOD_INVALID;
+bool query;
+query = dri2_dpy->image->queryImage(dri2_img->dri_image,
+__DRI_IMAGE_ATTRIB_MODIFIER_UPPER,
+&mod_hi);
+query &= dri2_dpy->image->queryImage(dri2_img->dri_image,
+__DRI_IMAGE_ATTRIB_MODIFIER_LOWER,
+&mod_lo);
+if (query)
+modifier = combine_u32_into_u64 (mod_hi, mod_lo);
+for (int i = 0; i < num_planes; i++)
+modifiers[i] = modifier;
+}
return EGL_TRUE;
}


@@ -322,13 +322,14 @@ struct dri2_egl_surface
__DRIimage *dri_image_front;
/* Used to record all the buffers created by ANativeWindow and their ages.
-* Usually Android uses at most triple buffers in ANativeWindow
-* so hardcode the number of color_buffers to 3.
+* Allocate number of color_buffers based on query to android bufferqueue
+* and save color_buffers_count.
*/
+int color_buffers_count;
struct {
struct ANativeWindowBuffer *buffer;
int age;
-} color_buffers[3], *back;
+} *color_buffers, *back;
#endif
#if defined(HAVE_SURFACELESS_PLATFORM)
@@ -540,6 +541,11 @@ dri2_init_surface(_EGLSurface *surf, _EGLDisplay *disp, EGLint type,
void
dri2_fini_surface(_EGLSurface *surf);
+EGLBoolean
+dri2_create_drawable(struct dri2_egl_display *dri2_dpy,
+const __DRIconfig *config,
+struct dri2_egl_surface *dri2_surf);
static inline uint64_t
combine_u32_into_u64(uint32_t hi, uint32_t lo)
{

Some files were not shown because too many files have changed in this diff.