Comparing 90a7a9c973...cc88eeb6ff - mesa

fran/mesa

Author	SHA1	Message	Date
Juan A. Suarez Romero	cc88eeb6ff	docs: add release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-21 19:10:28 +02:00
Juan A. Suarez Romero	5c6d266c59	docs: add release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-21 13:55:11 +02:00
Juan A. Suarez Romero	6cffdfd192	Update version to 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-21 11:33:41 +00:00
Alan Coopersmith	dfcde49122	intel/common: include unistd.h for ioctl() prototype on Solaris Fixes build errors of: In file included from ../src/intel/vulkan/anv_private.h:48, from ../src/intel/vulkan/genX_blorp_exec.c:26: ../src/intel/common/gen_gem.h: In function ‘gen_ioctl’: ../src/intel/common/gen_gem.h:68:15: error: implicit declaration of function ‘ioctl’ [-Werror=implicit-function-declaration] 68 | ret = ioctl(fd, request, arg); | ^~~~~ In file included from ../include/c11/threads_posix.h:35, from ../include/c11/threads.h:66, from ../src/mesa/main/mtypes.h:39, from ../src/intel/compiler/brw_compiler.h:30, from ../src/intel/vulkan/anv_private.h:51, from ../src/intel/vulkan/genX_blorp_exec.c:26: /usr/include/unistd.h: At top level: /usr/include/unistd.h:471:12: error: conflicting types for ‘ioctl’ 471 | extern int ioctl(int, int, ...); | ^~~~~ /usr/include/unistd.h:471:1: note: a parameter list with an ellipsis can’t match an empty parameter name list declaration 471 | extern int ioctl(int, int, ...); | ^~~~~~ In file included from ../src/intel/vulkan/anv_private.h:48, from ../src/intel/vulkan/genX_blorp_exec.c:26: ../src/intel/common/gen_gem.h:68:15: note: previous implicit declaration of ‘ioctl’ was here 68 | ret = ioctl(fd, request, arg); | ^~~~~ Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `6804b8e1ff`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/intel/common/gen_gem.h	2019-10-16 17:36:16 +02:00
Alan Coopersmith	9c100e31a2	meson: recognize "sunos" as the system name for Solaris Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (cherry picked from commit `d8a9420f6f`) [Juan A. Suarez: resolve trivial conflict] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: meson.build	2019-10-16 15:32:51 +00:00
Matt Turner	2fd001f21e	util: Drop preprocessor guards for glibc-2.12 glibc-2.12 was released in 2010. No one is building new Mesa against 9 year old glibc, and removing these checks allows the code to work on other C libraries like musl. Acked-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `9c411e020d`)	2019-10-16 15:26:21 +00:00
Alan Coopersmith	13120904e4	util: Workaround lack of flock on Solaris v2: Replace autoconf check for flock() with meson check Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `b3028a9fb8`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: meson.build	2019-10-16 14:52:59 +00:00
Alan Coopersmith	740b0e9dc7	util: Make Solaris implemention of p_atomic_add work with gcc gcc is very particular about where you place the (void) cast The previous placement made it error out with: In file included from disk_cache.c:40:0: ../../src/util/u_atomic.h:203:29: error: void value not ignored as it ought to be #define p_atomic_add(v, i) ((void) \ ^ disk_cache.c:658:4: note: in expansion of macro ‘p_atomic_add’ p_atomic_add(cache->size, size); ^ Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `a56c3e3a47`)	2019-10-16 14:49:34 +00:00
Alan Coopersmith	9eaa6998cc	c99_compat.h: Don't try to use 'restrict' in C++ code Fixes build failures on Solaris in C++ files using gcc: ../src/util/u_math.h:628:41: error: expected ‘,’ or ‘...’ before ‘dest’ 628 | util_memcpy_cpu_to_le32(void * restrict dest, const void * restrict src, size_t n) | ^~~~ ../src/util/u_math.h: In function ‘void* util_memcpy_cpu_to_le32(void*)’: ../src/util/u_math.h:641:18: error: ‘dest’ was not declared in this scope 641 | return memcpy(dest, src, n); | ^~~~ ../src/util/u_math.h:641:24: error: ‘src’ was not declared in this scope 641 | return memcpy(dest, src, n); | ^~~ ../src/util/u_math.h:641:29: error: ‘n’ was not declared in this scope; did you mean ‘yn’? 641 | return memcpy(dest, src, n); | ^ | yn Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `ddde652e70`)	2019-10-16 14:46:47 +00:00
Juan A. Suarez Romero	e56b3afd2d	cherry-ignore: Revert "radv: disable viewport clamping even if FS doesn't write Z" Revert: this commit was explicitly requested to be removed from the branch. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-07 14:30:05 +00:00
Lionel Landwerlin	d169d0df0e	intel/isl: Set null surface format to R32_UINT It appears we never had a test in piglit or deqp sampling from a null surface... It turns out this triggers a hang on IVB only. Updating the null surface format to R32_UINT fixes the hang on ivb and doesn't affect other platforms, so set it by default for all platforms. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1872 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `c445d6f66e`)	2019-10-07 16:27:08 +02:00
Lionel Landwerlin	3d763e801c	intel: fix subslice computation from topology data We're missing the offset of the slice in the subslice mask... This worked for most platforms that don't have first slice fused off because we would reread the same mask from slice0 again and again... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c1900f5b0f` ("intel: devinfo: add helper functions to fill fusing masks values") Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1869 Reviewed-by: Mark Janes <mark.a.janes@intel.com> (cherry picked from commit `d36763b2a4`)	2019-10-07 16:27:07 +02:00
Prodea Alexandru-Liviu	9b75c1eaef	scons/MSYS2-MinGW-W64: Fix build options defaults Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> When building in a MSYS2 Mingw-w64 environment Mesa3D sets wrong default build options which inevitably lead to build failure. (cherry picked from commit `6309c31fd8`)	2019-10-07 16:27:07 +02:00
Dylan Baker	7e3d942403	meson: Only error building gallium video without libdrm when the platform is drm Fixes: `3b265f61f5` ("meson: gallium media state trackers require libdrm with x11") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1878 Tested-by: Vinson Lee <vlee@freedesktop.org> (cherry picked from commit `1481d05409`)	2019-10-07 16:27:07 +02:00
Andres Gomez	142e51da08	egl: Remove the 565 pbuffer-only EGL config under X11. The CTS finally has agreed to drop the requirement for a 565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the code to satisfy this requirement using a pbuffer-only visual with whatever other buffers the driver happens to have given us. This reverts commit `82607f8a90`, commit `6ad31c4ff3` and commit `dacb11a585`. v2: - Reference the VK-GL-CTS issue (Eric E.). v3: - Don't revert `fc21394bc4` ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers") (Kenneth). References: VK-GL-CTS issue 1601. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `02c265be9d`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/egl/drivers/dri2/platform_x11.c	2019-10-07 16:27:07 +02:00
Juan A. Suarez Romero	844c594837	cherry-ignore: radv: Fix condition for skipping the continue CS. Fixes: this commit depends on commit `e1dc3ab753` in order to compile, which did not land in the branch. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-07 16:27:07 +02:00
Lionel Landwerlin	9fc609585d	mesa: don't forget to clear _Layer field on texture unit On the Android Antutu benchmark we ran into an assert in ISL where the (base layer + num layers) > total layers. It turns out the core of mesa forgot to clear the _Layer variable, potentially leaving an inconsistent value. v2: Pull setting u->_Layer out of the conditional blocks (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `2208d79dde`)	2019-10-02 09:41:27 -04:00
Ken Mays	ab1ae12790	haiku: fix Mesa build 1. The hgl.c file is a read-only file versus read-write. Ref: src/gallium/state_trackers/hgl/hgl.c 2. I've included the Haiku-specific patches I used to get a successful build of Mesa 19.1.7 on Haiku using the meson/ninja build procedure. Shows "[764/764] linking target ... libswpipe.so" at build completion. v2: Remove autotools files (Eric) v3: Update the patch Reported-by: Ken Mays <kmays2000@gmail.com> Tested-by: Ken Mays <kmays2000@gmail.com> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Alexander von Gluck IV <kallisti5@unixzen.com> (cherry picked from commit `4943c89d6d`)	2019-10-02 09:41:27 -04:00
Kenneth Graunke	9bc34d54db	iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets. We can't just check for the BO base address, we need to check for the full address including any offset we may have applied. When updating the address, we need to include the offset again. Fixes: `5ad0c88dbe` ("iris: Replace buffer backing storage and rebind to update addresses.") (cherry picked from commit `309924c3c9`)	2019-10-02 09:41:27 -04:00
Dylan Baker	f7338bfe1f	meson: gallium media state trackers require libdrm with x11 v2: - update copyright year in all changed files - rebase on master Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `3b265f61f5`)	2019-10-02 09:41:27 -04:00
Kenneth Graunke	2d584d7386	iris: Disable CCS_E for 32-bit floating point textures. A while back, Michael Larabel noticed that Paraview's Wavelet Volume case runs significantly slower on iris than i965. It turns out this is because we enable CCS_E for 32-bit floating point formats, while i965 disables it, with an oblique comment saying that we benchmarked it (on what exactly?) and determined that it was a loss. Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed large framerate drops when enabling CCS_E for either format. However, several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit floating point formats, with no apparent ill effects. So, disable compression for 32-bit float formats for now, but leave it enabled for 16-bit float formats as they seem to be working fine. Improves performance in Paraview's Wavelet Volume test by 62% on a Skylake GT4e. Fixes: `3cfc6a207b` ("iris: Fill out res->aux.possible_usages") (cherry picked from commit `a0a93763fb`)	2019-10-02 09:41:27 -04:00
pal1000	2739dd9621	scons: Fix MSYS2 Mingw-w64 build. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> This patch is based on `28e3f85e09/mingw-w64-mesa/link-ole32.patch` but with tweaks to avoid MSVC build break when applied. v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation; v3: Fix obviously wrong compiler flags for swr driver; v4: Update original patch URL because it has been relocated; v5: Don't bother patching autools stuff as it's not used by MSYS2 Mingw-w64 build and it's days are numbered anyway; v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place. (cherry picked from commit `ffb0d3a25c`)	2019-10-02 09:41:27 -04:00
pal1000	e53ca66c4a	scons/windows: Support build with LLVM 9. As X86AsmPrinter component is gone, LLVMX86AsmPrinter got replaced with LLVMRemarks, LLVMBitstreamReader and LLVMDebugInfoDWARF. Tests done with llvm-config on both LLVM 8 and 9 indicate that mcjit, bitwriter and x86asmprinter fully fit inside engine component. On other platforms and with meson build mcdisassembler was used to replace X86AsmPrinter but mcdisassembler also fully fits inside engine component for LLVM>=8 according to same tests. v2: Avoid duplicating code related to Mingw pthreads. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> On 19.1 this patch does not apply cleanly without `88eb2a1f` (cherry picked from commit `bcb4dfb14b`)	2019-10-02 09:41:27 -04:00
Michel Zou	7c0ce1b35e	scons: For MinGW use -posix flag. Signed-off-by: Jose Fonseca <jfonseca@vmware.com> (cherry picked from commit `88eb2a1f7e`)	2019-10-02 09:41:27 -04:00
Michel Zou	09ba783aea	scons: add py3 support SCons 3.1 has moved to python 3, requiring this fix to continue supporting scons builds. Closes: #944 Cc: mesa-stable@lists.freedesktop.org Acked-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Eric Engestrom <eric@engestrom.ch> (cherry picked from commit `3f92d17894`)	2019-10-02 09:41:27 -04:00
Andrii Simiklit	70ef5d63f7	glsl: disallow incompatible matrices multiplication glsl 4.4 spec section '5.9 expressions': "The operator is multiply (*), where both operands are matrices or one operand is a vector and the other a matrix. A right vector operand is treated as a column vector and a left vector operand as a row vector. In all these cases, it is required that the number of columns of the left operand is equal to the number of rows of the right operand. Then, the multiply (*) operation does a linear algebraic multiply, yielding an object that has the same number of rows as the left operand and the same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations” explains in more detail how vectors and matrices are operated on." This fix disallows a multiplication of incompatible matrices like: mat4x3(..) * mat4x3(..) mat4x2(..) * mat4x2(..) mat3x2(..) * mat3x2(..) .... CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> (cherry picked from commit `b32bb888c7`)	2019-10-02 09:41:27 -04:00
Jason Ekstrand	f041840367	intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates Without this, we were DCEing flag writes because we didn't think their results were used because we didn't understand that an ANY32 predicate actually read all the flags. Fixes: `df1aec763e` "i965/fs: Define methods to calculate the flag..." Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `6c858b9a91`)	2019-10-02 09:41:27 -04:00
Dylan Baker	4a50b8add1	meson: Link xvmc with libxv Prior to xvmc 1.0.12 libxvmc incorrectly required libxv, but that was fixed. This results in compilation failures for the gallium xvmc tracker and tools. This patch fixes that by explicitly linking to libxv. Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1844 Reviewed-by: Adam Jackson <ajax@redhat.com> (cherry picked from commit `e456a053c3`)	2019-10-02 09:41:27 -04:00
Dylan Baker	b30a0afc0c	meson: Try finding libxvmcw via pkg-config before using find_library This fixes cross compiling issues, because pkg-config is less likely to get the wrong libs. v2: - Fix typo in comment Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/939 Reviewed-by: Adam Jackson <ajax@redhat.com> (cherry picked from commit `8c5c21d7e3`)	2019-10-02 09:41:27 -04:00
Andreas Gottschling	8118131f37	drisw: Fix shared memory leak on drawable resize XDestroyImage will mark the segment as to-be-destroyed, but it will persist until we detach it, and we weren't doing so. Cc: mesa-stable@lists.freedesktop.org Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/121 Reviewed-by: Adam Jackson <ajax@redhat.com> (cherry picked from commit `c5a2ccec5e`)	2019-10-02 09:41:27 -04:00
Michel Dänzer	950d167026	radeonsi: fix VAAPI segfault due to various bugs Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111236 (cherry picked from commit `67d930d64b`)	2019-10-02 09:41:27 -04:00
Dylan Baker	e37019723f	meson: fix logic for generating .pc files with old glvnd We want to generate PC files for non-glvnd builds and for builds with old glvnd, but the current logic doesn't do that, it builds them unconditionally, and for GLES it builds the shared libraries, which is also not what we want. This does not generate .pc files for gles1 or gles2. Which it we weren't doing before either, making this not a regression but a return to status-quo.o Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838 Fixes: `93df862b6a` ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility") Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `fafd20f67d`)	2019-10-02 09:41:27 -04:00
Lionel Landwerlin	450b808eea	intel: use proper label for Comet Lake skus Fixes: `82f6a746e8` ("intel: Add support for Comet Lake") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `813f3460e7`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: include/pci_ids/i965_pci_ids.h	2019-10-02 09:41:27 -04:00
Lionel Landwerlin	3b927c447f	anv: gem-stubs: return a valid fd got anv_gem_userptr() Fixes invalid close(-1) in the unit tests. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `da2d67fc3b`)	2019-10-02 09:41:27 -04:00
Tapani Pälli	52dc974cd1	util: fix os_create_anonymous_file on android Commit fixes current crashes with Vulkan applications on Android. Fixes: `c0376a1234` "util: add anon_file.h for all memfd/temp file usage" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> (cherry picked from commit `ce8fd042a5`)	2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero	26ab4e1614	cherry-ignore: util: added missing headers in anon-file Fixes: The commit was reverted later. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-02 09:41:27 -04:00
Eric Engestrom	e5e81d6530	util/anon_file: const string param Fixes: `c0376a1234` ("util: add anon_file.h for all memfd/temp file usage") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Eric Anholt <eric@anholt.net> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> (cherry picked from commit `525a917c6c`)	2019-10-02 09:41:27 -04:00
Eric Engestrom	b13396622c	util/anon_file: add missing #include Fixes: `c0376a1234` ("util: add anon_file.h for all memfd/temp file usage") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Eric Anholt <eric@anholt.net> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> (cherry picked from commit `60af7f5a81`)	2019-10-02 09:41:27 -04:00
Greg V	bb22ac12d6	util: add anon_file.h for all memfd/temp file usage Move the Weston os_create_anonymous_file code from egl/wayland into util, add support for Linux memfd and FreeBSD SHM_ANON, use that code in anv/aubinator instead of explicit memfd calls for portability. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `c0376a1234`)	2019-10-02 09:41:27 -04:00
Danylo Piliaiev	2963e9fa3d	st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader RET as a last instruction could be safely ignored. Remove it to prevent crashes/warnings in case underlying driver doesn't implement arbitrary returns. A better way would be to remove the RET after the whole shader is parsed which will handle a possible case when the last RET is followed by a comment. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com> (cherry picked from commit `2d8f77db83`)	2019-10-02 09:41:27 -04:00
Eric Engestrom	a74657d4aa	meson: re-add incorrect pkg-config files with GLVND for backward compatibility This is a bit counter-intuitive, but the issue is that GLVND is broken in versions <= 1.1.1, so we need to keep wrongly providing these files to cover up their mistake, otherwise the rest of the world ends up broken. Suggested-by: Dylan Baker <dylan@pnwbakers.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (cherry picked from commit `93df862b6a`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/egl/meson.build	2019-10-02 09:41:27 -04:00
Erik Faye-Lund	3a0b77e3f7	glsl: correct bitcast-helpers Without this, we'll incorrectly round off huge values to the nearest representable double instead of keeping it at the exact value as we're supposed to. Found by inspecting compiler-warnings. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `85faf5082f` ("glsl: Add 64-bit integer support for constant expressions") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `88f909eb37`)	2019-10-02 09:41:27 -04:00
Rhys Perry	ef35babd33	nir/opt_remove_phis: handle phis with no sources This can happen with loops with unreachable exits which are later optimized away. Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `12372d60ff`)	2019-10-02 09:41:27 -04:00
Marek Olšák	954ace9e3e	gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH because vl doesn't call flush_resource and I wasn't able to find all places where flush_resource needs to be called. This fixes corrupted / unflushed surfaces with fullscreen videos on Raven. Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `f52afdf672`)	2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero	4f48aaf50a	cherry-ignore: nir/opt_large_constants: Handle store writemasks Fixes: This commit does not apply cleanly on 19.1 branch, as it depends on other commits not present in the branch. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-02 09:41:27 -04:00
Stephen Barber	d34ae3876a	nouveau: add idep_nir_headers as dep for libnouveau Fixes a compilation error when building libnouveau: In file included from ../src/gallium/drivers/nouveau/nv50/nv50_program.c:25: ../src/compiler/nir/nir.h:1115:10: fatal error: nir_intrinsics.h: No such file or directory #include "nir_intrinsics.h" ^~~~~~~~~~~~~~~~~~ compilation terminated. Fixes: `f014ae3c7c` ("nouveau: add support for nir") Signed-off-by: Stephen Barber <smbarber@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> (cherry picked from commit `8c3ace6991`)	2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero	895f0a2ca2	bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars The script only handles commits with "Fixes: <sha1>" where <sha1> is equal or great than 8 chars. But <sha1> can be smaller, like 7 chars. This commit relax the restriction to handle <sha1> 4 or more chars. Fixes: `533fead423` ("bin/get-pick-list.sh: tweak the commit sha matching pattern") Acked-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `b3c25e6f99`)	2019-10-02 09:41:27 -04:00
Bas Nieuwenhuizen	9ee9251ef8	radv: Add workaround for hang in The Surge 2. Released today and hangs on RADV. We don't have the root cause yet, but this should unblock people playing the game. No drirc because the radv debugflags are not usable from drirc and I want this backported. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `780182f0a0`)	2019-10-02 09:41:27 -04:00
Kenneth Graunke	b977c7444c	intel: Increase Gen11 compute shader scratch IDs to 64. From the MEDIA_VFE_STATE docs: "Starting with this configuration, the Maximum Number of Threads must be set to (#EU * 8) for GPGPU dispatches. Although there are only 7 threads per EU in the configuration, the FFTID is calculated as if there are 8 threads per EU, which in turn requires a larger amount of Scratch Space to be allocated by the driver." It's pretty clear that we need to increase this for scratch address calculations, because the FFTID has a certain bit-pattern. The quote above seems to indicate that we should increase the actual thread count programmed in MEDIA_VFE_STATE as well, but we think the intention is to only bump the scratch space. Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8. Fixes: `5ac804bd9a` ("intel: Add a preliminary device for Ice Lake") Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `b9e93db208`)	2019-10-02 09:41:27 -04:00
Jason Ekstrand	c94e5e3ee1	nir/repair_ssa: Replace the unreachable check with the phi builder In `a3268599f3`, I attempted to fix nir_repair_ssa for unreachable blocks. However, that commit missed the possibility that the use is in a block which, itself, is unreachable. In this case, we can end up in an infinite loop trying to replace a def with itself. Even though a no-op replacement is a fine operation, it keeps extending the end of the uses list as we're walking it. Instead of explicitly checking for the group of conditions, just check if the phi builder gives us a different def. That's guaranteed to be 100% reliable and, while it lacks symmetry with the is_valid checks, should be more reliable. Fixes: `a3268599` "nir/repair_ssa: Repair dominance for unreachable..." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `d63162cff0`)	2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero	c140c260c1	cherry-ignore: Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP" revert: The following commit was requested to be removed from stable branch by original author. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero	34ee0eb6cc	Revert "Revert "intel/fs: Move the scalar-region conversion to the generator."" This reverts commit `667920050a`. This commit was breaking Xorg rendering in all Icelake devices. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/795 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-02 09:41:27 -04:00
Hal Gentz	db6974fa2a	gallium/osmesa: Fix the inability to set no context as current. Currently there is no way to make no context current w/gallium + osmesa. The non-gallium version of osmesa does this if the context and buffer passed to `OSMesaMakeCurrent` are both null. This small change makes it so that this is also the case with the gallium version. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Hal Gentz <zegentzy@protonmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `57c894334e`)	2019-10-02 09:41:27 -04:00
Andres Gomez	9e754647ba	docs/features: Update VK_KHR_display_swapchain status It was set as done by mistake. Fixes: `bc15d74529` ("docs/features: Mark some Vulkan extensions as done") Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `bcd9224728`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: docs/features.txt	2019-10-02 09:41:27 -04:00
Adam Jackson	0b8c7cf51c	docs: Update bug report URLs for the gitlab migration Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `5b5c5bf833`)	2019-10-02 09:41:27 -04:00
Arcady Goldmints-Orlov	7e6dd1ce7a	anv: fix descriptor limits on gen8 Later generations support bindless for samplers, images, and buffers and thus per-stage descriptors are not limited by the binding table size. However, gen8 doesn't support bindless images and thus needs to report a lower per-stage limit so that all combinations of descriptors that fit within the advertised limits are reported as supported by vkGetDescriptorSetLayoutSupport. Fixes test dEQP-VK.api.maintenance3_check.descriptor_set Fixes: `79fb0d27f3` ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `5ec5fecc26`)	2019-10-02 09:41:27 -04:00
Tapani Pälli	1ee0add2a5	egl: check for NULL value like eglGetSyncAttribKHR does Commit `d1e1563bb6` added a NULL check for eglGetSyncAttribKHR but eglGetSyncAttrib does not do this. Patch adds same check to happen with eglGetSyncAttrib. Fixes crashes in (when exposing EGL 1.5): dEQP-EGL.functional.fence_sync.invalid.get_invalid_value Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `99cbec0a5f`)	2019-10-02 09:41:27 -04:00
Paulo Zanoni	3b0e591228	intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32 The current code can create functions with a width of 32, which is not supported by our hardware. Add some code to simplify how we express what we want and prevent such cases. For some unknown reason, all the tests I could run seem to work even with these unsupported MOVs. Fixes: `b0858c1cc6` "intel/fs: Add a couple of simple helper opcodes" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (cherry picked from commit `8e614c7a29`)	2019-10-02 09:41:27 -04:00
Eric Engestrom	5936c8f4f4	gl: drop incorrect pkg-config file for glvnd Akin to `1a25980c46` ("egl: drop incorrect pkg-config file for glvnd") and `b01524fff0` ("meson: don't build libGLES*.so with GLVND") , removes a pkg-config file that shouldn't have been there in the first place, but was needed because of that GLVND bug. Now that the glvnd bug has been fixed, it was apparent that this gl.pc pkg-config file was forgotten to be removed, so let's do just that :) Suggested-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `a1de3011f3`)	2019-10-02 09:41:27 -04:00
Juan A. Suarez Romero	8748747007	cherry-ignore: add explicit 19.3 only nominations Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-02 09:41:24 -04:00
Andres Gomez	ac31da3529	docs: Add the maximum implemented Vulkan API version in 19.1 rel notes Currently, Vulkan 1.1. Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (cherry picked from commit `d2db43fcad`)	2019-10-02 09:40:28 -04:00
Bas Nieuwenhuizen	97af29d6da	tu: Set up glsl types. Addresses this assert: deqp-vk: ../mesa-freedreno-9999/src/compiler/glsl_types.cpp:1244: static const glsl_type glsl_type::get_interface_instance(const glsl_struct_field , unsigned int, enum glsl_interface_packing, bool, const char *): Assertion `glsl_type_users > 0' failed. running dEQP-VK.api.smoke.triangle . Fixes: `624789e370` "compiler/glsl: handle case where we have multiple users for types" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `7999e10cab`)	2019-10-02 09:40:28 -04:00
Haihao Xiang	81a4483465	i965: support AYUV/XYUV for external import only Fixes: `89785e2d56` ("i965: add support for sampling from AYUV") Fixes: `7cab8d3661` ("i965: Add support for sampling from XYUV images") Cc: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `8a9b81ab9d`)	2019-10-02 09:40:28 -04:00
Samuel Iglesias Gonsálvez	00651091fa	intel/nir: do not apply the fsin and fcos trig workarounds for consts If we have fsin or fcos trigonometric operations with constant values as inputs, we will multiply the result by 0.99997 in brw_nir_apply_trig_workarounds, making the result wrong. By adjusting the rules so they do not apply to constant values, we let a later constant-folding pass deal with it. v2: - Do not early constant fold but only apply the trig workaround for non constants (Caio). - Add fixes tag to commit log (Caio). Fixes: `bfd17c76c1` "i965: Port INTEL_PRECISE_TRIG=1 to NIR." Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `3c474f8513`)	2019-10-02 09:40:28 -04:00
Tapani Pälli	d11d2c6def	iris: close screen fd on iris_destroy_screen Otherwise it never gets closed, this fixes errors seen with deqp-egl where we end up opening 1024 files. Fixes: `2dce0e94` ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `631255387f`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/gallium/drivers/iris/iris_screen.c	2019-10-02 09:40:28 -04:00
Rhys Perry	44c38ecd27	radv: always emit a position export in gs copy shaders Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `f8d0337299` ('radv: add multiple streams support for the GS copy shader') Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `ffabcbba60`)	2019-10-02 09:40:28 -04:00
Juan A. Suarez Romero	3859e211db	cherry-ignore: add explicit 19.2 only nominations Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-02 09:40:26 -04:00
Kenneth Graunke	4d45999a24	iris: Initialize ice->state.prim_mode to an invalid value It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we fail to notice an initial primitive of points being new, and fail at updating the "primitive is points or lines" field. We do not need to reset this on device loss because we're tracking the last primitive mode sent to us on the CPU via draw_vbo, not the last primitive mode sent to the GPU. Fixes several tests: - dEQP-GLES3.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner Fixes: `dcfca0af7c` ("iris: Set XY Clipping correctly.") (cherry picked from commit `c9fb704f72`)	2019-09-19 07:55:32 +00:00
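The zero-initialization problem described in commit 4d45999a24 can be illustrated with a minimal sketch. Everything here is hypothetical except the fact, stated in the commit message, that the points primitive has the value 0: the enum, the struct, and the sentinel value are stand-ins, not the iris driver's definitions.

```c
#include <stdbool.h>
#include <string.h>

enum prim_mode {
   PRIM_POINTS = 0,      /* per the commit: a zeroed field reads as "points" */
   PRIM_LINES = 1,       /* hypothetical value */
   PRIM_TRIANGLES = 2,   /* hypothetical value */
   PRIM_INVALID = 0xff   /* hypothetical sentinel, never a real primitive */
};

struct draw_state {
   enum prim_mode prim_mode;  /* last primitive mode seen on the CPU side */
   unsigned prim_updates;     /* how many times dependent state was re-derived */
};

/* Zero the state (as calloc would) and optionally install the sentinel. */
void draw_state_init(struct draw_state *s, bool use_sentinel)
{
   memset(s, 0, sizeof(*s));
   if (use_sentinel)
      s->prim_mode = PRIM_INVALID;
}

/* Returns true when the mode differs from the cached one, in which case
 * the dependent state is refreshed. */
bool draw(struct draw_state *s, enum prim_mode mode)
{
   if (s->prim_mode == mode)
      return false;
   s->prim_mode = mode;
   s->prim_updates++;
   return true;
}
```

With the zeroed state, a first draw of points is not detected as a change, so the dependent state is never refreshed; with the sentinel, the first draw of any primitive always counts as new.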
Juan A. Suarez Romero	b9d7244035	docs: add sha256 checksums for 19.1.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-17 12:53:06 +02:00
Juan A. Suarez Romero	f632aac938	docs: add release notes for 19.1.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-17 12:31:43 +02:00
Juan A. Suarez Romero	952502893a	Update version to 19.1.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-17 10:28:45 +00:00
Samuel Pitoiset	2959de1f92	radv: fix allocating number of user sgprs if streamout is used streamout_buffers is assigned after that function, so the previous fix was completely wrong. This probably fixes something when streamout buffers and push constants are used/inlined in the same shader. Fixes: `378e2d2414` ("radv: fix computing number of user SGPRs for streamout buffers") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `8137df3a46`) [Juan A. Suarez: fix the structure usage] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-13 08:18:29 +00:00
Danylo Piliaiev	30689e7da8	tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE Translating TGSI_INTERPOLATE_COLOR as INTERP_MODE_SMOOTH made it impossible for drivers to have flatshaded color inputs. Translate it to INTERP_MODE_NONE, which drivers interpret as smooth or flat depending on flatshading state. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111467 Fixes: `770faf54` ("tgsi_to_nir: Improve interpolation modes.") Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `175c32e9bd`)	2019-09-13 08:06:17 +00:00
Kenneth Graunke	9556c5b1a2	gallium: Fix util_format_get_depth_only This is a pipe format, not a boolean. Fixes: `5849e0612c` ("gallium/auxiliary: Add util_format_get_depth_only() helper.") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `c6d40b5182`)	2019-09-12 08:25:22 +00:00
Caio Marcelo de Oliveira Filho	83dadcfdc6	glsl/nir: Avoid overflow when setting max_uniform_location Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned max_uniform_location. Those unmapped uniforms don't have to be accounted at this point. Fixes: `7a9e5cdfbb` ("nir/linker: Add gl_nir_link_uniforms()") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit `4f33f96c45`)	2019-09-12 08:22:46 +00:00
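The hazard named in commit 83dadcfdc6 is that a signed -1 stored into an unsigned maximum wraps to the largest representable value. The sketch below assumes only what the commit message states (UNMAPPED_UNIFORM_LOC is -1 and max_uniform_location is unsigned); the helper name and its signature are hypothetical and do not reproduce the linker code.

```c
#include <limits.h>

#define UNMAPPED_UNIFORM_LOC (-1)   /* value given in the commit message */

/* Hypothetical helper: fold one uniform location into a running unsigned
 * maximum, skipping unmapped uniforms instead of letting -1 wrap around. */
unsigned update_max_location(unsigned max_location, int location)
{
   if (location == UNMAPPED_UNIFORM_LOC)
      return max_location;            /* not accounted at this point */
   if ((unsigned)location > max_location)
      return (unsigned)location;
   return max_location;
}

/* What an unguarded conversion would store: -1 converted to unsigned. */
unsigned unmapped_as_unsigned(void)
{
   return (unsigned)UNMAPPED_UNIFORM_LOC;   /* equals UINT_MAX */
}
```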
Mauro Rossi	1f7629a760	android: anv: libmesa_vulkan_common: add libmesa_util static dependency Change needed to fix the following building error: In file included from external/mesa/src/intel/vulkan/anv_device.c:43: external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found ^~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `4dcb1ff` ("anv: add support for driconf") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> (cherry picked from commit `ae5ac26dfa`)	2019-09-10 08:46:03 +00:00
Erik Faye-Lund	5b85ecce0b	util: fix SSE-version needed for double opcodes This code generates CVTSD2SI, which requires SSE2. So let's fix the required SSE-version. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `5de29ae` (util: try to use SSE instructions with MSVC and 32-bit gcc) Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `2ade1c5cf7`)	2019-09-10 08:43:12 +00:00
Mauro Rossi	c34a479cc3	android: amd/common: fix missing include path Fixes the following building error in Android: In file included from external/mesa/src/amd/common/ac_llvm_helper.cpp:34: In file included from external/mesa/src/amd/common/ac_llvm_build.h:30: In file included from external/mesa/src/compiler/nir/nir.h:40: In file included from external/mesa/src/compiler/nir_types.h:36: external/mesa/src/compiler/glsl_types.h:37:10: fatal error: 'main/config.h' file not found ^~~~~~~~~~~~~~~ 1 error generated. Fixes: `bd4c661` ("ac,ac/nir: use a better sync scope for shared atomics") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `bbbbea243a`)	2019-09-10 08:42:05 +00:00
Mauro Rossi	51fc954c90	android: radv: fix necessary dependecies Fixes building errors due to libmesa_util and libexpat dependencies: In file included from external/mesa/src/amd/vulkan/radv_device.c:52: external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found ^~~~~~~~~~~~~~~~~~~ 1 error generated. FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so ... external/mesa/src/util/xmlconfig.c:670: error: undefined reference to 'XML_ParserCreate' ... clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `3c2e826` ("radv: Add support for driconf.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `51e24af8fd`)	2019-09-10 08:40:02 +00:00
Juan A. Suarez Romero	b4fd0bae5c	cherry-ignore: add explicit 19.2 only nominations Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-10 08:05:27 +00:00
Jason Ekstrand	05beca4dcf	nir/dead_cf: Repair SSA if the pass makes progress The dead_cf pass calls into the CF manipulation helpers which attempt to keep NIR's SSA form sane. However, when the only break is removed from a loop, dominance gets messed up anyway because the CF SSA clean-up code only looks at phis and doesn't consider the case of code becoming unreachable. One solution to this would be to put the loop into LCSSA form before we modify any of its contents. Another (and the approach taken by this pass) is to just run the repair_ssa pass afterwards because the CF manipulation helpers are smart enough to keep all the use/def stuff sane; they just don't always preserve dominance properties. While we're here, we clean up some bogus indentation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `c832820ce9`)	2019-09-09 11:24:07 +00:00
Jason Ekstrand	d77aa3cc1c	nir/repair_ssa: Insert deref casts when needed Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `1005272a2b`)	2019-09-09 11:22:10 +00:00
Jason Ekstrand	ff8122d5a2	nir/repair_ssa: Repair dominance for unreachable blocks NIR currently assumes that unreachable blocks are trivially dominated by everything. However, when considering well-formed SSA, there is no path from any block to an unreachable block. Therefore, we can break any use-def chains where the use is in an unreachable block. This removes any dependencies on code created by uses in unreachable blocks and lets DCE do a better job of cleaning it up. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `a3268599f3`)	2019-09-09 11:19:38 +00:00
Jason Ekstrand	18005d8fd3	nir: Add a block_is_unreachable helper Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `f81a2623d8`)	2019-09-09 11:15:17 +00:00
Jason Ekstrand	01d452de58	nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `517142252f`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/compiler/nir/nir_from_ssa.c	2019-09-09 13:10:13 +02:00
Eric Engestrom	431f5a8a78	radv: add support for vk_x11_override_min_image_count Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `5eb7d48b58`)	2019-09-09 11:04:32 +00:00
Eric Engestrom	89d1ca343f	amd: move adaptive sync to performance section, as it is defined in xmlpool Fixes: `3844ed8d44` ("radv: Add adaptive_sync driconfig option and enable it by default.") Fixes: `e260493f2a` ("radeonsi: Enable adaptive_sync by default for radeon") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `4ad99ee961`)	2019-09-09 11:00:05 +00:00
Eric Engestrom	a150bb7e03	anv: add support for vk_x11_override_min_image_count Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `037b5b567f`)	2019-09-09 10:54:52 +00:00
Eric Engestrom	4d5bcb4c33	wsi: add minImageCount override Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `a72cdd00ab`)	2019-09-09 10:51:27 +00:00
Eric Engestrom	2977a3e0e1	anv: add support for driconf No option is supported yet, this is just the boilerplate. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `4dcb1fff19`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/intel/vulkan/meson.build	2019-09-09 12:31:46 +02:00
Jason Ekstrand	82edaa5a41	anv: Bump maxComputeWorkgroupSize Fixes: `9a129510f5` "anv: Bump maxComputeWorkgroupInvocations" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `3b1a7e5333`)	2019-09-09 10:22:53 +00:00
Jason Ekstrand	667920050a	Revert "intel/fs: Move the scalar-region conversion to the generator." This reverts commit `c0504569ea`. Now that we're doing interpolation lowering in NIR, we can continue to stride the FS input registers directly in the brw_fs_nir code like we did before. This fixes SIMD32 fragment shaders which broke because lower_simd_width depended on the 0 stride to split PLN instructions correctly. Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit `d15fe8ca82`)	2019-09-06 10:35:49 +00:00
Sergii Romantsov	48f78dfce2	intel/dri: finish proper glthread KWin was able to hit a NULL context in the call to intelUnbindContext, and _mesa_glthread_finish does not handle that case. It can be reproduced with these steps: 1. Create both GLX and EGL contexts 2. Make GLX current 3. Make EGL current 4. Reset the GLX context 5. Make EGL current The solution finishes the glthread context properly: the context is taken from the DRI context requested for unbinding rather than from the saved current context. Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87 Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271 Fixes: `dca36d5516` (i965: Implement threaded GL support) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `1dce75c183`)	2019-09-06 10:34:06 +00:00
Connor Abbott	d74ccd46fa	radv: Call nir_propagate_invariant() Without this, invariant qualifiers don't do anything. Together with a fix to the game, this fixes flickering in No Man's Sky. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `3f5b541fc8`)	2019-09-06 10:32:27 +00:00
Hal Gentz	35d435235a	glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX. When run in optirun, applications that linked to `libGLX.so` and then proceeded to querying Mesa for extension strings caused a SEGV in Mesa. `glXQueryExtensionsString` was calling a chain of functions that eventually led to `__glXQueryServerString`. This function would call `xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`. The latter for some unknown reason returned `NULL`. Passing this `NULL` to `xcb_glx_query_server_string_string_length` would cause a SEGV as the function tried to dereference it. The reason behind the function returning `NULL` is yet to be determined, however, simply checking that the ptr is not `NULL` resolves this. A similar check has been added to `__glXGetString` for completeness sake, although not immediately necessary. In addition to that, we stumbled into a similar problem in `AllocAndFetchScreenConfigs` which tries to access the configs to free them if `__glXQueryServerString` fails. This, of course, SEGVs, because the configs are yet to have been allocated. Simply continuing past the configs if their config ptrs are `NULL` resolves this. We also switch to `calloc` to make sure that the config ptrs are `NULL` by default, and not some uninitialized value. Cc: mesa-stable@lists.freedesktop.org Fixes: `24b8a8cfe8` "glx: implement __glXGetString, hide __glXGetStringFromServer" Fixes: `cb3610e37c` "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it " Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Hal Gentz <zegentzy@protonmail.com> (cherry picked from commit `1591d1fee5`)	2019-09-06 10:30:46 +00:00
Eric Engestrom	29159cbf21	nir: fix memleak in error path Fixes: `2cf59861a8` ("nir: Add partial redundancy elimination for compares") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `7659c6197f`)	2019-09-05 16:01:12 +00:00
Eric Engestrom	6a5c36715a	anv: fix format string in error message Fixes: `9775894f10` ("anv: Move size check from anv_bo_cache_import() to caller (v2)") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `7abf65aedc`)	2019-09-05 15:59:45 +00:00
Eric Engestrom	3bd87314e2	util/os_file: fix double-close() Fixes: `955c63d364` ("util/os_file: resize buffer to what was actually needed") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `1667360f7d`)	2019-09-05 15:57:55 +00:00
Eric Engestrom	5fcb149a46	egl: fix deadlock in malloc error path Fixes: `cb0980e69a` ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `43d470404c`) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> [Juan A. Suarez: resolve trivial conflicts] Conflicts: src/egl/main/egldriver.c	2019-09-05 16:56:04 +01:00
Eric Engestrom	524373ba99	ttn: fix 64-bit shift on 32-bit `1` Fixes: `4d0b2c7aaa` ("ttn: Update shader->info as we generate code.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `3afe9d798a`)	2019-09-05 15:49:07 +00:00
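The title of commit 524373ba99 points at a common C pitfall: the literal `1` has type int, so shifting it by 32 or more is undefined even when the result is stored in a 64-bit variable. The commit message does not show the changed line, so the helper below is a hypothetical sketch of the general fix, not the ttn code itself.

```c
#include <stdint.h>

/* Hypothetical helper: set bit `index` (0..63) in a 64-bit mask.
 * Writing `1 << index` here would shift a 32-bit int, which is undefined
 * for index >= 32; widening the constant first makes the shift 64-bit. */
uint64_t mask_set_bit(uint64_t mask, unsigned index)
{
   return mask | (UINT64_C(1) << index);
}
```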
Lionel Landwerlin	4115781efa	vulkan/overlay: bounce image back to present layout Once we write the overlay to an image to be presented, we must not forget to put it back into present layout. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `320b0f66c2`)	2019-09-03 11:37:15 +00:00
Erik Faye-Lund	d16ab58f50	gallium/auxiliary/indices: consistently apply start only to input The majority of these only apply the start argument to the input, but a few of them also apply it to the output array. util_primconvert, the only user that passes a non-zero start argument, does not expect it to be applied to the output; if it is, it will write outside of allocated memory, leading to VRAM corruption. The reason this doesn't seem to have been noticed before is that no driver currently uses util_primconvert to convert a primitive type to itself, which is the case where this was broken. But for Zink, this will no longer be true, because we need to eliminate the use of 8-bit index buffers. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `28f3f8d413` ("gallium/auxiliary/indices: add start param") Reviewed-by: Rob Clark <robdclark@chromium.org> (cherry picked from commit `52af1427c6`)	2019-09-03 11:33:57 +00:00
Juan A. Suarez Romero	4ec2325dd0	docs: add sha256 checksums for 19.1.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-03 13:04:25 +02:00
Juan A. Suarez Romero	85c8f88a49	docs: add release notes for 19.1.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-03 12:02:19 +02:00
Juan A. Suarez Romero	d45f8ff429	Update version to 19.1.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-03 09:56:49 +00:00
Pierre-Eric Pelloux-Prayer	52aea45dbc	glsl: replace 'x + (-x)' with constant 0 This fixes a hang in shadertoy for radeonsi where a buffer was initialized with: value -= value with value being undefined. In this case LLVM replaces the operation with an assignment to NaN. Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241 Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `47cc660d9c`)	2019-08-30 07:39:55 +00:00
Ian Romanick	938adab8ea	intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware See the previous commit for the explanation of the Fixes tag. Hurts 21 shaders in shader-db. All of the hurt shaders are in Unreal Engine 4 tech demos. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `7afa26d4e3` ("nir: Add lowering for nir_op_bitfield_reverse.") (cherry picked from commit `b418269d7d`) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> [Juan A. Suarez: resolve trivial conflicts] Conflicts: src/intel/compiler/brw_compiler.c	2019-08-29 12:04:34 +02:00
Ian Romanick	759afcacd9	nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled This caused a problem on Sandybridge where an open-coded bitfieldReverse() function could be optimized to a nir_op_bitfield_reverse that would generate an unsupported BFREV instruction in the backend. This was encountered in some Unreal4 tech demos in shader-db. The bug was not previously noticed because we don't actually try to run those demos on Sandybridge. The fixes tag is a bit of a lie. The actual bug was introduced about 26,000 commits earlier in `371c4b3c48` ("nir: Recognize open-coded bitfield_reverse."). Without the NIR lowering pass, the flag needed to avoid the optimization does not exist. Hopefully nobody will care to fix this on an earlier Mesa release. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `7afa26d4e3` ("nir: Add lowering for nir_op_bitfield_reverse.") (cherry picked from commit `d3fd1c761a`)	2019-08-29 09:51:14 +00:00
Kenneth Graunke	48a671e269	intel/compiler: Fix src0/desc setter ordering src0 vstride and type overlap with bits of the extended descriptor. brw_set_desc() also sets the extended descriptor to 0. So by setting the descriptor, then setting src0, we were accidentally setting a bunch of extended descriptor bits unintentionally. When using this infrastructure for framebuffer writes (in a future patch), this ended up setting the extended descriptor bit 20, which is "Null Render Target" on Icelake, causing nothing to be written to the framebuffer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `c8c9c48684`)	2019-08-29 09:30:42 +00:00
Kenneth Graunke	6138702dec	mesa: Fix _mesa_float_to_unorm() on 32-bit systems. This fixes the following CTS test on 32-bit systems: GTF-GL46.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_init It does glGetTexImage of a 16-bit SNORM image, requesting 32-bit UNORM data. In get_tex_rgba_uncompressed, we round trip through float to handle image transfer ops for clamping. _mesa_format_convert does: _mesa_float_to_unorm(0.571428597f, 32) which translated to: _mesa_lroundevenf(0.571428597f * 0xffffffffu) which produced different results on 64-bit and 32-bit systems: 64-bit: result = 0x92492500 32-bit: result = 0x80000000 This is because the size of "long" varies between the two systems, and 0x92492500 is too large to fit in a signed 32-bit integer. To fix this, we switch to the new _mesa_i64roundevenf function which always does the 64-bit operation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104395 Fixes: `594fc0f859` ("mesa: Replace F_TO_I() with _mesa_lroundevenf().") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `e18cd5452a`)	2019-08-28 08:27:34 +00:00
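The arithmetic in commit 6138702dec can be reproduced with a small sketch. The function names below are hypothetical stand-ins, and the rounding helper rounds half up through a double rather than half to even as the helper named in the commit does; the point is only that the intermediate integer must be 64 bits wide regardless of the size of `long`.

```c
#include <stdint.h>

/* Hypothetical stand-in for a rounding helper that always returns 64 bits.
 * Simplification: rounds half up for non-negative input, whereas the helper
 * named in the commit rounds half to even. */
int64_t round_to_i64(float v)
{
   return (int64_t)((double)v + 0.5);
}

/* Sketch of a float -> 32-bit UNORM conversion, assuming x is not NaN. */
uint32_t float_to_unorm32(float x)
{
   if (x <= 0.0f)
      return 0;
   if (x >= 1.0f)
      return UINT32_MAX;
   /* The scaled value can reach 2^32 - 256, far above INT32_MAX, so a
    * 32-bit signed intermediate cannot hold it. */
   return (uint32_t)round_to_i64(x * (float)UINT32_MAX);
}
```

Under these assumptions, `float_to_unorm32(0.571428597f)` yields `0x92492500`, the 64-bit result quoted in the commit message; that value is larger than `INT32_MAX`, which is why a 32-bit `long` intermediate cannot represent it.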
Kenneth Graunke	68bd0c7b9d	util: Add a _mesa_i64roundevenf() helper. This always returns a int64_t, translating to _mesa_lroundevenf on systems where long is 64-bit, and llrintf where "long long" is needed. Fixes: `594fc0f859` ("mesa: Replace F_TO_I() with _mesa_lroundevenf().") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `b59914e179`)	2019-08-28 08:22:58 +00:00
Marek Olšák	915a272b5a	radeonsi: fix scratch buffer WAVESIZE setting leading to corruption Cc: 19.2 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (cherry picked from commit `360cf3c4b0`)	2019-08-28 08:19:30 +00:00
Paulo Zanoni	e4df7ffc23	intel/fs: grab fail_msg from v32 instead of v16 when v32->run_cs fails Looks like a copy/paste error. This patch prevents a segfault when running the following on BDW: INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \ dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4 For the curious, the message we're getting is: CS compile failed: Failure to register allocate. Reduce number of live scalar values to avoid this. Fixes: `864737ce6c` ("i965/fs: Build 32-wide compute shader when needed.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (cherry picked from commit `848d5e444a`) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> [Juan A. Suarez: resolve trivial conflicts] Conflicts: src/intel/compiler/brw_fs.cpp	2019-08-27 10:58:48 +02:00
Jonas Ådahl	955c54cea0	wayland/egl: Ensure correct buffer size when allocating Whenever a buffer is allocated, e.g. by the first draw call or EGL call after a buffer swap, make sure the size is up to date. Prior to this commit, we failed to do so when querying the buffer age, or swapping buffers without any prior EGL call or draw call. Signed-off-by: Jonas Ådahl <jadahl@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `903ad59407`)	2019-08-26 18:39:59 +02:00
Andres Rodriguez	c1959aa26d	radv: additional query fixes Make sure we read the updated data from the gpu in cases where WAIT_BIT is not set. Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `a410823b3e`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/amd/vulkan/radv_query.c	2019-08-26 13:30:15 +02:00
Kenneth Graunke	fc8e419619	iris: Fix large timeout handling in rel2abs() ...by copying the implementation of anv_get_absolute_timeout(). Appears to fix a CTS test with 32-bit builds: GTF-GL46.gtf32.GL3Tests.sync.sync_functionality_clientwaitsync_flush Fixes: `f459c56be6` ("iris: Add fence support using drm_syncobj") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> (cherry picked from commit `7ee7b0ecbc`)	2019-08-26 09:58:42 +00:00
Tapani Pälli	1c9c540b2a	egl: reset blob cache set/get functions on terminate Fixes errors seen with eglSetBlobCacheFuncsANDROID on Android when running dEQP that terminates and reinitializes a display. Fixes: `6f5b57093b` "egl: add support for EGL_ANDROID_blob_cache" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `3e03a3fc53`)	2019-08-26 09:58:42 +00:00
Juan A. Suarez Romero	5369eedf37	cherry-ignore: iris: Avoid unnecessary resolves on transfer maps Fixes: The following commit depends on commits `77a1070d36` and `df4c2ec5e1` in order to compile, which did not land in the branch. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-26 09:58:42 +00:00
Kenneth Graunke	4d3dc92628	iris: Drop copy format hacks from copy region based transfer path. This doesn't work for compressed formats, as the source texture and temporary texture would have different block sizes. (Forcing the driver to always take the GPU path would expose the bug.) Instead, just use the source format for the temporary, and let blorp_copy deal with overrides. The one case where we can't do this is ASTC, because isl won't let us create a linear ASTC surface. Fall back to the CPU paths there for now. Fixes: `9d1334d2a0` ("iris: Use copy_region and staging resources to avoid transfer stalls") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (cherry picked from commit `136629a1e3`)	2019-08-26 09:58:42 +00:00
Juan A. Suarez Romero	4f4a38289b	cherry-ignore: iris: Update fast clear colors on Gen9 with direct immediate writes. Fixes: This commit does not apply cleanly on 19.1 branch, as it depends on other commits not present in the branch. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-26 09:58:42 +00:00
Kenneth Graunke	8ad62264d1	iris: Fix broken aux.possible/sampler_usages bitmask handling For renderable surfaces, we allocate SURFACE_STATEs for each bit in res->aux.possible_usages. Sampler views use res->aux.sampler_usages. When pinning buffers, we call surf_state_offset_for_aux() to calculate the offset to the desired surface state. surf_state_offset_for_aux() took an aux_modes parameter, which should be one of those two fields. However...it was not using that parameter. It always used the broader res->aux.possible_usages field directly. One of the callers, update_clear_value(), was passing incorrect masks for this parameter. It iterated through the bits in order, using u_bit_scan(), which destructively modifies the mask. So each time we called it, the count of bits before our selected mode was 0, which would cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE, rather than updating each in turn. This was hidden by the earlier bug where surf_state_offset_for_aux() ignored the parameter. Fixes: `7339660e80` ("iris: Add aux.sampler_usages.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (cherry picked from commit `117a0368b0`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/gallium/drivers/iris/iris_state.c	2019-08-26 09:58:42 +00:00
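The mask-handling bug described in commit 8ad62264d1 can be sketched as follows. The functions are hypothetical stand-ins (a destructive bit scan and a "count the set bits below this mode" lookup), written only to show why passing the already-consumed mask always selects slot 0; they are not the iris implementation.

```c
/* Stand-in for a destructive bit scan: returns the index of the lowest set
 * bit and clears it from *mask. Assumes *mask != 0. */
int bit_scan(unsigned *mask)
{
   int i = 0;
   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* Slot index of `mode`: the number of set bits in `modes` below bit `mode`.
 * Assumes mode < 32. */
unsigned slot_for_mode(unsigned modes, unsigned mode)
{
   unsigned below = modes & ((1u << mode) - 1u);
   unsigned count = 0;
   while (below) {
      below &= below - 1u;   /* clear the lowest set bit */
      count++;
   }
   return count;
}

/* Visit every mode in `modes` and record the slot chosen for it. When
 * pass_consumed_mask is non-zero, the lookup receives the mask that the scan
 * has already stripped, which mimics the bug described in the commit. */
unsigned collect_slots(unsigned modes, int pass_consumed_mask, unsigned *out)
{
   unsigned remaining = modes;
   unsigned n = 0;
   while (remaining) {
      unsigned mode = (unsigned)bit_scan(&remaining);
      out[n++] = slot_for_mode(pass_consumed_mask ? remaining : modes, mode);
   }
   return n;
}
```

For a mask with bits 0, 1, and 3 set, looking up against the original mask visits slots 0, 1, 2, while looking up against the consumed mask lands on slot 0 every time, because no bits below the selected mode remain.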
Juan A. Suarez Romero	c5a3f783d2	cherry-ignore: iris: Replace devinfo->gen with GEN_GEN Fixes: This commit does not apply cleanly on 19.1 branch, as it depends on other commits not present in the branch. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-26 09:58:42 +00:00
Juan A. Suarez Romero	fb69feb0b5	cherry-ignore: add explicit 19.2 only nominations Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-26 09:58:42 +00:00
Danylo Piliaiev	61fb6bca53	nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll Without loop_prepare_for_unroll loops are losing phis. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411 Fixes: `5db98195` "nir: add loop unroll support for wrapper loops" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `84b3ef6a96`)	2019-08-23 11:55:04 +00:00
Ilia Mirkin	ac0f71a4af	gallium/vl: use compute preference for all multimedia, not just blit The compute paths in vl are a bit AMD-specific. For example, they (on nouveau), try to use a BGRX8 image format, which is not supported. Fixing all this is probably possible, but since the compute paths aren't in any way better, it's difficult to care. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Fixes: `9364d66cb7` (gallium/auxiliary/vl: Add video compositor compute shader render) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `958390a9bf`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/gallium/auxiliary/util/u_screen.c src/gallium/docs/source/screen.rst src/gallium/drivers/radeonsi/si_get.c src/gallium/include/pipe/p_defines.h	2019-08-23 13:48:50 +02:00
Daniel Schürmann	41e8b0d027	nir/lcssa: handle deref instructions properly Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `414148cdc1` "nir: Support deref instructions in loop_analyze" (cherry picked from commit `204846ad06`)	2019-08-23 11:42:10 +00:00
Juan A. Suarez Romero	ae2a676cd1	docs: add sha256 checksums for 19.1.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-23 12:38:02 +02:00
Juan A. Suarez Romero	a384fe0ceb	docs: add release notes for 19.1.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-23 12:24:21 +02:00
Juan A. Suarez Romero	6c37279d09	Update version to 19.1.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-23 10:20:54 +00:00
Marek Olšák	9862fc4941	radeonsi: fix an assertion failure: assert(!res->b.is_shared) This only appears to happen on Raven2. Possible way to reproduce: resource_get_handle(WINSYS_HANDLE_TYPE_KMS) --> sets is_shared = true resource_get_handle(WINSYS_HANDLE_TYPE_DMABUF) --> fail Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `8d0d753bd0`)	2019-08-20 09:30:34 +00:00
Greg V	9c9b92c69a	intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `134e750e16` ("i965: extract performance query metrics") (cherry picked from commit `ac1561088d`)	2019-08-10 09:31:43 +00:00
Greg V	a8105085e9	anv: remove unused Linux-specific include Fixes: `4201cc2dd3` ("anv: Implement VK_KHX_external_semaphore_fd") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `2be3f16600`)	2019-08-10 09:30:21 +00:00
Danylo Piliaiev	3627595e3d	i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D There is an object-level preemption workaround which requires this. However, even without object-level preemption, we seem to have issues with geometry flickering when 3D and compute are combined in the same batch and this appears to fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395 Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `b8842bc312`)	2019-08-10 09:29:01 +00:00
Bas Nieuwenhuizen	c4ab0e18bb	radv: Avoid VEGA/RAVEN scissor bug in binning. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `23a9d20997`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/amd/vulkan/radv_pipeline.c	2019-08-10 11:27:14 +02:00
Bas Nieuwenhuizen	908d85ffce	radv: Avoid binning RAVEN hangs. Mirroring radeonsi. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `4a3f987afd`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/amd/vulkan/radv_pipeline.c	2019-08-10 11:21:23 +02:00
Erik Faye-Lund	a9cbcf09be	gallium/dump: add missing query-type to short-list Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `3f6b3d9db7` ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE") Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `da9e2958ec`)	2019-08-08 10:32:20 +00:00
Erik Faye-Lund	2f7b1159bd	gallium/dump: add missing query-type to short-list Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `a677799e51` ("gallium: add PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE and corresponding cap") Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `70a93922db`)	2019-08-08 10:30:50 +00:00
Eric Engestrom	d38952ef0d	util: fix mem leak of program path Fixes: `759b940389` ("util: Get program name based on path when possible") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `5b10ddf358`)	2019-08-08 10:28:03 +00:00
Matt Turner	945a217e94	meson: Test for program_invocation_name program_invocation_name and program_invocation_short_name are both GNU extensions. I don't believe one can exist without the other, so only check for program_invocation_name. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `c9b86cf526`)	2019-08-08 10:10:21 +00:00
Marek Olšák	f837d0a6a3	radeonsi: disable SDMA image copies on dGPUs to fix corruption in games Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (cherry picked from commit `6b3ee86989`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/gallium/drivers/radeonsi/cik_sdma.c	2019-08-08 12:04:18 +02:00
Bas Nieuwenhuizen	f0aa11b054	ac/nir: Use correct cast for readfirstlane and ptrs. Fixes: `028ce527` "radv: Add non-uniform indexing lowering." Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `2af00b1fdd`)	2019-08-08 10:01:00 +00:00
Bas Nieuwenhuizen	3a7d0d760f	radv: Do non-uniform lowering before bool lowering. Since it can introduce comparisons. Fixes: `028ce52739` "radv: Add non-uniform indexing lowering." Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `2301b2e029`)	2019-08-08 09:59:15 +00:00
Jason Ekstrand	84e3025387	anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D There is an object-level preemption workaround which requires this. However, even without object-level preemption, we seem to have issues with geometry flickering when 3D and compute are combined in the same batch and this appears to fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109630 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111267 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `bc612536eb`)	2019-08-08 09:56:23 +00:00
Juan A. Suarez Romero	f70c6dda43	cherry-ignore: panfrost: Make ctx->job useful Fixes: This commit does not apply cleanly on 19.1 branch, as it depends on other commits not present in the branch. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-08 09:53:12 +00:00
Sergii Romantsov	c9d9ad2e9f	i965/clear: clear_value better precision Test-case with depth-clear 0.5 and format MESA_FORMAT_Z24_UNORM_X8_UINT fails due to an inconsistent clear-value of 0.4999997. Maybe it's better to improve? CC: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `0ae9ce0f29` (i965/clear: Quantize the depth clear value based on the format) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111113 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `a86eccfb78`)	2019-08-07 17:23:42 +00:00
Juan A. Suarez Romero	7fcb69a33c	docs: add sha256 checksums for 19.1.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-07 18:49:02 +02:00
Juan A. Suarez Romero	b84ffa028d	docs: add release notes for 19.1.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-07 18:38:23 +02:00
Juan A. Suarez Romero	53cc3e8f7e	Update version to 19.1.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-07 16:33:26 +00:00
Tapani Pälli	83815a97d5	mesa: add glsl_type ref to one_time_init and decref to atexit This fixes problems spotted within vk-gl-cts. Problem is that the builtin functions refer to types and we should not release types before builtins are released. Fixes: `624789e370` ("compiler/glsl: handle case where we have multiple users for types") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-06 16:31:31 +03:00
Francisco Jerez	59cb919ff2	intel/ir: Fix CFG corruption in opt_predicated_break(). Specifically the optimization of a conditional BREAK + WHILE sequence into a conditional WHILE seems pretty broken. The list of successors of "earlier_block" (where the conditional BREAK was found) is emptied and then re-created with the same edges for no apparent reason. On top of that the list of predecessors of the block immediately after the WHILE loop is emptied, but only one of the original edges will be added back, which means that potentially several blocks that still have it on their list of successors won't be on its list of predecessors anymore, causing all sorts of hilarity due to the inconsistency in the control flow graph. The solution is to remove the code that's removing valid edges from the CFG. cfg_t::remove_block() will already clean up after itself. The assert in bblock_t::combine_with() also needs to be removed since we will be merging a block with multiple children into the first one of them. Found the issue on a hardware enabling branch originally, but apparently somebody reproduced the same problem independently on master in the meantime. Fixes: `d13bcdb3a9` ("i965/fs: Extend predicated break pass to predicate WHILE.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009 Cc: jiradet.jd@gmail.com Cc: Sergii Romantsov <sergii.romantsov@globallogic.com> Cc: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org Tested-by: Paul Chelombitko <qamonstergl@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `54fbc625ea`)	2019-08-02 07:00:31 +00:00
Eric Engestrom	8f3935b1ac	nir: remove explicit nir_intrinsic_index_flag values These were left after a rebase and happen to make NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it was noticed. Fixes: `6f20643b47` ("nir: Allow qualifiers on copy_deref and image instructions") Cc: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `5d7bcac4e7`)	2019-08-01 07:59:12 +00:00
Emil Velikov	b4f52b1567	egl/drm: ensure the backing gbm is set before using it Currently, if we error out before gbm_dri is set (say due to a different name of the backing GBM implementation, or otherwise) the tear down will trigger a NULL ptr deref and crash out. Move the gbm_dri initialization as early as possible. v2: Drop check in dri2_teardown_drm (Eric) Reported-by: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `72b97ad9b2`)	2019-08-01 07:57:55 +00:00
Jason Ekstrand	a42361cdb2	intel/fs: Implement quad_swap_horizontal with a swizzle on gen7 This fixes dEQP-VK.subgroups.quad.compute.subgroupquadswaphorizontal_* on all gen7 platforms. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `8fd2f2c276`)	2019-07-31 08:12:46 +00:00
Jason Ekstrand	f522c7ca9e	intel/fs: Use ALIGN16 instructions for all derivatives on gen <= 7 The issue here was discovered by a set of Vulkan CTS tests: dEQP-VK.glsl.derivate.*.dynamic_* These tests use ballot ops to construct a branch condition that takes the same path for each 2x2 quad but may not be uniform across the whole subgroup. They then test that derivatives work and give the correct value even when executed inside such a branch. Because the derivative isn't executed in uniform control-flow and the values coming into the derivative aren't smooth (or worse, linear), they nicely catch bugs that aren't uncovered by simpler derivative tests. Unfortunately, these tests require Vulkan and the equivalent GL test would require the GL_ARB_shader_ballot extension which requires int64. Because the requirements for these tests are so high, it's not easy to test on older hardware and the bug is only proven to exist on gen7; gen4-6 are a conjecture. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `499d760c6e`)	2019-07-31 08:06:48 +00:00
Eric Engestrom	ac7f03caed	scons+meson: suppress spammy build warning on MacOS Originally introduced in `c7f3657450` ("darwin: Suppress type conversion warnings for GLhandleARB") to fix Bugzilla #66346 [1], this workaround was never ported to Scons or Meson. [1] https://bugs.freedesktop.org/66346 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (cherry picked from commit `bf8b5de6b9`)	2019-07-31 08:02:16 +00:00
Bas Nieuwenhuizen	b1d66aa9ee	radv: Fix descriptor set allocation failure. Set all the handles to VK_NULL_HANDLE: "If the creation of any of those descriptor sets fails, then the implementation must destroy all successfully created descriptor set objects from this command, set all entries of the pDescriptorSets array to VK_NULL_HANDLE and return the error." (Vulkan 1.1.117 Spec, section 13.2) CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `2b53c49d2f`)	2019-07-31 08:00:27 +00:00
Lionel Landwerlin	d06ccdf9dd	spirv: don't discard access set by vtn_pointer_dereference We can have an access flag already set here, so just augment the existing ones. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0fb61dfdeb` ("spirv: propagate access qualifiers through ssa & pointer") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `7deb5ec0e8`)	2019-07-31 07:58:43 +00:00
Andres Rodriguez	ad72ce1ad7	radv: fix queries with WAIT_BIT returning VK_NOT_READY When vkGetQueryPoolResults() is called with VK_QUERY_RESULT_WAIT_BIT set, the driver is supposed to wait for the query to become available before returning. Currently, radv returns once the query is indeed ready, but it returns VK_NOT_READY. It also fails to populate the results. The problem is a missing volatile in the secondary check for query availability. This patch removes the secondary check altogether since it is redundant with the preceding loop. This bug was found with an unreleased version of SteamVR. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `2b71b4e793`)	2019-07-31 07:52:52 +00:00
Andrii Simiklit	23eebaf2ec	meson: add a warning for meson < 0.46.0 This could help somebody notice the meson issue: https://github.com/mesonbuild/meson/pull/3274 As a result of that issue, NDEBUG won't be defined even if b_ndebug is true and buildtype is release. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109791 Cc: mesa-stable@lists.freedesktop.org Acked-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-07-30 11:44:55 +03:00
Eric Anholt	3ec136d583	freedreno: Fix data races with allocating/freeing struct ir3. There is a single ir3_compiler in the screen, and each context may be compiling ir3 shaders, which call ir3_create. ralloc doesn't do any locking on its own, so eventually you can end up racing to break ralloc's linked lists. We really don't want struct ir3 to live as long as the compiler (maybe struct ir3_shader's lifetime, if anything), so you'd better be freeing it anyway. Fixes: `8fe2076243` ("freedreno/ir3: convert over to ralloc") Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `6e3b220ad3`)	2019-07-30 08:33:26 +00:00
Bas Nieuwenhuizen	8fbadb152c	radv: Take variable descriptor counts into account for buffer entries. Fixes: `b5e04e9217` "radv: Support allocating variable size descriptor sets." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `aac492901a`)	2019-07-30 08:32:07 +00:00
Jason Ekstrand	b1df082b00	anv: Don't claim support for 24 and 48-bit formats on IVB Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `99d04a5bd6`)	2019-07-30 08:31:00 +00:00
Jason Ekstrand	66ee5bd082	isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW On Haswell, the format works but it doesn't properly do an sRGB decode. It appears to act identically to R8G8B8_UNORM. Only Vulkan uses this format so this only affects Vulkan on HSW. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric@engestrom.ch> (cherry picked from commit `7c1b39cf18`)	2019-07-30 08:29:37 +00:00
Rhys Perry	7364cb04c5	ac/nir: fix txf_ms with an offset Seems to fix some hair artifacts in Max Payne 3: https://github.com/daniel-schuermann/mesa/issues/76 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `f4e499ec79` ('radv: add initial non-conformant radv vulkan driver') Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `a9f58af454`)	2019-07-30 08:27:47 +00:00
Lionel Landwerlin	83d17d5730	spirv: propagate access qualifiers through ssa & pointer Not only variables can be flagged as NonUniformEXT but also expressions. We're currently ignoring it in an expression such as: imageLoad(data[nonuniformEXT(rIndex)], 0) The associated SPIRV: OpDecorate %69 NonUniformEXT ... %69 = OpLoad %61 %68 This change propagates access qualifiers through ssa & pointers so that when it hits an OpLoad/OpStore style instruction, qualifiers are not forgotten. Fixes failures in the following tests: dEQP-VK.descriptor_indexing.* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8ed583fe52` ("spirv: Handle the NonUniformEXT decoration") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `0fb61dfdeb`)	2019-07-30 08:25:46 +00:00
Lionel Landwerlin	0801a8b906	spirv: wrap push ssa/pointer values This refactor allows common code to apply decorations to all ssa/pointer values. In particular, this will allow propagating access qualifiers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `86b53770e1`) [Lionel Landwerlin: patch adapted for 19.1 branch]	2019-07-30 08:23:19 +00:00
Connor Abbott	e1fdca7492	nir: Allow qualifiers on copy_deref and image instructions In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `6f20643b47`)	2019-07-30 08:18:49 +00:00
Caio Marcelo de Oliveira Filho	57fc7a23e1	anv: Remove special allocation for anv_push_constants The key reason for that mechanism is gone: all the extra optional data that could be in the anv_push_constants was moved elsewhere. At this point, just put anv_push_constants directly in anv_cmd_state (part of anv_cmd_buffer). v2: Remove a NULL check we don't need anymore in anv_cmd_buffer_push_constants(). (Lionel) Fix size we consider for valid push params. (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `f7d53fffa2`)	2019-07-30 08:18:49 +00:00
Ilia Mirkin	630a2e4d97	nv50/ir: handle insn not being there for definition of CVT arg This can happen if it's e.g. a uniform or a function argument. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `3e468ff2fe`)	2019-07-29 11:08:14 +00:00
Ilia Mirkin	5f640b4692	nvc0: allow a non-user buffer to be bound at position 0 Previously the code only handled it for positions 1 and up (as would be for UBO's in GL). It's not a lot of trouble to handle this, and vl or vdpau want this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `9f8ed5aa67`)	2019-07-29 11:01:18 +00:00
Ilia Mirkin	645462fe85	nv50,nvc0: update sampler/view bind functions to accept NULL array Apparently vl (or vdpau) wants to pass that in now. Handle it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `c52b057e00`)	2019-07-29 10:56:22 +00:00
Ilia Mirkin	e671e68238	gallium/vl: fix compute tgsi shaders to not process undefined components This caused nouveau's function handling logic to think that the MAIN function was due to receive external parameters, and cascaded some failures after that. Instead avoid having the undefined components in the first place. Fixes: `f6ac0b5d71` (gallium/auxiliary/vl: Add compute shader to support video compositor render) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `face27fdc5`)	2019-07-29 10:53:54 +00:00
Boyuan Zhang	b521c3c0c8	radeon/vcn: enable rate control for hevc encoding Set cu_qp_delta_enable_flag on when rate control is enabled, and set it off when rate control is disabled (e.g. constant qp). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: fix typo and add bugzilla info Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> (cherry picked from commit `b0626c1f30`)	2019-07-29 10:51:48 +00:00
Boyuan Zhang	5c7cffe1d4	radeon/uvd: enable rate control for hevc encoding Set cu_qp_delta_enable_flag on when rate control is enabled, and set it off when rate control is disabled (e.g. constant qp). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: fix typo and add bugzilla info Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> (cherry picked from commit `5115c25bb8`)	2019-07-29 10:50:33 +00:00
Boyuan Zhang	e2568bc6e4	radeon/vcn: fix poc for hevc encode MaxPicOrderCntLsb should be at least 16 according to the spec, therefore add minimum value check. Also use poc value passed from st instead of calculation in slice header encoding. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: Fix typo V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb should be power of 2 according to spec. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> (cherry picked from commit `9aaf3aaf5d`)	2019-07-29 10:47:57 +00:00
Boyuan Zhang	7470b25b2b	radeon/uvd: fix poc for hevc encode MaxPicOrderCntLsb should be at least 16 according to the spec, therefore add minimum value check. Also use poc value passed from st instead of calculation in slice header encoding. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: Fix typo V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb should be power of 2 according to spec. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> (cherry picked from commit `77cf700fa3`)	2019-07-29 10:46:12 +00:00
Lionel Landwerlin	c45c624dce	nir: add access to image_deref intrinsics SPIRV added the ability to access variables and have expressions non dynamically uniform and because spirv_to_nir generates deref instructions, we'll need to have that access there. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `8c330728f3`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/compiler/nir/nir.c	2019-07-29 10:23:45 +02:00
Mark Menzynski	2098b48fa0	nvc0/ir: Fix assert accessing null pointer Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111007 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111167 Signed-off-by: Mark Menzynski <mmenzyns@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.klausmann@freenet.de> (cherry picked from commit `7493fbf032`)	2019-07-26 15:10:50 +00:00
Jason Ekstrand	eb24e60cdc	anv: Disable transform feedback on gen7 It's totally implementable, it's just that the plumbing is a bit different and we never hooked it up. Don't advertise a broken feature. Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback" (cherry picked from commit `295e5a17da`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/intel/vulkan/anv_extensions.py	2019-07-26 09:35:42 +02:00
Bas Nieuwenhuizen	204a36f270	radv: Set correct metadata size for GFX9+. Without correct size, radeonsi assumes the metadata is incorrect, which can and will cause issues. Since the metadata is really incorrect without the size, let us fix that. Fixes: `e43cc3e3af` "radv/gfx9: handle GFX9 opaque metadata" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `7e1fe81f56`)	2019-07-26 07:32:34 +00:00
Arcady Goldmints-Orlov	2329b87ff4	anv: report HOST_ALLOCATION as supported for images Report VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT as supported for images. It was being shown supported for buffers, but not images. Fixes: `69cc6272fb` ("anv: Implement VK_EXT_external_memory_host") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `832cedfdee`)	2019-07-26 07:29:18 +00:00
Daniel Schürmann	742f348d32	spirv: Fix order of barriers in SpvOpControlBarrier Semantically, the memory barrier has to come first to wait for the completion of pending memory requests. Afterwards, the workgroups can be synchronized. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `e352b4d650`)	2019-07-25 09:41:06 +00:00
Nicolas Dufresne	327a6b3a64	egl: Also query modifiers when exporting DMABuf This fixes eglExportDMABUFImageQueryMESA() so it will report the modifiers of the underlying image. Without this information, re-importing will likely be broken as it is rare these days that no modifiers are used. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Daniel Stone <daniels@collabora.com> Fixes: `8f7338f284` ("egl: add initial EGL_MESA_image_dma_buf_export v2.4") (cherry picked from commit `08f1cefecd`)	2019-07-25 09:02:00 +00:00
Yevhenii Kolesnikov	4bb56fdd46	main: Fix memleaks in mesa_use_program Add freeing of SubroutineIndexes to the _mesa_free_shader_state. Fixes: `4566aaaa5b` ("mesa/subroutines: start adding per-context subroutine index support (v1.1)") Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `882fe09a74`)	2019-07-25 08:34:56 +00:00
Andrii Simiklit	61117d653e	intel/compiler: don't use a keyword struct for a class fs_reg warning: struct 'fs_reg' was previously declared as a class Fixes: `e64be391` ("intel/compiler: generalize the combine constants pass") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> (cherry picked from commit `fa2fc68de1`)	2019-07-25 08:14:29 +00:00
Eric Engestrom	97cfb89b73	gallium+mesa: fix tgsi_semantic array type Fixes: `ed23335a31` ("gallium: use enums in p_shader_tokens.h (v2)") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `e7e31b18d6`)	2019-07-24 10:38:00 +00:00
Eric Engestrom	d4a64ad09b	util: fix no-op macro (bad number of arguments) Fixes: `b8e077daee` ("util: no-op __builtin_types_compatible_p() for non-GCC compilers") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `f986741a91`)	2019-07-24 10:29:18 +00:00
Dylan Baker	e9a284e8d0	meson: allow building all glx without any drivers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111016 Fixes: `a47c525f32` ("meson: build glx") Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `7cf50af6f5`)	2019-07-24 10:09:37 +00:00
Lionel Landwerlin	dccd75b60c	anv: fix use of comma operator This doesn't fix any bug at the moment because the next statement is 'true' which happens to be APIMODE_D3D, but if that changes it could. The Fixes tag is as far back as I could go, but the error predates it (2016 is probably far enough). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8db6f2e6eb` ("anv/pipeline: Roll genX_pipeline_util.h into genX_pipeline.c") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `772a5f9814`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/intel/vulkan/genX_pipeline.c	2019-07-24 12:06:57 +02:00
Eric Engestrom	aff5714c65	nir: don't return void Fixes: `14531d676b` ("nir: make nir_const_value scalar") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> (cherry picked from commit `3acc4278ad`)	2019-07-24 10:02:48 +00:00
Dave Airlie	9305d9b142	st/nir: fix arb fragment stage conversion The comment even justifies the wrongness wrongly. We should be translating to pipe values properly here or else fragment maps to tess ctrl. Fixes: `3d7611e9a6` ("st/nir: use NIR for asm programs") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `365f24705f`)	2019-07-23 11:50:52 +00:00
Kenneth Graunke	2570ee28f5	egl: Only expose 565 pbuffer configs if X can export them as DRI3 images Glamor in xorg-server 1.20 cannot expose 16bpp pixmaps when running in the usual 24bpp mode. This meant our 565 pbuffer configs would ultimately fail to create a backing pixmap, leading to crashes. To hack around this, make a 16bpp pixmap and try and export it. If it works, expose the configs. Otherwise, just skip them. This also disables them on DRI2. These configs were only added to pass conformance requirements, and I doubt anybody cares about testing out 565 pbuffer visuals on DRI2-only drivers. v2: Don't leak the fds (caught by Eric Anholt) v3: Don't free(fds), it's not malloc'd Fixes: `dacb11a585` ("egl: Add a 565 pbuffer-only EGL config under X11.") Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `82607f8a90`)	2019-07-23 11:49:37 +00:00
Kenneth Graunke	f8c0b90f99	egl: Make the 565 pbuffer-only config single buffered. In commit `dacb11a585`, Eric found the first matching 565 pbuffer config, and stopped. Our double-buffered configs come first in the list, so we added that, making a pbuffer-only config that claimed to be double buffered. This doesn't make sense, since pixmaps/pbuffers are fundamentally not double buffered. When using that config, every call to eglCreatePbufferSurface would fail with EGL_BAD_MATCH. The call chain looks like this: - eglCreatePbufferSurface - dri3_create_pbuffer_surface - dri3_create_surface - dri2_get_dri_config which eventually does: const bool double_buffer = surface_type == EGL_WINDOW_BIT; and then fails to find a matching config, because it ends up looking for a single-buffered config - and there aren't any. To fix this, make the 565 pbuffer config single-buffered. This fixes at least 51 dEQP-EGL.* tests. Fixes: `dacb11a585` ("egl: Add a 565 pbuffer-only EGL config under X11.") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `6ad31c4ff3`)	2019-07-23 11:47:55 +00:00
Kenneth Graunke	43f62d2003	egl: Quiet warning about front buffer rendering for pixmaps/pbuffers pbuffer configs cause a million of these warnings to trigger, but when using pixmaps or pbuffers, there is only one surface, so this warning doesn't make much sense. Retain it for window surfaces for now. Fixes: `dacb11a585` ("egl: Add a 565 pbuffer-only EGL config under X11.") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `fc21394bc4`)	2019-07-23 11:46:43 +00:00
Kenneth Graunke	be12174820	mesa: Fix ReadBuffers with pbuffers pbuffers are internally single-buffered. Marek fixed DrawBuffers to handle this case, but we need to fix ReadBuffers too. Otherwise, pretty much every conformance test fails because glReadPixels breaks. v2: Refactor the switch into a helper (suggested by Eric Anholt) Fixes: `35294f2eca` ("mesa: fix pbuffers because internally they are front buffers") Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `78164a3a6c`)	2019-07-23 11:45:31 +00:00
Jason Ekstrand	3cd11985c0	intel/fs: Stop stack allocating large arrays Normally, we haven't worried too much about stack sizes as Linux tends to be fairly friendly towards large stacks. However, when running DXVK apps under wine, we're suddenly subject to Windows' more stringent stack limitations and can run out of space more easily. In particular, some of the shaders in Elite Dangerous: Horizons have quite a few registers and the arrays in split_virtual_grfs are large enough to blow a 1 MiB stack leading to crashes during shader compilation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108662 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `fa63fad333`)	2019-07-23 11:43:54 +00:00
Nataraj Deshpande	87efbe488e	egl/android: Update color_buffers querying for buffer age color_buffers[] is currently hard coded to 3 for android which fails in droid_window_dequeue_buffer when ANativeWindow creates color_buffers >3 while querying buffer age during dEQP partial_update tests on chromeOS. The patch removes static color_buffers[], queries for MIN_UNDEQUEUED_BUFFERS, sets native window buffer count and allocates the correct number of color_buffers as per android. Fixes dEQP-EGL.functional.partial_update* tests on chromebooks with enabling EGL_KHR_partial_update. v2: update comment instead of removing (Eric Engestrom) v3: change static array to dynamic allocated color_buffers querying MIN_UNDEQUEUED_BUFFERS (Chia-I Wu olv@chromium.org) Fixes: `2acc69da8c` "EGL/Android: Add EGL_EXT_buffer_age extension" Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (cherry picked from commit `0661c357c6`)	2019-07-23 11:42:30 +00:00
Samuel Pitoiset	e1800b20f4	radv: fix crash in vkCmdClearAttachments with unused attachment depth_stencil_attachment and/or ds_resolve attachment can be NULL. This fixes crashes with dEQP-VK.renderpass.suballocation.unused_clear_attachments.* Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `b5116d3cb7`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/amd/vulkan/radv_meta_clear.c	2019-07-23 13:40:12 +02:00
Juan A. Suarez Romero	33e57d0ace	docs: add sha256 checksums for 19.1.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-23 11:18:10 +00:00
Juan A. Suarez Romero	09a1b2bdba	docs: add release notes for 19.1.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-23 11:07:52 +00:00
Juan A. Suarez Romero	58e93aef96	Update version to 19.1.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-23 11:04:20 +00:00
Dave Airlie	f17ff71f49	radv: fix crash in shader tracing. Enabling tracing, and then having a vmfault, can lead to a segfault before we print out the traces, as if a meta shader is executing and we don't have the NIR for it. Just pass the stage and give back a default. Fixes: `9b9ccee4d6` ("radv: take LDS into account for compute shader occupancy stats") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `2ac2b98780`)	2019-07-19 08:40:05 +00:00
Samuel Pitoiset	d86b14ecbb	radv: fix VGT_GS_MODE if VS uses the primitive ID Found by inspection. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `63d670e350`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/amd/vulkan/radv_pipeline.c	2019-07-19 10:38:00 +02:00
Samuel Iglesias Gonsálvez	900bcab48b	anv: fix alphaToCoverage when there is no color attachment There are tests in CTS for alpha to coverage without a color attachment that are failing. This happens because we remove the shader color outputs when we don't have a valid color attachment for them, but when alpha to coverage is enabled we still want to preserve the output at location 0 since we need the alpha component. In that case we will also need to create a null render target for RT 0. v2: - We already create a null rt when we don't have any, so reuse that for this case (Jason) - Simplify the code a bit (Iago) v3: - Take alpha to coverage from the key and don't tie this to depth-only rendering only, we want the same behavior if we have multiple render targets but the one at location 0 is not used. (Jason). - Rewrite commit message (Iago) v4: - Make sure we take into account the array length of the shader outputs, which we were not handling correctly either and make sure we also create null render targets for any invalid array entries too. v5: - Simplify removal of unused outputs by using rt_used[] so we don't have to special case alpha to coverage there too. Fixes the following CTS tests: dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.* Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `bc66cebc0d`)	2019-07-18 16:30:25 +00:00
Lionel Landwerlin	0b1ee72bbc	anv: fix format mapping for depth/stencil formats anv_format is supposed to have a pointer back to the associated VkFormat, but we missed this for depth/stencil formats. This doesn't fix anything afaict, but will be needed for future changes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `465de47bad` ("anv: associate vulkan formats with aspects") Acked-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `3adc32df92`)	2019-07-18 08:36:51 +00:00
Lepton Wu	3dea2e2ffc	virgl: Set meta data for textures from handle. The set of meta data was removed by commit `8083464`. It broke lots of dEQP tests when running with pbuffer surface type. Fixes: `8083464013` ("virgl: remove dead code") Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (cherry picked from commit `6109df58e4`)	2019-07-18 08:35:40 +00:00
Bas Nieuwenhuizen	1527d02acb	radv: Only save the descriptor set if we have one. After reset, if valid does not contain the relevant bit the descriptor can be != NULL but still not be valid. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `f1a8967344`)	2019-07-18 08:32:06 +00:00
Lionel Landwerlin	d578b42e34	anv: report timestampComputeAndGraphics true Spec says : "timestampComputeAndGraphics specifies support for timestamps on all graphics and compute queues. If this limit is set to VK_TRUE, all queues that advertise the VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT in the VkQueueFamilyProperties::queueFlags support VkQueueFamilyProperties::timestampValidBits of at least 36." On gen7+ this should be true (we only have 32bits of timestamp on gen6 and below). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `802f00219a` ("anv/device: Update features and limits") Reported-by: Timothy Strelchun <timothy.strelchun@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `ce4c5474af`)	2019-07-18 08:29:59 +00:00
Lionel Landwerlin	a612f0210a	vulkan/wsi: update swapchain status on vkQueuePresent With the following chain of events : vkQueuePresent() <- Surface resize vkQueuePresent() We should be able to report SUBOPTIMAL or OUT_OF_DATE on the second vkQueuePresent() call. Currently we only look at X11 events in the vkAcquireNextImage() path so we're not able to report this. This change checks the queue of events and process any available ones to update the swapchain status. v2: Be consistent about reporting the current error state of the swapchain (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111097 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `6f880f128f`)	2019-07-18 08:27:58 +00:00
Jason Ekstrand	7a072f1f39	nir/loop_analyze: Properly handle swizzles in loop conditions This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for all intermediate values so that we can properly handle swizzles. Even though if conditions are required to be scalars, they may still consume swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your loop termination condition. The old code would just bail the moment it saw its first non-zero swizzle but we can now properly chase the scalar from the if condition all the way to a, b, and c. Shader-db results on Kaby Lake: total loops in shared programs: 4388 -> 4364 (-0.55%) loops in affected programs: 29 -> 5 (-82.76%) helped: 29 HURT: 5 Shader-db results on Haswell: total loops in shared programs: 4370 -> 4373 (0.07%) loops in affected programs: 2 -> 5 (150.00%) helped: 2 HURT: 5 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `ff972c7a3a`)	2019-07-18 08:24:56 +00:00
Jason Ekstrand	b685e303f7	nir: Add some helpers for chasing SSA values properly There are various cases in which we want to chase SSA values through ALU ops ranging from hand-written optimizations to back-end translation code. In all these cases, it can be very tricky to do properly because of swizzles. This set of helpers lets you easily work with a single component of an SSA def and chase through ALU ops safely. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `8f7405ed9d`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/compiler/nir/nir.h	2019-07-18 08:22:26 +00:00
Jason Ekstrand	b9b376b821	nir/loop_analyze: Refactor detection of limit vars This commit reworks both get_induction_and_limit_vars() and try_find_trip_count_vars_in_iand to return true on success and not modify their output parameters on failure. This makes their callers significantly simpler. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `0333649e63`)	2019-07-18 08:20:12 +00:00
Gert Wollny	fde2473a06	softpipe: Remove unused static function Thanks to Eric Engestrom for pointing out that there was something wrong with that function. Fixes: `724a73509e` softpipe: Prepare handling explicit gradients Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `9c611fb381`)	2019-07-17 08:22:59 +00:00
Jason Ekstrand	b43e2d5a12	nir/regs_to_ssa: Handle regs in phi sources properly Sources of phi instructions act as if they occur at the very end of the predecessor block not the block in which the phi lives. In order to handle them correctly, we have to skip phi sources on the normal instruction walk and handle them as a separate walk over the successor phis. While registers in phi instructions is a bit of an oddity it can happen when we temporarily go out-of-SSA for control-flow manipulations. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `6fb685fe4b`)	2019-07-17 08:17:29 +00:00
Yevhenii Kolesnikov	cffebf6f57	meta: leaking of BO with DrawPixels ctx->Unpack.BufferObj wasn't unreferenced. Fixes: `d492e7b017` (meta: Fix invalid PBO access from DrawPixels when trying to just alloc.) CC: Eric Anholt <eric@anholt.net> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `3853871ef8`)	2019-07-17 08:14:46 +00:00
Jason Ekstrand	3a27a5b989	anv: Account for dynamic stencil write disables in the PMA fix In `6ce8592836` we started looking at the dynamic stencil state and disabling stencil writes when the stencil mask is zero. Unfortunately, we never updated the PMA fix code accordingly so 3DSTATE_WM_DEPTH_STENCIL and the PMA fix were getting out-of-sync causing hangs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109203 Fixes: `6ce8592836` "anv: Disable stencil writes when both write..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `6a441151c2`)	2019-07-17 08:12:37 +00:00
Sergii Romantsov	43682f0c6f	meta: memory leak of CopyPixels usage Meta of CopyPixel generates a buffer object but does not free it on cleanup. Fixes: `37d11b13ce` (meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `7417b43211`)	2019-07-17 08:10:41 +00:00
Caio Marcelo de Oliveira Filho	6ba4ce97b7	spirv: Fix stride calculation when lowering Workgroup to offsets Use alignment to calculate the stride associated with the pointer types. That stride is used when the pointers are cast to arrays. Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b; } will have an element size of 12 bytes, but the stride needs to be 16 bytes to respect the 8 byte alignment. Fixes: `050eb6389a` "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `026cfa1099`)	2019-07-16 07:55:10 +00:00
Jason Ekstrand	f24507425b	nir,intel: Add support for lowering 64-bit nir_opt_extract_* We need this when doing full software 64-bit emulation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309 Fixes: `cbad201c2b` "nir/algebraic: Add missing 64-bit extract_[iu]8..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `0ba508d7a3`)	2019-07-16 07:47:37 +00:00
Jason Ekstrand	cad015acb5	nir/opt_if: Clean up single-src phis in opt_if_loop_terminator Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071 Fixes: `2a74296f24` "nir: add opt_if_loop_terminator()" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `7a19e05e8c`)	2019-07-16 07:36:27 +00:00
Bas Nieuwenhuizen	2c1e3692b8	anv: Add android dependencies on android. Specifically needed for nativewindow for some VK_EXT_external_memory_android_hardware_buffers functions, where we call into some AHardwareBuffer functions. The legacy Android ext did not have us call into any Android function at all and hence it was not noticed. Fixes: `755c633b8d` "anv: Fix vulkan build in meson." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> (cherry picked from commit `d4f0f1a6e2`)	2019-07-16 07:34:36 +00:00
Lionel Landwerlin	fa9ba5e19e	anv: fix crash in vkCmdClearAttachments with unused attachment anv_render_pass_compile() turns an unused attachment into a NULL depth_stencil_attachment pointer so check that pointer before accessing it. Found with updates to existing CTS tests. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `208be8eafa` ("anv: Make subpass::depth_stencil_attachment a pointer") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> (cherry picked from commit `c9c8c2f7d7`)	2019-07-16 07:32:45 +00:00
Vinson Lee	6df891afa6	meson: Add dep_thread dependency. Fix this build error on Ubuntu 18.04. /usr/bin/ld: src/util/libmesa_util.a(u_cpu_detect.c.o): undefined reference to symbol 'pthread_once@@GLIBC_2.2.5' Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110663 Suggested-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Eric Engestrom <eric@engestrom.ch> (cherry picked from commit `730ceeddb5`)	2019-07-15 17:31:08 +00:00
Eric Anholt	17dc693590	freedreno: Fix assertion failures in context setup in shader-db mode. Cherry-picks `a0d4d7febf` upstream The TTN path needs access to the screen to make the right decisions about lowering, but we didn't have pctx->screen set up at fdN_prog_init time. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-07-15 12:43:36 +02:00
Caio Marcelo de Oliveira Filho	14a2fba722	anv: Fix pool allocator when first alloc needs to grow When using softpin, the first allocation was not calculating the padding and offset correctly for the case where the first allocation needed to grow. We were failing to initialize the state.end right after expanding the pool for the first time. This is not a problem for non-softpin since there we don't use leftover padding so the ends would re-arrange incrementally. This fixes running dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 in SKL -- the test uses a shader larger than the initial size for the instruction pool. Fixes: `dfc9ab2ccd` "anv/allocator: Add padding information." Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `09c4037dda`)	2019-07-15 10:28:02 +00:00
Timothy Arceri	e4b7aa9e74	mesa: save/restore SSO flag when using ARB_get_program_binary Without this the restored program will fail the pipeline validation checks when we attempt to use an SSO program. Fixes: `c20fd744fe` ("mesa: Add Mesa ARB_get_program_binary helper functions") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111010 (cherry picked from commit `3043908ccb`)	2019-07-15 10:22:42 +00:00
Jason Ekstrand	24e7db0a36	anv: Set Stateless Data Port Access MOCS This is the MOCS setting used for the A64 stateless messages which we sometimes use for SSBO operations. Fixes: `48ed2a7bb0` "anv: Implement VK_EXT_buffer_device_address" Fixes: `79fb0d27f3` "anv: Implement SSBOs bindings with GPU addr..." Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `6a2ff217b8`)	2019-07-15 10:19:55 +00:00
Jason Ekstrand	28aec04659	nir/loop_analyze: Bail if we encounter swizzles None of the current code knows what to do with swizzles. Take the safe option for now and bail if we see one. This does have a small shader-db impact but it is at least safe. Shader-db results on Kaby Lake: total loops in shared programs: 4364 -> 4388 (0.55%) loops in affected programs: 5 -> 29 (480.00%) helped: 5 HURT: 29 Shader-db results on Haswell: total loops in shared programs: 4373 -> 4370 (-0.07%) loops in affected programs: 5 -> 2 (-60.00%) helped: 5 HURT: 2 Fixes: `6772a17acc` "nir: Add a loop analysis pass" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `9a3cb6f5fe`)	2019-07-15 10:17:31 +00:00
Jason Ekstrand	0b540a702a	nir/loop_analyze: Handle bit sizes correctly in calculate_iterations The current code assumes everything is 32-bit which is very likely true but not guaranteed by any means. Instead, use nir_eval_const_opcode to do the calculations in a bit-size-agnostic way. We also use the new constant constructors to build the correct size constants. Fixes: `6772a17acc` "nir: Add a loop analysis pass" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `268ad47c11`)	2019-07-15 10:14:43 +00:00
Jason Ekstrand	afaec581a8	nir: Add more helpers for working with const values Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `ce5581e23e`)	2019-07-15 10:09:44 +00:00
Jason Ekstrand	f5e70045e1	nir/loop_analyze: Fix phi-of-identical-alu detection One issue was that the original version didn't check that swizzles matched when comparing ALU instructions so it could end up matching very different instructions. Using the nir_instrs_equal function from nir_instr_set.c which we use for CSE should be much more reliable. Another was that the loop assumes it will only run two iterations which may not be true. If there's something which guarantees that this case only happens for phis after ifs, it wasn't documented. Fixes: `9e6b39e1d5` "nir: detect more induction variables" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `9f7ffe41dd`)	2019-07-15 10:00:59 +00:00
Jason Ekstrand	d76ab7d9fb	nir/instr_set: Expose nir_instrs_equal() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `6e984bcb92`)	2019-07-15 09:57:17 +00:00
Connor Abbott	8bc7397e02	nir: Add a helper to determine if an intrinsic can be reordered This is simple now, but we're going to be adding a few more conditions to this later. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `a1c737927c`)	2019-07-15 09:34:37 +00:00
Marek Olšák	83c4597f19	radeonsi: don't set READ_ONLY for const_uploader to fix bindless texture hangs Bindless textures can update descriptors with WRITE_DATA. Cc: 19.1 <mesa-stable@lists.freedesktop.org> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie airlied@redhat.com (cherry picked from commit `5058d62b05`)	2019-07-10 11:00:51 +00:00
Lionel Landwerlin	1e3b877903	vulkan/overlay: fix crash on freeing NULL command buffer It is legal to call vkFreeCommandBuffers() on NULL command buffers. This fix requires `eb41ce1b01` ("util/hash_table: Properly handle the NULL key in hash_table_u64"). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4438188f49` ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `a72351cc76`)	2019-07-09 10:32:19 +00:00
Ian Romanick	87fc035c53	mesa: Set minimum possible GLSL version Set the absolute minimum possible GLSL version. API_OPENGL_CORE can mean an OpenGL 3.0 forward-compatible context, so that implies a minimum possible version of 1.30. Otherwise, the minimum possible version is 1.20. Since Mesa unconditionally advertises GL_ARB_shading_language_100 and GL_ARB_shader_objects, every driver has GLSL 1.20... even if they don't advertise any extensions to enable any shader stages (e.g., GL_ARB_vertex_shader). Converts about 2,500 piglit tests from crash to skip on NV18. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110955 Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `0349bc3ce2`)	2019-07-09 10:29:50 +00:00
Ian Romanick	47d6b60127	nir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size This is important because, for example nir_op_fne has dest.dest.ssa.bit_size == 1, but the source operands can be 16-, 32-, or 64-bits. Fixing this helps partial redundancy elimination for compares in a few more shaders. v2: Add unit tests for nir_opt_comparison_pre that are fixed by this commit. All Intel platforms had similar results. total instructions in shared programs: 17179408 -> 17179081 (<.01%) instructions in affected programs: 43958 -> 43631 (-0.74%) helped: 118 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 2.87 x̃: 2 helped stats (rel) min: 0.06% max: 4.12% x̄: 1.19% x̃: 0.81% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 5.83% max: 6.06% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for instructions value: -3.08 -2.37 95% mean confidence interval for instructions %-change: -1.30% -0.85% Instructions are helped. total cycles in shared programs: 360959066 -> 360942386 (<.01%) cycles in affected programs: 774274 -> 757594 (-2.15%) helped: 111 HURT: 4 helped stats (abs) min: 1 max: 1591 x̄: 169.49 x̃: 36 helped stats (rel) min: <.01% max: 24.43% x̄: 8.86% x̃: 2.24% HURT stats (abs) min: 1 max: 2068 x̄: 533.25 x̃: 32 HURT stats (rel) min: 0.02% max: 5.10% x̄: 3.06% x̃: 3.56% 95% mean confidence interval for cycles value: -200.61 -89.47 95% mean confidence interval for cycles %-change: -10.32% -6.58% Cycles are helped. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1] Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `be1cc3552b` ("nir: Add nir_const_value_negative_equal") (cherry picked from commit `0ac5ff9ecb`)	2019-07-09 10:23:12 +00:00
Ian Romanick	fb2c5dd98f	nir: Add unit tests for nir_opt_comparison_pre Each test has a comment with the expected before and after NIR. The tests don't actually check this. The tests only check whether or not the optimization pass reported progress. I couldn't think of a robust, future-proof way to check the before and after code. Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `b08d704051`)	2019-07-09 10:18:37 +00:00
Ian Romanick	f6c032c615	intel/vec4: Reswizzle VF immediates too Previously, an instruction like mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F] would get rewritten as mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F] The latter does not produce the correct result. The VF immediate in the second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F]. This commit produces the former. Fixes: `1ee1d8ab46` ("i965/vec4: Reswizzle sources when necessary.") Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `47c2aa5b48`)	2019-07-09 10:14:02 +00:00
Chia-I Wu	e9e63bfba8	anv: fix VkExternalBufferProperties for host allocation It was reported as unsupported previously. It should be importable and is compatible with itself. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Fixes: `69cc6272fb` ("anv: Implement VK_EXT_external_memory_host") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `5824130389`)	2019-07-09 10:12:40 +00:00
Chia-I Wu	84f76533e4	anv: fix VkExternalBufferProperties for unsupported handles compatibleHandleTypes must include the queried handle type. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `f3c7a02a62`)	2019-07-09 10:11:32 +00:00
Bas Nieuwenhuizen	e0d44fd4fe	radv: Handle cmask being disallowed by addrlib. alignment=0 does weird things with align64. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `e46b41b3ae`)	2019-07-09 10:10:28 +00:00
Lionel Landwerlin	5666f3b891	vulkan/overlay: fix command buffer stats Begin/Reset of command buffer both reset the content of the command buffer. Don't forget to wipe them on Begin. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4438188f49` ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit") Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `8f0f727fe4`)	2019-07-09 10:09:07 +00:00
Juan A. Suarez Romero	e42399f4de	docs: add sha256 checksums for 19.1.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-09 09:18:55 +00:00
Juan A. Suarez Romero	fe1f7b538b	docs: add release notes for 19.1.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-09 09:09:53 +00:00
Juan A. Suarez Romero	eea0045458	Update version to 19.1.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-09 09:04:10 +00:00
Jason Ekstrand	77598ddfac	iris: Use a uint16_t for key sizes sizeof(struct brw_vs_prog_key) == 324. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `4633298fd6`)	2019-07-05 08:47:31 +00:00
Bas Nieuwenhuizen	50c3dcd2f8	radv: Fix interactions between variable descriptor count and inline uniform blocks. Fixes: `d7e6541cc7` "radv: Only allocate supplied number of descriptors when variable." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `8a053254b8`)	2019-07-04 10:36:29 +02:00
Juan A. Suarez Romero	202eb29e55	intel: fix wrong format usage Do not use the view format when filling the surface state. Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.* Fixes: `fb1350c76f` ("intel: Add and use helpers for level0 extent") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `e06bc0b166`)	2019-07-04 10:35:16 +02:00
Caio Marcelo de Oliveira Filho	95cfcc3b43	spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1): For objects in the Uniform, StorageBuffer, or PushConstant storage classes, the element’s address or location is calculated using a stride, which will be the Base-type’s Array Stride when the Base type is decorated with ArrayStride. For all other objects, the implementation will calculate the element’s address or location. For non-CL shaders the driver should layout the Workgroup storage class, so override any explicitly set ArrayStride in the shader. This currently fixes only the lower_workgroup_access_to_offsets case, which is used by anv. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> (cherry picked from commit `050eb6389a`)	2019-07-03 10:13:10 +02:00
Arfrever Frehtes Taifersar Arahesis	cb3072488c	meson: Improve detection of Python when using Meson >=0.50. Previously, on systems where multiple versions of Python 3 (e.g. 3.6 and 3.7) are installed, wrong version of Python 3 could have been used. The proper fix requires availability of path() method in Meson's python module, which has been added in Meson 0.50: https://github.com/mesonbuild/meson/pull/4616 Distro Bug: https://bugs.gentoo.org/671308 Signed-off-by: Arfrever Frehtes Taifersar Arahesis <Arfrever@Apache.Org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> v2: - Add missing `endif` keyword (Dylan) (cherry picked from commit `b120a02b21`)	2019-07-02 10:12:55 +02:00
Jory Pratt	3d0e6d3cff	meson: Search for execinfo.h Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for execinfo.h presence, just check directly. This allows the build to work on musl. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `10e8d46601`)	2019-07-02 09:57:34 +02:00
Jory Pratt	6dca27fce6	util: Heap-allocate 256K zlib buffer The disk cache code tries to allocate a 256 Kbyte buffer on the stack. Since musl only gives 80 Kbyte of stack space per thread, this causes a trap. See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size (In musl-1.1.21 the default stack size has increased to 128K) [mattst88]: Original author unknown, but I think this is small enough that it is not copyrightable. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `fd7b7f14d8`)	2019-07-02 09:56:18 +02:00
Bas Nieuwenhuizen	334f0d3ead	radv: Only allocate supplied number of descriptors when variable. Fixes: `b5e04e9217` "radv: Support allocating variable size descriptor sets." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `d7e6541cc7`)	2019-07-02 09:53:58 +02:00
James Clarke	515f4b2f20	meson: GNU/kFreeBSD has DRM/KMS and requires -D_GNU_SOURCE This is a regression from the old autotools build system. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> (cherry picked from commit `7389bf9761`)	2019-07-01 11:43:32 +02:00
Gert Wollny	05af010f77	vl: Use CS composite shader only if TEX_LZ and DIV are supported Enable the compute shader compositor only when TEX_LZ is supported by the driver. v2: Also check whether DIV is supported. https://bugs.freedesktop.org/show_bug.cgi?id=110783 Fixes: `9364d66cb7` gallium/auxiliary/vl: Add video compositor compute shader render Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `75d8b4e795`)	2019-07-01 11:18:04 +02:00
Gert Wollny	5cfbe55184	gallium: Add CAP for opcode DIV Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able to check this. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `843723e2f7`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/gallium/docs/source/screen.rst src/gallium/include/pipe/p_defines.h	2019-07-01 10:34:27 +02:00
Lionel Landwerlin	d14939925e	intel/compiler: don't use byte operands for src1 on ICL The simulator complains about using byte operands, we also have documentation telling us. Note that add operations on bytes seems to work fine on HW (like ADD). Using dwords operands with CMP & SEL fixes the following tests : dEQP-VK.spirv_assembly.type.vec.i8. v2: Drop the GLK changes (Matt) Add validator tests (Matt) v3: Drop GLK ref (Matt) Don't mix float/integer in MAD (Matt) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com> BSpec: 3017 Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5847de6e9a`)	2019-07-01 10:13:43 +02:00
Dylan Baker	38dab50ec8	Revert "meson: Add support for using cmake for finding LLVM" This reverts commit `5157a42765`. There is a meson bug that causes llvm to always be statically linked, which is obviously not what we want. I haven't had time to look into it yet, but for now let's just revert it. (cherry picked from commit `97c2c4546c`)	2019-07-01 10:00:58 +02:00
Anuj Phogat	16ba6fecb2	Revert "iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch" SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace. This patch silences a simulator warning about it. We don't need to add this workaround in linux kernel as the WA description says it's fixed on latest stepping. This reverts commit `9c421d6b47`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `d96cba7754`)	2019-07-01 09:59:00 +02:00
Anuj Phogat	1bcdc5b4a6	Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch" SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace. This patch silences a simulator warning about it. We don't need to add this workaround in linux kernel as the WA description says it's fixed on latest stepping. This reverts commit `2be60e0c73`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `387e43b52f`)	2019-07-01 09:57:42 +02:00
Anuj Phogat	e17b17c2f5	Revert "i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch" SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace. This patch silences a simulator warning about it. We don't need to add this workaround in linux kernel as the WA description says it's fixed on latest stepping. This reverts commit `85ecd14ef6`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `7746d4edef`)	2019-07-01 09:55:01 +02:00
Pierre-Eric Pelloux-Prayer	ac3c9a4195	radeon/uvd: fix calc_ctx_size_h265_main10 Left shift was applied twice. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110702 Reviewed-by: Leo Liu <leo.liu@amd.com> Tested-by: <irherder@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c81c784a4a`)	2019-07-01 09:53:20 +02:00
Pierre-Eric Pelloux-Prayer	22b21623f3	mesa: delete framebuffer texture attachment sampler views When a context is destroyed the destroy_tex_sampler_cb makes sure that all the sampler views created by that context are destroyed. This is done by walking the ctx->Shared->TexObjects hash table. In a multiple context environment the texture can be deleted by a different context, so it will be removed from the TexObjects table and will prevent the above mechanism from working. This can result in an assertion in st_save_zombie_sampler_view because the sampler_view owns a reference to a destroyed context. This issue occurs in blender 2.80. This commit fixes this by explicitly releasing sampler_view created by the destroyed context for all texture attachments. Fixes: `593e36f956` (st/mesa: implement "zombie" sampler views (v2)) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110944 Signed-off-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `c37f03d464`)	2019-07-01 09:51:44 +02:00
Eric Engestrom	f6c959afaa	meson: bump required libdrm version to 2.4.81 `dbb4457d98` started using drmDevicesEqual(), which was introduced in libdrm 2.4.81. We could either copy the function locally, or bump the required version. Since the function is non-trivial and 2.4.81 is old enough already, I suggest the latter. Fixes: `dbb4457d98` ("egl: add EGL_EXT_device_drm support") Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `5819bc0e5c`)	2019-07-01 09:49:38 +02:00
Samuel Pitoiset	adbf808e0c	radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+ These two extensions are supported on GFX8 but the throughput of 16-bit floats/integers is same as 32-bit. Also, shaderInt16 is only enabled on GFX9+ for the same reason, be more consistent. This fixes a crash with Wolfenstein II because it expects shaderInt16 to be enabled when VK_AMD_gpu_shader_half_float is exposed. Note that AMDVLK only enables these extensions on GFX9+. Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `ef1787dbc9`) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/amd/vulkan/radv_extensions.py	2019-06-28 10:13:48 +02:00
Kenneth Graunke	d6b1b9158e	gallium: Make util_copy_image_view handle shader_access A while back, we added a new field, but failed to update the copier. I believe iris is the only current user of the new field, and it hasn't used the copier, so no one noticed. Fixes: `8b626a22b2` st/mesa: Record shader access qualifiers for images Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `255c71ec07`)	2019-06-28 10:06:02 +02:00
Nanley Chery	211bedcf4d	isl: Don't align phys_level0_sa by block dimension Aligning phys_level0_sa by the compression block dimension prior to mipmap layout causes the layout of compressed surfaces to differ from the sampler's expectations in certain cases. The hardware docs agree: From the BDW PRM, Vol. 5, Compressed Mipmap Layout, The compressed mipmaps are stored in a similar fashion to uncompressed mipmaps [...] The following exceptions apply to the layout of compressed (vs. uncompressed) mipmaps: * [...] * The dimensions of the mip maps are first determined by applying the sizing algorithm presented in Non-Power-of-Two Mipmaps above. Then, if necessary, they are padded out to compression block boundaries. The last bullet indicates that alignment should not be done for calculating a miplevel's dimensions, but rather for determining miplevel placement/padding. Comply with this text by removing the extra alignment. Fixes some fbo-generatemipmap-formats piglit failures on all tested platforms (SNB-KBL). v2: - Note fixed platforms. - Update some consumers via a helper function. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `02f6995d76`)	2019-06-28 10:03:42 +02:00
Nanley Chery	eef57b818b	intel: Add and use helpers for level0 extent Prepare for a bug fix by adding and using helpers which convert isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of surface elements. v2: - Update iris (Ken). - Update anv. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `fb1350c76f`)	2019-06-28 10:00:53 +02:00
Kenneth Graunke	97b43a8160	iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS This makes CompressedTexSubImage from a PBO source do proper GPU rendering to upload instead of stalling to map the PBO source on the CPU (then copying it on the CPU). Thanks Bas Nieuwenhuizen for pointing out that Vulkan includes this functionality, and to Jason Ekstrand for writing the code I adapted. Vulkan only supports a single layer, however, and this code tries to support multiple layers as long as it's miplevel 0. Improves performance in Sid Meier's Civilization VI: Average frame time (ms): -3.67423% +/- 1.46201% (n=5) 99th percentile frame time (ms): -5.09910% +/- 3.87874% (n=5) (cherry picked from commit `a032a9665f`)	2019-06-28 09:59:05 +02:00
Dylan Baker	421aa4d162	meson: Add support for using cmake for finding LLVM Meson has support for using cmake as a finder for some dependencies, including LLVM. Using cmake has a lot of advantages: it needs less meson maintenance to keep working (even for llvm updates); it works more sanely for cross compiles (as llvm-config is a compiled binary not a shell script). Meson 0.51.0 also has a new generic variable getter that can be used to get information from either cmake, pkg-config, or config-tools dependencies, which is needed for cmake. We continue to support using llvm-config if you don't have cmake installed, or if cmake cannot find a suitable version. Fixes: `0d59459432` ("meson: Force the use of config-tool for llvm") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `5157a42765`)	2019-06-28 09:45:31 +02:00
Lionel Landwerlin	a0a6df95b4	intel/compiler: fix derivative on y axis implementation This rewrites the ddy in EXECUTE_4 mode with a loop to make it more obvious what is going on and also sets the group each of the 4 threads in the groups are supposed to execute. Fixes the following CTS tests : dEQP-VK.glsl.derivate.dfdyfine.dynamic_* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Co-Authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `2134ea3800` ("intel/compiler/fs: Implement ddy without using align16 for Gen11+") (cherry picked from commit `836225840c`)	2019-06-28 09:43:15 +02:00
Sagar Ghuge	6fbe0eea26	glsl: Fix round64 conversion function Fix round64 function to handle round to nearest even cases specially with positive and negative numbers with fraction part 0.5. v2: 1) Simplify unused bits (Elie Tournier) Fixes: KHR-GL45.gpu_shader_fp64.builtin.round_dvec2 KHR-GL45.gpu_shader_fp64.builtin.round_dvec3 KHR-GL45.gpu_shader_fp64.builtin.round_dvec4 KHR-GL45.gpu_shader_fp64.builtin.roundeven_double KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec2 KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec3 KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec4 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `06807e1948`)	2019-06-26 17:32:00 +00:00
Sergii Romantsov	3e1c46f233	i965: leaking of upload-BO with push constants If any of the VS members uses_firstvertex, uses_baseinstance, uses_drawid, or uses_is_indexed_draw is enabled, a leak may happen. The call to gen6_upload_push_constants allocates stage_stat->push_const_bo. It then takes a pointer from push_const_bo to draw_params_bo (in the call brw_prepare_shader_draw_parameters by brw_upload_data) and takes a reference which is never unreferenced. Fixes leak: 136 bytes in 1 blocks are definitely lost in loss record 6 of 13 at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xC2B64B7: bo_alloc_internal (brw_bufmgr.c:596) by 0xC2B6748: brw_bo_alloc (brw_bufmgr.c:672) by 0xC314BB3: brw_upload_space (intel_upload.c:88) by 0xC2EBBC5: gen6_upload_push_constants (gen6_constant_state.c:155) by 0xC9E4FA6: gen9_upload_vs_push_constants (genX_state_upload.c:3300) by 0xC2E0EDA: check_and_emit_atom (brw_state_upload.c:540) by 0xC2E0EDA: brw_upload_pipeline_state (brw_state_upload.c:659) by 0xC2E0FF1: brw_upload_render_state (brw_state_upload.c:681) by 0xC2C5D2D: brw_draw_single_prim (brw_draw.c:1052) by 0xC2C62CB: brw_draw_prims (brw_draw.c:1175) by 0xC488AD1: vbo_exec_vtx_flush (vbo_exec_draw.c:386) by 0xC485270: vbo_exec_FlushVertices_internal (vbo_exec_api.c:652) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> (cherry picked from commit `1931c97a1d`)	2019-06-26 08:17:11 +00:00
Jason Ekstrand	77962816a5	anv/descriptor_set: Only write texture swizzles if we have an image view When immutable samplers are set we call write_image_view with a NULL image view. This causes issues on IVB where we have to fake texture swizzling. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999 Fixes: `d2aa65eb18` "anv: Emulate texture swizzle in the shader when..." (cherry picked from commit `0a364a4a74`)	2019-06-26 07:16:56 +00:00
Ville Syrjälä	970cc023b0	anv/cmd_buffer: Reuse gen8 Cmd{Set, Reset}Event on gen7 Modern DXVK requires event support [1], but looks like it only uses vkCmdSetEvent() + vkGetEventStatus(). So we can just borrow the relevant code from gen8, leaving CmdWaitEvents still unimplemented. [1] `8c3900c533` v2: Also move CmdWaitEvents into genX_cmd_buffer.c (Jason) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `6230bfeb65`)	2019-06-25 16:06:33 +00:00
Rob Clark	2e83a64f64	freedreno/a5xx: fix batch leak in fd5 blitter path Fixes: `3d198926a4` freedreno: use fd_bc_alloc_batch instead of fd_batch_create. Signed-off-by: Rob Clark <robdclark@chromium.org> (cherry picked from commit `927fb50727`)	2019-06-25 11:47:59 +00:00
Ian Romanick	f59881898f	glsl: Don't increase the iteration count when there are no terminators Incrementing the iteration count was intended to fix an off-by-one error when the first terminator was superseded by a later terminator. If there is no first terminator or later terminator, there is no off-by-one error. Incrementing the loop count creates one. This can be seen in loops like: do { if (something) { // No breaks or continues here. } } while (false); Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Abel Briggs <abelbriggs1@hotmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953 Fixes: `646621c66d` ("glsl: make loop unrolling more like the nir unrolling path") (cherry picked from commit `ee1c69fadd`)	2019-06-25 11:46:19 +00:00
Nataraj Deshpande	9171d2f19e	anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform gralloc module will select a format based on the usage flags provided by the camera device and the other endpoint of the stream. The patch fixes crash in vulkan when the test is run with camera stream set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED. Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering on chromebook with camera HAL3. v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take AHARDWAREBUFFER_USAGE_CAMERA_MASK in to account (Gurchetan) Fixes: `f1654fa7e3` "anv/android: support creating images from external format" Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `d94fca5420`)	2019-06-25 11:44:52 +00:00
Eric Anholt	e9660d3c3f	freedreno: Fix up end range of unaligned UBO loads. We need the constants uploaded to cover the NIR offset plus the size, not the aligned-down start of our upload range plus the size. Fixes mistaken UBO analysis with mat3 loads. Fixes: `893425a607` ("freedreno/ir3: Push UBOs to constant file") Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `56842d33d5`)	2019-06-25 11:43:26 +00:00
Eric Anholt	0741463bb4	freedreno: Fix UBO load range detection on booleans. NIR 1-bit bool dests will have a bit size of 1, and thus a calculated "bytes" of 0. load_ubo is always loading from dwords in the source. Fixes: `893425a607` ("freedreno/ir3: Push UBOs to constant file") Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `5e7c96b95d`)	2019-06-25 11:41:33 +00:00
Juan A. Suarez Romero	d54dc24d6d	docs: add sha256 checksums for 19.1.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-25 12:56:10 +02:00
Juan A. Suarez Romero	22eddd8b9d	docs: add release notes for 19.1.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-25 12:43:49 +02:00
Juan A. Suarez Romero	118c300536	Update version to 19.1.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-25 10:29:55 +00:00
Eric Engestrom	ebd90fc7e0	util/os_file: resize buffer to what was actually needed Fixes: `316964709e` "util: add os_read_file() helper" Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `955c63d364`)	2019-06-21 07:39:06 +00:00
Kenneth Graunke	25a34df614	iris: Fix iris_flush_and_dirty_history to actually dirty history. When I split iris_flush_and_dirty_history into two helper functions, I accidentally made it stop dirtying. Which was...sort of the point. Fixes: `21688a306b` iris: Split iris_flush_and_dirty_for_history into two helpers. (cherry picked from commit `64fb20ed32`) [Juan A. Suarez: resolved trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/gallium/drivers/iris/iris_resource.c	2019-06-21 09:36:09 +02:00
Eric Engestrom	c36e4bd7fa	glx: fix glvnd pointer types Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110709 Fixes: `22a9e00aab` ("glx: Implement the libglvnd interface.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `65b016b146`)	2019-06-21 07:31:50 +00:00
Samuel Pitoiset	14d7fc09cc	radv: disable viewport clamping even if FS doesn't write Z This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `0a313cc285`)	2019-06-21 07:28:35 +00:00
Bas Nieuwenhuizen	927ca86698	meson: Allow building radeonsi with just the android platform. Just as was allowed by autotools. Fixes: `108d257a16` "meson: build libEGL" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `d1c04835ab`)	2019-06-20 08:40:39 +00:00
Bas Nieuwenhuizen	867223cee1	anv: Fix vulkan build in meson. Apparently the android part was never ported to meson. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `755c633b8d`)	2019-06-20 08:36:39 +00:00
Bas Nieuwenhuizen	a5154fa69c	radv: Fix vulkan build in meson. Apparently the android part was never ported to meson. CC: <mesa-stable@lists.freedesktop.org> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `4c300bd328`)	2019-06-20 08:21:01 +00:00
Samuel Pitoiset	3fdf2b9645	radv: fix FMASK expand with SRGB formats Found while working on DCC for MSAA. Fixes: `6b976024a8` ("radv: add support for FMASK expand") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `a7f75377ab`)	2019-06-19 07:26:04 +00:00
Mathias Fröhlich	15f6bb5c6c	egl: Don't add hardware device if there is no render node v2. Do not offer a hardware drm backed egl device if no render node is available. The current implementation will fail on this egl device. On top of that, it issues a warning that is actually misleading. There are more error paths that can fail on the way to a hardware backed egl device. Fixing all of them would kind of require opening the drm device and seeing if there is a usable driver associated with the device. The taken approach avoids a full probe and fixes at least this kind of problem on kvm virtualization hosts I observe here. Fixes: `dbb4457d98` ("egl: add EGL_EXT_device_drm support") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (cherry picked from commit `5743a36b2b`)	2019-06-19 07:24:37 +00:00
Dave Airlie	72eb587b97	nouveau: fix frees in unsupported IR error paths. This is pointless in that we won't ever hit those paths in real life, but coverity complains. Fixes: `f014ae3c7c` ("nouveau: add support for nir") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `93ba356544`)	2019-06-19 07:22:57 +00:00
Rob Clark	4de4c18841	freedreno/a6xx: un-swap X24S8_UINT The stencil is actually in the .w component, but we used to use SWAP to remap the channels. This doesn't work when tiled/ubwc. Fixes: dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d_array dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_cube dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d_array dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_cube dEQP-GLES31.functional.stencil_texturing.misc.base_level dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_pot dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_npot dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot dEQP-GLES31.functional.texture.border_clamp.sampler.uint_stencil Signed-off-by: Rob Clark <robdclark@chromium.org> (cherry picked from commit `4e72abcd97`)	2019-06-18 15:39:57 +00:00
Kenneth Graunke	47f1f4f9e5	glsl: Fix out of bounds read in shader_cache_read_program_metadata The VaryingNames array has NumVaryings entries. But BufferStride is a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with more than 4 varyings would read out of bounds. Also, BufferStride is set based on the shader itself, which means that it's inherently already included in the hash, and doesn't need to be included again. At the point when shader_cache_read_program_metadata is called, the linker hasn't even set those fields yet. So, just drop it entirely. Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test. Fixes: `6d830940f7` glsl/shader_cache: Allow shader cache usage with transform feedback Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `3c10a2726b`)	2019-06-18 09:55:20 +00:00
Jason Ekstrand	db4850c631	anv: Set STATE_BASE_ADDRESS upper bounds on gen7 This should fix floating-point border color on all gen7 HW. Integer is still thoroughly busted on gen7 because it doesn't exist on IVB and it's crazy on HSW. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `9672b7044c`)	2019-06-18 09:52:51 +00:00
Gert Wollny	1702733645	virgl: Assume sRGB write control for older guest kernels or virglrenderer hosts When the host virglrenderer is an older version that doesn't check the sRGB write control feature, or when the guest kernel doesn't support CAPS v2, then the guest will only report support for GL 2.1 on a GL 3.3 host, even though it was supporting 3.3 with earlier guest mesa versions. By also checking the host feature check version this regression can be avoided. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110921 Fixes: `2845939d6a` virgl: Set sRGB write control CAP based on host capabilities Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (cherry picked from commit `2b87753a84`)	2019-06-18 09:51:20 +00:00
Bas Nieuwenhuizen	6f18adff0a	radv: Decompress DCC when the image format is not allowed for buffers. Otherwise the buffer loads/stores in the bufimage meta operations fail. If we decompress DCC then we can use the "canonical" format compatible with the not-supported format. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `4107590911`)	2019-06-18 09:47:32 +00:00
Haihao Xiang	eb1e6e6412	i965: support UYVY for external import only It is similar to YUYV. Fixes: `165e704719` ("i965/i915: Add UYVY as the supported format") Signed-off-by: Haihao Xiang <haihao.xiang@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `8ead5bebdb`)	2019-06-17 07:42:52 +00:00
Lionel Landwerlin	0f8193cb18	intel/dump: fix segfault when the app hasn't accessed the device Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `f80679c8e8`)	2019-06-14 09:09:37 +00:00
Eduardo Lima Mitev	efc5518410	freedreno/a5xx: Fix indirect draw max_indices calculation The number of elements to draw should not be affected by the offset. A similar fix was submitted for a6xx at `79180a05`. Fixes these dEQP tests on a5xx: dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500 Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `3fb7b1fd35`)	2019-06-14 09:08:45 +00:00
Alejandro Piñeiro	80965709d0	v3d: fix checking twice auf flag Seems a C&P error, and should check for auf/muf. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110902 Fixes: `8f065596d2` "v3d: Add an optimization pass for redundant flags updates." Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `17c2c9cd67`)	2019-06-14 09:06:36 +00:00
Bas Nieuwenhuizen	746025fd63	radv: Skip transitions coming from external queue. Transitions to external queue should do the transition & make sure it works on all queues. Fixes: `8ebc7dcb59` "radv: Allow fast clears with concurrent queue mask for some layouts." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `0667c1f14b`)	2019-06-14 09:05:30 +00:00
Kevin Strasser	a48ef364e1	st/mesa: Add rgbx handling for fp formats Add missing cases for fp32 and fp16 formats. Fixes: `c68334ffc0` "st/mesa: add floating point formats in st_new_renderbuffer_fb()" Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `845ec8576a`)	2019-06-14 09:03:49 +00:00
Kevin Strasser	be69033241	gallium/winsys/kms: Fix dumb buffer bpp The bpp in the dumb buffer creation request is hardcoded to 32, which is an incorrect assumption as the caller is free to pick any pipe format. Use the bpp supplied to us through util_format_get_blocksizebits(). Fixes: `3b176c441b` "gallium: Add a dumb drm/kms winsys backed swrast provider" Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `ec0a68e50d`)	2019-06-14 09:02:14 +00:00
Eric Engestrom	582b691062	util/futex: fix dangling pointer use Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110901 Fixes: `7dc2f47882` "util: emulate futex on FreeBSD using umtx" Cc: Greg V <greg@unrelenting.technology> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `9996ddbb27`)	2019-06-14 08:58:54 +00:00
Samuel Pitoiset	7e0b89caa9	radv: fix VK_EXT_memory_budget if one heap isn't available When the visible VRAM size is equal to the VRAM size only two heaps are exposed. This fixes dEQP-VK.api.info.device.memory_budget. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `d378151246`)	2019-06-14 08:57:10 +00:00
Samuel Pitoiset	90291b5db1	radv: fix occlusion queries on VegaM The number of render backends is 16 but the enabled mask is 0xaaaa. As noticed by Bas, allowing disabled render backends might break the OCCLUSION_QUERY packet. We don't use it yet but keep this in mind. This fixes dEQP-VK.query_pool.* and dEQP-VK.multiview.*. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `2ef9d2738c`)	2019-06-14 08:55:38 +00:00
Lionel Landwerlin	94e2228496	anv: do not parse genxml data without INTEL_DEBUG=bat This significantly slows down the CTS runs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `32ffd90002` ("anv: add support for INTEL_DEBUG=bat") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `93b93e5a9d`)	2019-06-14 08:54:08 +00:00
Richard Thier	5eccd8fa5a	r300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added v1: Fix skipped slab allocators and the buffer cache. v2: Use only 1 domain for texture allocation v3: Added flag for the create_fence call too Based on Marek v1 and v2 proposed fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=1107812.patch Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `ffd2f948fe`)	2019-06-14 08:52:40 +00:00
Juan A. Suarez Romero	2a5b4e2b9f	docs: Add SHA256 sums for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-11 15:25:40 +00:00
Juan A. Suarez Romero	1517811f4f	docs: Add release notes for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-11 17:07:39 +02:00
Juan A. Suarez Romero	0d2ea312b7	Update version to 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-11 16:22:23 +02:00
Bas Nieuwenhuizen	49c17e845a	radv: Prevent out of bound shift on 32-bit builds. uintptr_t is 32-bits then and shifting it by 32 bits results in undefined behavior IIRC. Fixes: `b3c8de1c55` "radv: save all descriptor pointers into the trace BO" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `39c71e0025`)	2019-06-11 08:11:24 +00:00
Samuel Pitoiset	d058124201	radv: fix setting CB_SHADER_MASK for dual source blending CB_SHADER_MASK was computed without the second color buffer format which looks totally wrong to me. While we are at it, copy a comment from RadeonSI. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `e9316fdfd4`)	2019-06-11 08:05:33 +00:00
Emil Velikov	d4797ff15e	mapi: correctly handle the full offset table An earlier commit converted ES1 and ES2 to a new, much simpler, dispatch generator. At the same time, GL/glapi and the driver side are still using the old code. There is a hidden ABI between GL*.so and glapi.so, with the former referencing entry-points by offset in the _glapi_table. Hence the earlier commit added the full table of entry-points, alongside a marker for other cases like indirect GL(X) and driver-side remapping. Yet the patches did not handle things fully, thus it was possible to get different interpretations of the dispatch table after the marker. This commit fixes that, adding an indicative error message to catch future bugs. While here, correct the marker (MAX_OFFSETS) comment. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302 Fixes: `cf317bf093` ("mapi: add all _glapi_table entrypoints to static_data.py") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `a379b1c0ee`)	2019-06-11 08:01:57 +00:00
Emil Velikov	eb532d1ae7	mapi: add static_date offset to MaxShaderCompilerThreadsKHR As elaborated in the next patch, there is some hidden ABI that effectively require most entrypoints to be listed in the file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302 Cc: Marek Olšák <maraeo@gmail.com> Fixes: `c5c38e831e` ("mesa: implement ARB/KHR_parallel_shader_compile") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `61960547df`)	2019-06-11 07:59:38 +00:00
Samuel Pitoiset	a7a2d403fd	radv: fix alpha-to-coverage when there is unused color attachments When alphaToCoverage is enabled, we should always write the alpha channel of MRT0 if it's unused. This now matches RadeonSI. This fixes the new CTS: dEQP-VK.pipeline.multisample.alpha_to_coverage_unused_attachment.samples_*.alpha_invisible Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `91aa25f462`)	2019-06-11 07:58:06 +00:00
Kenneth Graunke	84bd361217	egl/x11: calloc dri2_surf so it's properly zeroed Commit `2282ec0a` refactored drawable creation across various platforms into a new dri2_create_drawable helper function. The GBM code in platform_drm.c code passed in dri2_surf->gbm_surf as the loaderPrivate, while most other backends passed in dri2_surf directly. To try and handle this, the patch checked if dri2_surf->gbm_surf was non-NULL, and if so, presumed that the caller is the DRM platform and we should use the dri2_surf->gbm_surf pointer. This worked for most platforms, which calloc their dri2_surf structure, zeroing the data. Unfortunately, platform_x11.c used malloc, leaving most of the dri2_surf as garbage. In particular, dri2_surf->gbm_surf was often non-NULL, causing dri2_create_drawable to try and use it, passing a garbage pointer to the createNewDrawable hook, usually leading to a SIGBUS or SIGSEGV when trying to dereference that bad pointer. Since most callers calloc the data, make platform_x11.c follow suit. Fixes crashes with i915_dri.so when running dEQP-GLES2. Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `4e3297f7d4`)	2019-06-09 16:52:59 +00:00
Eric Engestrom	c025240f6c	util/os_file: actually return the error read() gave us Fixes: `316964709e` "util: add os_read_file() helper" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `7e35f20d44`)	2019-06-09 16:51:36 +00:00
Rob Clark	3301eeee51	freedreno/a6xx: fix hangs with newer sqe fw With the newer (v1.76) fw, we were getting hangs (compared to older v1.66 fw). Re-work the GMEM code to structure things a bit closer to the blob. This moves some PKT7 packets from IB2 to IB1, which I think is what was confusing SQE and causing it to get stuck in an infinite loop. But in general structuring things at least closer to the same way blob does makes it easier to compare cmdstream. Note: this is a bit on the large side for what I'd normally consider for stable.. but right now it is looking like it is the newer fw that is headed for linux-firmware. This should defn have some soak time on master, but probably a good idea for this patch to end up in distro mesa builds by the time a630_sqe.fw hits linux-firmware. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (cherry picked from commit `958f6ffb60`)	2019-06-09 16:50:03 +00:00
Rob Clark	9f71165a1b	freedreno/a6xx: fix issues with gallium HUD In some cases the draw for the text wasn't working. This seems to be fixed by resyncing some of the "golden registers" from blob (initial values were based on somewhat older blob version). Perhaps good to have a bit of soak time on master, but would be good to eventually land in 19.x stable branches. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (cherry picked from commit `b820c09fa8`)	2019-06-09 16:49:12 +00:00
Nanley Chery	7ca66dc06b	anv/cmd_buffer: Initalize the clear color struct for CNL+ On CNL+, the clear color struct is composed of RGBA channel values and fields which are either reserved by the HW or used to control fast-clears. Currently anv initializes the channel values to zero and allows the other fields to be undefined. Satisfy the MBZ field requirements by removing an optimization that doesn't hold true for CNL+ and pulling in the number of dwords to initialize from ISL. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `b4198e792c`)	2019-06-09 16:47:13 +00:00
Charmaine Lee	6f44b7ebb0	svga: Remove unnecessary check for the pre flush bit for setting vertex buffers This fixes the missing rebind when the can_pre_flush bit is not set and the vertex buffers are the same as what have been sent. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Neha Bhende <bhenden@vmware.com> Signed-off-by: Charmaine Lee <charmainel@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> (cherry picked from commit `f29b8fde91`)	2019-06-09 16:46:05 +00:00
Deepak Rawat	28b72f5187	winsys/svga/drm: Fix 32-bit RPCI send message Depending on whether compiled with frame-pointer or not, the temporary memory location used for the bp parameter in these macros is referenced relative to the stack pointer or the frame pointer. Hence we can never reference that parameter when we've modified either the stack pointer or the frame pointer, because then the compiler would generate an incorrect stack reference. Fix this by pushing the temporary memory parameter on a known location on the stack before modifying the stack- and frame pointers. Also, in case of failure the RPCI channel was not closed, which led to vmx running out of channels. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Deepak Rawat <drawat@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> (cherry picked from commit `72fc886826`)	2019-06-09 16:44:49 +00:00
Nataraj Deshpande	147d6693be	anv: Fix check for isl_fmt in assert Checking isl_fmt returned value in assert seems appropriate instead of format variable. Fixes: `f1654fa7e3` "anv/android: support creating images from external format" Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> (cherry picked from commit `d6724471a5`)	2019-06-06 09:41:56 +00:00
Jason Ekstrand	1f40ef24cc	nir/propagate_invariant: Don't add NULL vars to the hash table Fixes: `8410cf66d` "nir/propagate_invariant: Skip unknown vars" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `d96878a66a`)	2019-06-06 09:37:29 +00:00
Lionel Landwerlin	90623adb16	intel/perf: improve dynamic loading config detection We're currently trying to detect dynamic loading config support by trying to remove the test config (hard coded in the i915 driver) and checking we get ENOENT. This can fail if the test config was updated in Mesa but not yet in i915. A better way to do this is to pick an invalid ID and check for ENOENT. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `c162127440`)	2019-06-06 09:34:23 +00:00
Lionel Landwerlin	971eeb93e6	intel/perf: fix EuThreadsCount value in performance equations EuThreadsCount is supposed to be the number of threads per EU, not the total number of threads in the whole device. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `1fc7b95127` ("i965: Add Gen8+ INTEL_performance_query support") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `0430c6d18a`)	2019-06-06 08:46:17 +00:00
Deepak Rawat	a5c864f6f8	winsys/drm: Fix out of scope variable usage In this particular instance, a struct member was used outside of the block where it was defined. Fix this by moving the definition outside of the block. Signed-off-by: Deepak Rawat <drawat@vmware.com> Fixes: `569f838987` ("winsys/svga: Add support for new surface ioctl, multisample pattern") Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `828e1b0b4c`)	2019-06-06 08:40:58 +00:00
Emil Velikov	626ea69627	egl/dri: flesh out and use dri2_create_drawable() Wrap the loader->createNewDrawable() dance into a helper and use it throughout the codebase. This addresses cases like surfaceless (SL) on swrast (SL on kms_swrast is fine) where we'd attempt using the wrong driver and crash out. v2: fixup quirky GBM (Mathias) v3: fixup GBM for real (Marek) Cc: mesa-stable@lists.freedesktop.org Cc: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (v2) Signed-off-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2282ec0ad6`)	2019-06-06 08:25:47 +00:00
Juan A. Suarez Romero	9d8f104f39	Update version to 19.1.0-rc5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-05 16:23:45 +00:00
Vinson Lee	2a45ddd42d	freedreno: Fix GCC build error. ../src/freedreno/vulkan/tu_device.c:900:4: error: initializer element is not constant .minImageTransferGranularity = (VkExtent3D) { 1, 1, 1 }, ^ Suggested-by: Kristian Høgsberg <krh@bitplanet.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110698 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `d4e70be739`)	2019-06-05 09:00:53 +00:00
Marek Olšák	96fbd54398	ac: fix a typo in ac_build_wg_scan_bottom Cc: 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `c9b64b58de`)	2019-06-05 08:29:08 +00:00
Rhys Perry	60688cc393	ac/nir: mark some texture intrinsics as convergent Otherwise LLVM can sink them and their texture coordinate calculations into divergent branches. v2: simplify the conditions on which the intrinsic is marked as convergent v3: only mark as convergent in FS and CS with derivative groups Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `73dda85512`)	2019-06-05 08:27:14 +00:00
Samuel Pitoiset	38927a35a6	radv: do not use gfx fast depth clears for layered depth/stencil images The driver should only do fast depth clears with the graphics path when the view covers all image layers, otherwise this might corrupt layers when HTILE is enabled. Cc: 19.0 19.1 mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `8a35eb0602`)	2019-06-04 15:06:46 +00:00
Sagar Ghuge	cf6472e780	intel/compiler: Fix assertions in brw_alu3 v2: Fix assertion for src1 (Ian Romanick) Fixes: `3b967e17` (intel/compiler: Avoid false positive assertions) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `3016756398`)	2019-06-04 15:06:46 +00:00
Pierre-Eric Pelloux-Prayer	5394f1578c	radeonsi: init sctx->dma_copy before using it Commit `a1378639ab` reordered context functions initializations but broke sctx->b.resource_copy_region init when using AMD_DEBUG=forcedma. In this case sctx->dma_copy was assigned a value after being used in: sctx->b.resource_copy_region = sctx->dma_copy; This commit moves the FORCE_DMA special case after sctx->dma_copy initialization. See https://bugs.freedesktop.org/show_bug.cgi?id=110422 Signed-off-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `4583f09caa`)	2019-06-04 15:06:46 +00:00
Timothy Arceri	51998d720b	st/glsl: make sure to propagate initialisers to driver storage This essentially reverts `20234cfe3a`. Fixes piglit test: tests/spec/arb_get_program_binary/execution/uniform-after-restore.shader_test Fixes: `20234cfe3a` "st/mesa: don't propagate uniforms when restoring from cache" Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110784 (cherry picked from commit `fea36a8f43`)	2019-06-04 15:06:46 +00:00
Axel Davy	8773e20238	d3dadapter9: Revert to old throttling limit value Recently PIPE_CAP_MAX_FRAMES_IN_FLIGHT was changed from 2 to 1: `20909284f2` No driver seems to overwrite the default value. One user reports severe regressions for some games. For now, revert to the value 2 for nine. Cc: "19.1" mesa-stable@lists.freedesktop.org Signed-off-by: Axel Davy <davyaxel0@gmail.com> (cherry picked from commit `5820ac6756`)	2019-06-04 15:06:46 +00:00
Marek Olšák	4524f09cc0	u_blitter: don't fail mipmap generation for depth formats containing stencil Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109754 Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (cherry picked from commit `4b11ed443b`)	2019-06-04 15:06:46 +00:00
Rob Clark	3fce389c8b	freedreno/a6xx: fix GPU crash on small render targets Fixes dEQP-GLES2.functional.multisampled_render_to_texture.readpixels Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `8eaa2d5021`)	2019-06-04 15:06:46 +00:00
Rob Clark	a37f10af7b	freedreno/ir3: set more barrier bits Blob is also setting the .l bit, and it seems to solve some intermittent failures with a couple of deqp's: dEQP-GLES31.functional.image_load_store.2d.qualifiers.coherent_r32i dEQP-GLES31.functional.image_load_store.2d.qualifiers.volatile_r32f Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `f9fa456e1d`)	2019-06-04 15:06:46 +00:00
Jonathan Marek	90d045f993	freedreno/ir3: fix input ncomp for vertex shaders ncomp is never set for vertex shaders, but a3xx and a4xx still use it. Fixes: `831f1a05c0` freedreno/ir3: rework varying packing Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> (cherry picked from commit `1db86d8b62`)	2019-06-03 08:20:25 +00:00
Bas Nieuwenhuizen	b2c5c16668	nir: Actually propagate progress in nir_opt_move_load_ubo. Found with Jason's new metadata rework (https://gitlab.freedesktop.org/mesa/mesa/merge_requests/950). Fixes: `af355aaa07` "nir: add nir_opt_move_load_ubo() optimization pass" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `e24a7840f6`)	2019-06-03 08:15:53 +00:00
Jan Zielinski	fecdcce09c	swr/rast: fix 32-bit compilation on Linux Removing unused but problematic code from simdlib header to fix compilation problem on 32-bit Linux. Reviewed-by: Alok Hota <alok.hota@intel.com> (cherry picked from commit `cf673747ce`)	2019-05-31 17:03:55 +02:00
Jason Ekstrand	a13bda4957	nir/dead_cf: Call instructions aren't dead When we inlined cf_node_has_side_effects into node_is_dead, all the conditions flipped and we forgot to flip one. Fortunately, it doesn't matter right now because no one uses this pass on shaders with more than one function. Fixes: `b50465d197` "nir/dead_cf: Inline cf_node_has_side_effects" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `8948048c6f`)	2019-05-31 08:15:31 +00:00
Jason Ekstrand	c2a945771c	intel/fs: Do a stalling MFENCE in endInvocationInterlock() Fixes: `939312702e` "i965: Add ARB_fragment_shader_interlock support" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `9e403dc56e`)	2019-05-31 08:13:44 +00:00
Jason Ekstrand	92f4a16af8	intel/fs,vec4: Use g0 as the header for MFENCE We set header_present but then pass it some random garbage. Give it g0 instead. I'm not actually sure this does anything but g0 is the usual header data and this is what the windows driver does so it seems like a good idea. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `859de4a748`)	2019-05-31 08:11:35 +00:00
Jason Ekstrand	a19270007c	iris: Don't assume UBO indices are constant It will be true for the constant/system value buffer because they use a constant zero but it's not true in general. If we ever got here when the source wasn't constant, nir_src_as_uint would assert. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `9dc57eebd5`)	2019-05-30 09:06:28 +00:00
Lionel Landwerlin	4c7dfaba9c	nir/lower_non_uniform: safely iterate over blocks This fixes a problem where the same instruction gets replaced twice. This was happening when the replaced instruction would be at the end of a block. Replacement of : if ssa_8 { .... intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */ } Would be : if ssa_8 { loop { vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) () vec1 1 ssa_48 = ieq ssa_47, ssa_44 if ssa_48 { loop { vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) () vec1 1 ssa_50 = ieq ssa_49, ssa_44 if ssa_50 { intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */ break } else { .... } Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bd5457641` ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `366811bedb`)	2019-05-30 09:01:40 +00:00
Samuel Pitoiset	411114c45c	radv: allocate more space in the CS when emitting events If the driver waits for CP DMA to be idle and emit an EOP event we need more space. This fixes a crash with Quake Champions. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `47a10edefb`)	2019-05-30 09:00:31 +00:00
Juan A. Suarez Romero	dd9635c1d2	Update version to 19.1.0-rc4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-05-29 16:44:45 +02:00
Timothy Arceri	0dcba748f9	Revert "st/mesa: expose 0 shader binary formats for compat profiles for Qt" This reverts commit `55376cb31e`. It's been over a year and both QT 5.9.5 and 5.11.0 contained a fix for the original issue. It seems i965 only ever applied this workaround to the 18.0 branch. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `11e16ca7ce`)	2019-05-28 07:13:40 +00:00
Lionel Landwerlin	fe7c45b97e	anv: fix apply_pipeline_layout pass for arrays of YCbCr descriptors When using the binding tables to access arrays of YCbCr descriptors we did not consider the offset of the accessed element. We can't do a simple multiple because the binding table entries are tightly packed. For example element 0 of the array could use 2 entries/planes and element 1 could use 2 entries/planes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bb8768b9d` ("anv: toggle on support for VK_EXT_ycbcr_image_arrays") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `2042f22e28`)	2019-05-28 07:12:43 +00:00
Chenglei Ren	16eac8f754	anv/android: fix missing dependencies issue during parallel build The libmesa_anv_gen* modules require anv_extensions.h, patch makes sure it gets generated as a dependency before building them. Signed-off-by: Chenglei Ren <chenglei.ren@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `13b38ca1e4`)	2019-05-28 07:11:10 +00:00
Qiang Yu	4b3c805b88	lima: fix render to non-zero level texture Current implementation won't respect level of surface to render. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> (cherry picked from commit `54490b0b36`)	2019-05-28 07:10:04 +00:00
Qiang Yu	87ac0bd86a	lima: fix lima_blit with non-zero level source resource lima_blit will do blit between resources with different levels. When blit from a level!=0 source, it will sample from that level of resource as texture. Current texture setup won't respect level when not mipmap filter. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> (cherry picked from commit `1dc593e9b9`)	2019-05-28 07:09:05 +00:00
Dave Airlie	74c5367612	Revert "mesa: unreference current winsys buffers when unbinding winsys buffers" This reverts commit `12bf7cfecf`. This commit caused lots of problems: https://bugs.freedesktop.org/show_bug.cgi?id=110721 https://bugs.freedesktop.org/show_bug.cgi?id=110761 Fixes: `12bf7cfecf` ("mesa: unreference current winsys buffers when unbinding winsys buffers") Pushing without review as we need to get it into next stable. (cherry picked from commit `7fe5a8e874`)	2019-05-27 08:31:05 +00:00
Christian Gmeiner	95ffe6323e	etnaviv: use the correct uniform dirty bits Found during code inspection. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> (cherry picked from commit `78fb5594be`)	2019-05-27 08:28:37 +00:00
Danylo Piliaiev	03fd344776	anv: Do not emulate texture swizzle for INPUT_ATTACHMENT, STORAGE_IMAGE If descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT, the imageView member of each element of pImageInfo must have been created with the identity swizzle. Fixes: `d2aa65eb` Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `c82dcf89ae`)	2019-05-27 08:26:47 +00:00
Lionel Landwerlin	9037cf26bb	vulkan: fix build dependency issue with generated files On machines with many cores, you can run into that issue : ../mesa-9999/src/vulkan/overlay-layer/overlay.cpp:42:10: fatal error: vk_enum_to_str.h: No such file or directory v2: Move declare_dependency around (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Jan Ziak Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `cb7c9b2a93`)	2019-05-23 08:57:26 +00:00
Greg V	b02c6e8ee7	gallium: enable dmabuf on BSD as well The DRM_CONF_SHARE_FD code did not check for Linux, so the commit that introduced PIPE_CAP_DMABUF broke Wayland-EGL clients on FreeBSD. Fixes: `8ae50e60` (gallium: replace DRM_CONF_SHARE_FD with PIPE_CAP_DMABUF) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `506ebf55c0`)	2019-05-23 08:56:14 +00:00
Philipp Zabel	e13c13f54c	etnaviv: fill missing offset in etna_resource_get_handle Without this gbm_bo_get_offset() can return 0 where it shouldn't. Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `1ccb8a071b`)	2019-05-23 08:53:19 +00:00
Marek Olšák	60d524fd39	radeonsi: fix a regression in si_rebind_buffer Don't update non-buffer images. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110701 Fixes: `78e35df52a` "radeonsi: update buffer descriptors in all contexts after buffer invalidation" Cc: 19.1 <mesa-stable@lists.freedesktop.org> Tested-By: Gert Wollny <gert.wollny@collabora.com> (cherry picked from commit `d6053bf2a1`)	2019-05-23 08:51:16 +00:00
Lionel Landwerlin	ce2d68aace	vulkan/overlay: fix timestamp query emission with no pipeline stats The if (!pipe && timestamp) logic was broken. It should have been : if (!pipe && !timestamp) Let just drop this condition as the following code does the right thing for all cases. An error was appearing with the following variables : VK_INSTANCE_LAYERS=VK_LAYER_MESA_overlay VK_LAYER_MESA_OVERLAY_CONFIG=gpu_timing Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ea7a6fa980` ("vulkan/overlay: add pipeline statistic & timestamps support") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `213d6527d4`)	2019-05-23 08:50:11 +00:00
Marek Olšák	c1d83ae9fb	radeonsi: update buffer descriptors in all contexts after buffer invalidation Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824 Cc: 19.1 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `78e35df52a`) [Juan: resolve trivial conflicts] [Juan: remove the commit from the ignored cherry-pick] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/gallium/drivers/radeonsi/si_state_draw.c	2019-05-23 08:48:21 +00:00
Juan A. Suarez Romero	1dd62eb6e2	Update version to 19.1.0-rc3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-05-21 14:09:14 +00:00
Caio Marcelo de Oliveira Filho	ab75e1e289	nir: Fix clone of nir_variable state slots When num_state_slots is 0, don't create the array. This was triggering the following assert when running vkcube with NIR_TEST_CLONE=1 vkcube: ../src/compiler/nir/nir_split_per_member_structs.c:66: split_variable: Assertion `var->state_slots == NULL' failed. Fixes: `9fbd390dd4` "nir: Add support for cloning shaders" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `005cc9ae37`)	2019-05-21 09:04:42 +00:00
Charmaine Lee	2153c3ae8e	mesa: unreference current winsys buffers when unbinding winsys buffers This fixes surface leak when no winsys buffers are bound. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `12bf7cfecf`)	2019-05-21 09:02:06 +00:00
Charmaine Lee	04e9d7bf8f	st/mesa: purge framebuffers with current context after unbinding winsys buffers With commit `c89e8470e5`, framebuffers are purged after unbinding context, but this change also introduces a heap corruption when running Rhino application on VMware svga device. Instead of purging the framebuffers after the context is unbound, this patch first unbinds the winsys buffers, then purges the framebuffers with the current context, and then finally unbinds the context. This fixes heap corruption. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `b480adfa5e`)	2019-05-21 08:59:28 +00:00
Juan A. Suarez Romero	857210b0dd	cherry-ignore: radeonsi: update buffer descriptors in all contexts after buffer invalidation stable: this commit causes issues in several systems. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-05-21 08:54:23 +00:00
Eric Engestrom	6bac1a041d	meson: expose glapi through osmesa Suggested-by: Pierre Guillou <pierre.guillou@lip6.fr> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109659 Fixes: `f121a669c7` "meson: build gallium based osmesa" Fixes: `cbbd5bb889` "meson: build classic osmesa" Cc: Brian Paul <brianp@vmware.com> Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (cherry picked from commit `ccb8ea7acf`)	2019-05-21 08:42:32 +00:00
Jason Ekstrand	2040f10cb0	anv: Only consider minSampleShading when sampleShadingEnable is set From the Vulkan 1.1.107 spec: Sample shading is enabled for a graphics pipeline: - If the interface of the fragment shader entry point of the graphics pipeline includes an input variable decorated with SampleId or SamplePosition. In this case minSampleShadingFactor takes the value 1.0. - Else if the sampleShadingEnable member of the VkPipelineMultisampleStateCreateInfo structure specified when creating the graphics pipeline is set to VK_TRUE. In this case minSampleShadingFactor takes the value of VkPipelineMultisampleStateCreateInfo::minSampleShading. Otherwise, sample shading is considered disabled. In other words, if sampleShadingEnable is set to VK_FALSE, we should ignore minSampleShading. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `1c92358bd8`)	2019-05-21 08:42:32 +00:00
Jason Ekstrand	260f517d54	anv: Stop forcing bindless for images This was an unintended artifact of my testing of bindless images. We should be choosing bindless or not dynamically. Fixes: `c0d9926df7` "anv: Use bindless handles for images" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `8413fd136c`)	2019-05-21 08:42:32 +00:00
Neha Bhende	b6778c9f52	draw: fix memory leak introduced by `7720ce32a` We need to free memory allocation PrimitiveOffsets in draw_gs_destroy(). This fixes memory leak found while running piglit on windows. Fixes: `7720ce32a` ("draw: add support to tgsi paths for geometry streams. (v2)") Tested with piglit Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `926a6a35cf`)	2019-05-21 08:42:32 +00:00
Jason Ekstrand	5d05324e65	anv: Emulate texture swizzle in the shader when needed Now that we have the descriptor buffer mechanism, emulated texture swizzle can be implemented in a very non-invasive way. Previous attempts all tried to extend the push constant based image param mechanism which was gross. This could, in theory, be done much faster with a magic back-end instruction which does indirect MOVs but Vulkan on IVB is already so slow this isn't going to matter much. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104355 Cc: "19.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `d2aa65eb18`)	2019-05-21 08:42:32 +00:00
Samuel Pitoiset	8dbdeb27f3	radv: add a workaround for Monster Hunter World and LLVM 7&8 The load/store optimizer pass doesn't handle WaW hazards correctly and this is the root cause of the reflection issue with Monster Hunter World. AFAIK, it's the only game that is affected by this issue. This is fixed with LLVM r361008, but we need a workaround for older LLVM versions unfortunately. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `d7501834cd`)	2019-05-21 08:42:32 +00:00
Gert Wollny	dab3945ff3	Revert "softpipe/buffer: load only as many components as the the buffer resource type provides" This reverts commit `865b9ddae4`. The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only one component would be supported. The original issue is still relevant, but the fix should be different. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `0f598ed7b3`)	2019-05-21 08:42:32 +00:00
Dave Airlie	d08fde8e7a	glsl: init packed in more constructors. src/compiler/glsl_types.cpp:577: uninit_member: Non-static class member "packed" is not initialized in this constructor nor in any functions that it calls. from Coverity. Fixes: `659f333b3a` (glsl: add packed for struct types) Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `b2d4d08a5c`)	2019-05-21 08:42:32 +00:00
Nanley Chery	f69eb770cd	anv: Fix some depth buffer sampling cases on ICL+ Don't attempt sampling with HiZ if the sampler lacks support for it. On ICL, the HW docs state that sampling with HiZ is not supported and that instances of AUX_HIZ in the RENDER_SURFACE_STATE object will be interpreted as AUX_NONE. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `629806b55b`)	2019-05-21 08:42:32 +00:00
Caio Marcelo de Oliveira Filho	5bed00cf0f	nir: Fix nir_opt_idiv_const when negatives are involved First, allow the case for negative powers of two. Then ensure that we use the absolute value of the non-constant value to calculate the quotient -- this was hinted in the code by the name 'uq'. This fixes an issue when 'd' is positive and 'n' is negative. The ishr will propagate the negative sign and we'll use nir_ineg() again, incorrectly. v2: First version used only ishr, but that isn't sufficient, since it never can produce a zero as a result. (Jason) Allow negative powers of two. (Caio) Fixes: `74492ebad9` "nir: Add a pass for lowering integer division by constants" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `8a995f2b5e`)	2019-05-21 08:42:32 +00:00
Marek Olšák	b551be82a7	radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets This is a prerequisite for the next commit. Cc: 19.1 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `0f1b070bad`)	2019-05-17 07:41:15 +00:00
Eric Engestrom	7fa89fd959	util/os_file: always use the 'grow' mechanism Use fstat() only to pre-allocate a big enough buffer. This fixes a race where if the file grows between fstat() and read() we would be missing the end of the file, and if the file slims down read() would just fail. Fixes: `316964709e` "util: add os_read_file() helper" Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `22c1657d05`)	2019-05-16 17:20:13 +00:00
Lionel Landwerlin	5fcfcdb162	nir: lower_non_uniform_access: iterate over instructions safely This pass moves instructions around and adds control-flow in the middle of blocks. We need to use nir_foreach_instr_safe to ensure that we iterate over instructions correctly anyway. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bd5457641` ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `e04cf0b612`)	2019-05-16 17:18:34 +00:00
Lionel Landwerlin	d70d8b2ffa	vulkan/overlay: fix truncating error on 32bit platforms Non dispatchable handles can be uint64_t. When compiling the layer on a 32bit platform, this will lead to casting uint64_t into (void *) which is 32bit, leading to incorrect handles being mapped internally in the layer. v2: Use more HKEY() (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Józef Kucia <joseph.kucia@gmail.com> Fixes: `2d2927938f` ("vulkan/overlay-layer: fix cast errors") Reviewed-by: Józef Kucia <joseph.kucia@gmail.com> (cherry picked from commit `877b371cbb`) [Juan: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/vulkan/overlay-layer/overlay.cpp	2019-05-16 09:40:47 +02:00
Lionel Landwerlin	558a067d17	vulkan/overlay-layer: fix cast errors Not quite sure what version of GCC/Clang produces errors (8.3.0 locally was fine). v2: also fix an integer literal issue (Karol) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `2d2927938f`)	2019-05-16 07:36:57 +00:00
Lionel Landwerlin	51354d2bf5	nir: fix lower_non_uniform_access pass Obviously missing the instruction insertion into the SSA list. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bd5457641` ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `391a836e8f`)	2019-05-16 07:34:09 +00:00
Ian Romanick	06bf5428cf	Revert "nir: add late opt to turn inot/b2f combos back to bcsel" This reverts commit `7acc865226`. With these optimizations in place, the extra constant folding added in the next commit extends some live ranges of 0.0 and ±1.0 constants, and that causes several hundred shaders to have more spills and fills. I believe this optimization we made basically irrelevant by `7725d60938` "intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))". All Gen7.5+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17225303 -> 17224634 (<.01%) instructions in affected programs: 879402 -> 878733 (-0.08%) helped: 679 HURT: 1 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.03% max: 0.93% x̄: 0.24% x̃: 0.05% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45% 95% mean confidence interval for instructions value: -1.02 -0.95 95% mean confidence interval for instructions %-change: -0.26% -0.22% Instructions are helped. total cycles in shared programs: 360842595 -> 360828542 (<.01%) cycles in affected programs: 110443594 -> 110429541 (-0.01%) helped: 389 HURT: 265 helped stats (abs) min: 1 max: 7525 x̄: 162.81 x̃: 28 helped stats (rel) min: <.01% max: 18.66% x̄: 1.11% x̃: 0.11% HURT stats (abs) min: 1 max: 7614 x̄: 185.96 x̃: 48 HURT stats (rel) min: <.01% max: 25.08% x̄: 0.95% x̃: 0.10% 95% mean confidence interval for cycles value: -75.65 32.67 95% mean confidence interval for cycles %-change: -0.49% -0.06% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 12159 -> 12161 (0.02%) spills in affected programs: 13 -> 15 (15.38%) helped: 0 HURT: 1 total fills in shared programs: 25207 -> 25208 (<.01%) fills in affected programs: 25 -> 26 (4.00%) helped: 0 HURT: 1 Ivy Bridge total instructions in shared programs: 12082019 -> 12082013 (<.01%) instructions in affected programs: 1033 -> 1027 (-0.58%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.41% max: 0.83% x̄: 0.61% x̃: 0.59% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.78% -0.45% Instructions are helped. total cycles in shared programs: 179849270 -> 179849157 (<.01%) cycles in affected programs: 4735 -> 4622 (-2.39%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 74 x̄: 28.25 x̃: 18 helped stats (rel) min: 0.13% max: 6.53% x̄: 2.85% x̃: 2.36% 95% mean confidence interval for cycles value: -82.73 26.23 95% mean confidence interval for cycles %-change: -7.98% 2.28% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10882750 -> 10882748 (<.01%) instructions in affected programs: 266 -> 264 (-0.75%) helped: 2 HURT: 0 Iron Lake total cycles in shared programs: 188609440 -> 188609448 (<.01%) cycles in affected programs: 4320 -> 4328 (0.19%) helped: 0 HURT: 2 GM45 total cycles in shared programs: 129016868 -> 129016872 (<.01%) cycles in affected programs: 2302 -> 2306 (0.17%) helped: 0 HURT: 1 Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `d2a9ba03e3`) [Juan: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/compiler/nir/nir_opt_algebraic.py	2019-05-15 10:36:12 +02:00
Jason Ekstrand	75ea0eeed1	intel/fs/ra: Stop adding RA interference to too many SENDS nodes We only have one node per VGRF so this was adding way too much interference. No idea how we didn't catch this before. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355543197 (0.02%) cycles in affected programs: 2472492 -> 2547639 (3.04%) helped: 17 HURT: 20 Fixes: `014edff0d2` "intel/fs: Add interference between SENDS sources" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `096ad8a809`)	2019-05-15 08:28:06 +00:00
Jason Ekstrand	8cf49e1662	intel/fs/ra: Only add dest interference to sources that exist Fixes: `83dedb6354` "i965: Add src/dst interference for certain" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `88cac12230`)	2019-05-15 08:26:52 +00:00
Juan A. Suarez Romero	c03d9a7fa9	Update version to 19.1.0-rc2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-05-14 15:36:06 +02:00
Gert Wollny	9b51dcf1e2	softpipe/buffer: load only as many components as the the buffer resource type provides Otherwise we risk to read past the end of the buffer. In addition, change the loop counters to unsigned to be consistent with the types. Fixes: `afa8707ba9` softpipe: add SSBO/shader atomics support. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `865b9ddae4`)	2019-05-14 08:41:50 +00:00
Bas Nieuwenhuizen	914ac06e32	radv: Do not use extra descriptor space for the 3rd plane. While ImageFormatProperties returns the number of internal descriptors, it turns out that applications do not need to actually allocate more descriptors in the descriptor pool. So if we make descriptors with more planes larger we have to be conservative and always allocate space for the larger descriptors which is a waste given the low usage of this ext. So let us make use of the fact that 3plane formats all have the same formats & dimensions for the last two planes. This way we only need the first half of the descriptor of the 3rd plane and can share the second half of the second plane. This allows us to use 16 bytes for the descriptor which nicely fits into the 16 bytes that are unused right next to the sampler. Fixes: `5564c38212` "radv: Update descriptor sets for multiple planes." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `f53ebfb450`)	2019-05-13 10:47:26 +00:00
Józef Kucia	e2654c2379	radv: clear vertex bindings while resetting command buffer Only vertex inputs accessed by vertex shader must have valid buffers bound. Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `5010436e09` "radv: bail out when binding the same vertex buffers" (cherry picked from commit `24af0f1318`)	2019-05-13 10:45:10 +00:00
Marek Olšák	bb845df961	st/mesa: fix 2 crashes in st_tgsi_lower_yuv src/mesa/state_tracker/st_tgsi_lower_yuv.c:68: void reg_dst(struct tgsi_full_dst_register , const struct tgsi_full_dst_register , unsigned int): assertion "dst->Register.WriteMask" failed The second crash was due to insufficient allocated size for TGSI instructions. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `83435e748f`)	2019-05-13 10:44:06 +00:00
Kenneth Graunke	f7c0ca6d38	iris: Use full ways for L3 cache setup on Icelake. Anuj fixed this in i965 and anv, but the fix never landed in iris. Fixes tessellation corruption on Icelake. Thanks to Rafael for bisecting this and tracking it down. Fixes: `d0996d5fab` iris: Emit default L3 config for the render pipeline Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (cherry picked from commit `72ccefb529`)	2019-05-13 10:41:16 +00:00
Caio Marcelo de Oliveira Filho	38fdfdaff1	anv: Fix limits when VK_EXT_descriptor_indexing is used Update various limits in VkPhysicalDeviceDescriptorIndexingPropertiesEXT that were previously zero to their values from VkPhysicalDeviceLimits. When using VK_EXT_descriptor_indexing, the former limits will apply to all the descriptor layout sets -- not only those using the new feature bits. For the reference, VK_EXT_descriptor_indexing says "There are new descriptor set layout and descriptor pool creation flags that are required to opt in to the update-after-bind functionality, and there are separate maxPerStage* and maxDescriptorSet* limits that apply to these descriptor set layouts which may be much higher than the pre-existing limits. The old limits only count descriptors in non-updateAfterBind descriptor set layouts, and the new limits count descriptors in all descriptor set layouts in the pipeline layout." Fixes: `6e230d7607` "anv: Implement VK_EXT_descriptor_indexing" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `3610081daa`)	2019-05-13 10:40:03 +00:00
Lionel Landwerlin	87722e0c42	vulkan/overlay: keep allocating draw data until it can be reused The original implementation assumed that we could allocate the same amount of command buffers as the number of images in the swapchain. But the application could potentially render much faster and rerender into images that have been submitted for presentation but not yet presented. This change keeps on allocating command buffers, vertex buffer, vertex indices as well as a semaphore and a fence for as long as we can't reuse a previously submitted one. This fixes rendering issues in the overlay at high frame rates. v2: Don't recreate semaphores constantly (Józef) v3: Drop useless surface & FreeCommandBuffers (Józef) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655 Cc: 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Józef Kucia <joseph.kucia@gmail.com> (cherry picked from commit `ad2b4aa378`) [Juan: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Conflicts: src/vulkan/overlay-layer/overlay.cpp	2019-05-13 12:37:08 +02:00
Kenneth Graunke	f0e147bd47	i965: Fix memory leaks in brw_upload_cs_work_groups_surface(). This was taking a reference to the 64kB upload buffer and never returning it, leaking a reference each time this atom triggered. This leaked lots of 64kB upload BOs, eventually running us out of VMA space. This would usually happen when using mpv to watch a movie, after 20-40 minutes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134 Fixes: `63d7b33f51` i965/cs: Setup surface binding for gl_NumWorkGroups Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `3f60810de0`)	2019-05-13 10:31:35 +00:00
Eric Engestrom	1fc65774e9	travis: fix syntax, and drop unused stuff Fixes: `a988d95389` "ci: Delete autotools build jobs" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `6e5728e5c9`)	2019-05-13 10:30:30 +00:00
Leo Liu	349153f097	winsys/amdgpu: add VCN JPEG to no user fence group There is no user fence for JPEG, the bug triggering kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT) Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `ceba9ff294`)	2019-05-10 17:06:21 +00:00
Tomeu Vizoso	f8ec40e28b	panfrost: Only take the fast paths on buffers aligned to block size As the functions operate on 16-byte blocks. Fixes this Valgrind error: Invalid read of size 4 at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85) by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171) by 0x584F587: panfrost_tile_texture (pan_resource.c:489) by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525) by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516) by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515) by 0x5875F13: u_default_texture_subdata (u_transfer.c:80) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd at 0x483F5C8: malloc (vg_replace_malloc.c:299) by 0x584F47D: panfrost_transfer_map (pan_resource.c:467) by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243) by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2) Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: 19.1 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c3538ab570`)	2019-05-10 17:04:58 +00:00
Tomeu Vizoso	5e75803339	panfrost: Fix two uninitialized accesses in compiler Valgrind was complaining of those. NIR_PASS only sets progress to TRUE if there was progress. nir_const_load_to_arr() only sets as many constants as the instruction has components. This was causing some dEQP tests to flip-flop, such as: dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fixes: `14531d676b` ("nir: make nir_const_value scalar") (cherry picked from commit `554975bafa`)	2019-05-10 17:02:37 +00:00
Rob Clark	f1ab22209e	freedreno/ir3: fix rasterflat/glxgears Ofc legacy gl features that are broken don't trigger fails in deqp. I should remember to test glxgears more often. Fixes: `7ff6705b8d` freedreno/ir3: convert to "new style" frag inputs Signed-off-by: Rob Clark <robdclark@chromium.org> (cherry picked from commit `9faf218b8c`)	2019-05-10 17:00:35 +00:00
Lionel Landwerlin	e0c082d6eb	anv: Use corresponding type from the vector allocation We didn't notice this issue much because the 2 struct share a similar layout, expect for the additional fields... We run into that issue in Anv : ==15236== Invalid write of size 8 ==15236== at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211) ==15236== by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264) ==15236== by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312) ==15236== by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167) ==15236== by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190) ==15236== by 0x8CF60871: alloc_surface_state (anv_image.c:1122) ==15236== by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519) ==15236== by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358) ==15236== Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd ==15236== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==15236== by 0x8D2578E6: u_vector_init (u_vector.c:47) ==15236== by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168) ==15236== by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921) ==15236== by 0x8CF56517: anv_CreateDevice (anv_device.c:1909) ==15236== by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073) ==15236== by 0x8DD2CB3D: ??? 
(in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so) ==15236== by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so) ==15236== by 0x8BCB35C6: loader_create_device_chain (loader.c:5449) ==15236== by 0x8BCBC230: vkCreateDevice (trampoline.c:838) v2: Rename mmap_cleanups to avoid confusion (Caio) v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `f2f6ac1c08`)	2019-05-10 16:58:53 +00:00
Samuel Pitoiset	a97f44ac1f	radv: fix setting the number of rectangles when it's dynamic We need to know the number of rectangles. This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*. Fixes: `5db0bf9994` ("radv: Implement VK_EXT_discard_rectangles.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `53dfff1c4d`)	2019-05-09 10:44:18 +00:00
Dave Airlie	5d7d13d227	kmsro: add _dri.so to two of the kmsro drivers. Fixes: `8cfc17bdda` (kmsro: Add the rest of the current set of tinydrm drivers.) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `0a42d5b98b`)	2019-05-09 10:43:03 +00:00
Dylan Baker	4a7b0cc5e4	meson: Force the use of config-tool for llvm meson git now has a cmake find method for llvm, but it lacks a couple of features that we use from the config tool version. Until that reaches parity we need to use the config-tool version. CC: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `0d59459432`)	2019-05-09 10:40:22 +00:00
Lionel Landwerlin	d95797de61	anv: fix use after free Once mem->bo is removed from the cache, it is likely to be freed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b80930a6fe` ("anv: add support for VK_EXT_memory_budget") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `43596e5f34`)	2019-05-09 10:39:19 +00:00
Lionel Landwerlin	424b60dc70	anv: rework queries writes to ensure ordering memory writes We use a mix of MI & PIPE_CONTROL commands to write our queries' data (results & availability). Those commands' memory write order is not guaranteed with regard to their order in the command stream, unless CS stalls are inserted between them. This is problematic for 2 reasons : 1. We copy results from the device using MI commands even though the values are generated from PIPE_CONTROL, meaning we could copy unlanded values into the results and then copy the availability that is inconsistent with the values. 2. We allow the user to poll on the availability values of the query pool from the CPU. If the availability lands in memory before the values then we could return invalid values. This change does 2 things to address this problem : - We use either PIPE_CONTROL or MI commands to write both queries values and availability, so that the ordering of the memory writes guarantees that if availability is visible, results are also visible. - For the occlusion & timestamp queries we apply a CS stall before copying the results on the device, to ensure copying with MI commands see the correct values of previous PIPE_CONTROL writes of availability (required by the Vulkan spec). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `a07d06f103`)	2019-05-09 10:38:10 +00:00
Timothy Arceri	9d610c1cc3	Revert "glx: Fix synthetic error generation in __glXSendError" This reverts commit `e91ee763c3`. This seems to have broken a number of wine games. Let's revert everything for now and try again later. Acked-by: Adam Jackson <ajax@redhat.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590 (cherry picked from commit `a01b393c39`)	2019-05-08 10:32:39 +00:00
Kenneth Graunke	f770e81ba7	i965: leave the top 4Gb of the high heap VMA unused This ports commit `9e7b0988d6` from anv to i965. Thanks to Lionel for noticing that it was missing! Fixes: `01058a5522` i965: Add virtual memory allocator infrastructure to brw_bufmgr. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `d568fcd0a0`)	2019-05-08 10:28:39 +00:00
Kenneth Graunke	faa7daa55e	i965: Force VMA alignment to be a multiple of the page size. This should happen regardless, but let's be paranoid. Fixes: `01058a5522` i965: Add virtual memory allocator infrastructure to brw_bufmgr. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `17210c63a9`)	2019-05-08 10:27:20 +00:00
Kenneth Graunke	fd27561c9d	i965: Fix BRW_MEMZONE_LOW_4G heap size. The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages, and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB. So we can't use the top page. Fixes: `01058a5522` i965: Add virtual memory allocator infrastructure to brw_bufmgr. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `15f134c628`)	2019-05-08 12:26:08 +02:00
Juan A. Suarez Romero	5d72a334e8	Update version to 19.1.0-rc1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-05-07 16:10:40 +00:00
Timothy Arceri	825ca9e42e	radeonsi: add config entry for Counter-Strike Global Offensive This fixes rendering issues with gun scopes which is rather important. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100239 (cherry picked from commit `49025292fb`)	2019-05-07 10:47:39 +00:00
Erik Faye-Lund	67f2be0fbf	draw: flush when setting stream-out targets We need to re-prepare the middle-end state to pick up changes to this state to react correctly to pausing/resuming stream-out. So let's add a flush here. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `ec8cbd79ac` "draw/softpipe: EXT_transform_feedback support (v2)" Reviewed-by: Roland Scheidegger <sroland@vmware.com> (cherry picked from commit `d84b85bc28`)	2019-05-07 10:46:02 +00:00
John Stultz	05faf6eb56	mesa: Makefile.sources: Add nir_lower_fb_read.c to Makefile.sources list In commit `a99c360a46` (nir: add pass to lower fb reads), a new file was added that needs to also be added to the Makefile.sources list used by the Android and SCons build system. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `a99c360a46` ("nir: add pass to lower fb reads") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org> (cherry picked from commit `c7f2145b4b`)	2019-05-06 17:04:24 +02:00
John Stultz	3495bdca13	mesa: Makefile.sources: Add ir3_nir_lower_load_barycentric_at_sample/offset to Makefile.sources In commit `2f0b9d2249` ("freedreno/ir3: lower load_barycentric_at_offset") a new file was added that needs to also be added to the Makefile.sources list used by Android and SCons build system. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `2f0b9d2249` ("freedreno/ir3: lower load_barycentric_at_offset") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org> (cherry picked from commit `d04f44a459`)	2019-05-06 17:03:09 +02:00
John Stultz	f93e1f92c4	mesa: android: freedreno: Fix build failure due to path change The ir3_nir_trig.py file was moved in a previous commit, `aa0fed10d3` (freedreno: move ir3 to common location), so update the Android.gen.mk file to match. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org> (cherry picked from commit `c935862127`)	2019-05-06 17:02:18 +02:00
Amit Pundir	8c0b80e08a	mesa: android: freedreno: build libfreedreno_{drm,ir3} static libs Add libfreedreno_drm/ir3 to the build Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `b4476138d5` ("freedreno: move drm to common location") Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> [jstultz: Tweaked to add extra ir3 files from master] Signed-off-by: John Stultz <john.stultz@linaro.org> (cherry picked from commit `88105375c9`)	2019-05-06 17:00:59 +02:00
Bas Nieuwenhuizen	070d763d5d	radv: Implement cosited_even sampling. Apparently cosited_even was the required one instead of midpoint. This adds slight offset of 0.5 pixels to the coordinates (+ we need the image size to convert to normalized coords) Fixes: `91702374d5` "radv: Add ycbcr lowering pass." Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `5692351264`)	2019-05-06 16:59:58 +02:00
Bas Nieuwenhuizen	ed0d4eaa4c	radv: Disable subsampled formats. Broken on Polaris and since I discovered NV12 is not subsampled, but a 2-plane format I decided I don't really care. Work to do to re-enable: 1) Figure out which devices support it natively. 2) Write some software emulation for the others. Fixes: `52c1adda21` "radv: Add ycbcr format features." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `5cbe12ad1b`)	2019-05-06 16:59:00 +02:00
Timothy Arceri	6e52daa18c	util/drirc: add workarounds for bugs in Doom 3: BFG This makes the game playable on radeonsi. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110143 (cherry picked from commit `1af72fa4d6`)	2019-05-06 16:48:57 +02:00
Rob Clark	bdd273d873	freedreno: remove unused forward struct declaration Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 13:59:56 -07:00
Alyssa Rosenzweig	6823873246	panfrost/midgard: iabs cannot run on mul Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:51 +00:00
Alyssa Rosenzweig	cdd9189aad	panfrost/midgard: Lower mixed csel (NIR) Basically, when the conditions of a csel diverge, we scalarize to avoid going into weird code paths during emit. We could be doing better, but this case can't occur organically from GLSL as far as I can tell, though it does fix lowered atan2. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:51 +00:00
Alyssa Rosenzweig	58a1e1f86c	panfrost/midgard: Fix RA when temp_count = 0 A previous commit by Tomeu aborted RA early, which solves the memory corruption issue, but then generates an incorrect compile. This fixes that. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:51 +00:00
Alyssa Rosenzweig	3d7874c699	panfrost/midgard: Fix integer selection Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:51 +00:00
Alyssa Rosenzweig	31f5a43bf0	panfrost: Support RGB565 FBOs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	f8c7ffa07a	panfrost/midgard/disasm: Handle dest_override generalized Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	b6b534c733	panfrost/midgard/disasm: Stub out 64-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	8c36ecd4b1	panfrost/midgard/disasm: Print 8-bit sources This handles the usual case. 8-bit register access parallels 16-bit access, but with one major caveat: in 8-bit mode, only half of the register file is actually (directly) accessible as sources. In particular, for each 16-bit integer register (hrN), we can only index a single 8-bit integer (qrN), corresponding to the lower 8-bits. To get the upper 8-bits, it is required to do an explicit shift. For example, to add the bytes of a 16-bit integer hr0.x and get the result as an 8-bit qr0, you'd need to do something like: ilsr hr1.x, hr0.x, #8 iadd qr0.x, qr0.x, qr1.x This scheme diverges from 32-bit registers, in that both the upper and lower halves of a 32-bit register are individually accessible as a pair of half registers. For contrast, to add the lower and upper 16-bits of a 32-bit integer r0.x, you can just: iadd hr0.x, hr0.x, hr1.x Since hr1.x = upper 16-bit of r0.x. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	2800e822a4	panfrost/midgard/disasm: Support 8-bit destination Meanwhile, we're forced to disable dest_override, since it's not yet clear how this interacts with other bitnesses (it'll likely need to be overhauled in any case). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	d42c37e494	panfrost/midgard: Rename ilzcnt8 -> iclz Per OpenCL. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	9559280fc3	panfrost/midgard: Fix crash on unknown op Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	96eed4e04b	panfrost/midgard/disasm: Fill in .int mod Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	7469df70c8	panfrost/midgard/disasm: Extend print_reg to 8-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	055f6def30	panfrost/midgard/disasm: Catch mask errors We silently ignored certain bits of the mask, which causes issues when disassembling 8/64-bit ops. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Alyssa Rosenzweig	576a27fd55	panfrost/midgard: reg_mode_full -> reg_mode_32, etc In preparation for 8-bit and 64-bit operands, let's not reinforce the 32-bit-centric biases in the ISA. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-04 19:08:50 +00:00
Rob Clark	2da36dd0b6	freedreno/a6xx: deduplicate a few lines Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	555ca49d2b	freedreno: add ubwc_enabled helper Since it is dependent on the tile mode (ie. disabled for smaller mipmap levels), we should handle it a similar way to fd_resource_level_linear(). The code previously mostly did the right thing because the old helper took the tile mode. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	62c0b02717	freedreno: move UBWC color offset to fd_resource_offset() Best to keep it encapsulated in the helper which returns layer/level offset (and actually use that helper everywhere) rather than spreading the logic around the code. Also add a helper to find UBWC offset, to complete the encapsulation. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	a871b5ffaa	freedreno/a6xx: buffer resources cannot be compressed Small cleanup. They are just an array of data and only ever linear/ uncompressed. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	05f5122d4a	freedreno: mark imported resources as valid If someone is importing a buffer, we can't really know the state of its contents, so assume it is valid. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	11583dc655	freedreno/a6xx: UBWC support for images There are still some fallbacks we'll need to handle before we can enable UBWC by default. I think we may need to fall back to uncompressed if image atomic operations are used. And we still need to sort out how to handle image and sampler views of compressed resources if the image/ sampler view is using a format that does not support compression. (I think the latter should hopefully be uncommon outside of deqp/piglit.) But at least this gets us to the point where supertuxkart works properly with UBWC enabled ;-) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	857d9f3b02	freedreno/a6xx: UBWC fixes A few fixes that get UBWC working for the games/benchmarks where I noticed problems before (in particular and manhattan, and stk (modulo image support for UBWC when compute shaders are used for post-process effects)): + fix the size of the UBWC meta buffer (ie, the offset to color pixel data) that is returned by ->fill_ubwc_buffer_sizes() + correct size/layout for 8 and 16 byte per pixel formats + limit the supported formats.. Not all formats that can be tiled can be compressed. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	6ffb58726b	freedreno: update generated headers Corrects tex state ubwc pitch/size Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	fb1488a800	freedreno/a6xx: OUT_RELOC vs OUT_RELOCW fixes Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Rob Clark	8c97b3c546	freedreno/ir3: remove assert Fixes dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 and .20 `ca3eb5db66` went from silently truncating the constant state, which was also the wrong thing to do, to an assert. Which then showed up in a couple of dEQPs. Actually there is nothing wrong with larger constant file so just drop the assert. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-04 11:50:44 -07:00
Karol Herbst	7f85283103	spirv/cl: support vload/vstore Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-04 12:27:51 +02:00
Karol Herbst	d11b807da5	nir: Add nir_op_vec helper with that we can simplify code where nir vectors are created v2: merge both lines in nir_vec Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-04 12:27:51 +02:00
Karol Herbst	681fb7ea05	nir: Add a nir_builder_alu variant which takes an array of components v2: rename to nir_build_alu_src_arr Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-04 12:27:51 +02:00
Karol Herbst	c91ea6343f	vtn: handle bitcast with pointer src/dest v2: use vtn_push_ssa and vtn_ssa_value Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-04 12:27:51 +02:00
Mathias Fröhlich	c989661985	mesa: Leave aliasing of vertex and generic0 attribute to the dlist code. Now that dlist compilation again knows if it is inside glBegin/glEnd, we can leave the decision if aliasing should occur to the vertex attribute setter functions instead of doing that at glArrayElement time. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:35 +02:00
Mathias Fröhlich	c869387d8a	mesa: Correct the is_vertex_position decision for dlists. We have to use _mesa_inside_dlist_begin_end instead of _mesa_inside_begin_end to see if we are inside a glBegin/glEnd block in case of display lists. So split the is_vertex_position function used in vertex attribute processing into an imm and a dlist variant and use the appropriate _mesa_inside_begin_end variant. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:35 +02:00
Mathias Fröhlich	5ad54217ff	mesa: Set CurrentSavePrimitive in vbo_save_NotifyBegin. That seems to have been lost somewhere. It is needed for correct outside begin/end detection in display list compilation, and for the correct aliasing in dlists reestablished in the next changes. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:35 +02:00
Mathias Fröhlich	0ed7603d97	mesa: Remove the _glapi_table argument from _mesa_array_element. The value is now unused. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:35 +02:00
Mathias Fröhlich	3b6f32907f	mesa: Constify static const array in api_arrayelt.c Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:35 +02:00
Mathias Fröhlich	68aaf0a4e3	mesa: Remove the now unused _NEW_ARRAY state change flag. Is no longer used, so we have less occasions where NewState is non zero. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:35 +02:00
Mathias Fröhlich	7af047c373	mesa: Rip out now unused gl_context::aelt_context. Now this part of gl_context state is unused and can be removed. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:35 +02:00
Mathias Fröhlich	b9de48581a	mesa: Implement _mesa_array_element by walking enabled arrays. In glArrayElement, use the bitmask trick to just walk the enabled vao arrays. This should be about equivalent in execution time to walking the prepared aelt_context list. Finally this will allow us to reduce the _mesa_update_state calls in a few patches. v2: Add comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:40:19 +02:00
Mathias Fröhlich	7a5dea6320	mesa: Use glVertexAttribNV functions for fixed function attribs. In the glArrayElement implementation, use glVertexAttribNV type functions for fixed function attributes. We do the same in display list execution when the list is replayed using immediate mode attribute functions. Using a single set of function pointers enables a unified loop to walk the vertex array attributes. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:39:42 +02:00
Mathias Fröhlich	60076a6171	mesa: Factor out index function that will have multiple use. For access to glArrayElement methods factor out a function to get the table lookup index for normalized/integer/double access. The function will be used in the next patch at least twice. v2: Use vertex_format_to_index instead of NORM_IDX. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-04 07:39:18 +02:00
Jason Ekstrand	91899495a1	nir: Add a SSA type gathering pass This new pass (which isn't even compile-tested) attempts to determine the ALU type of all the SSA values in a function impl. It takes a greedy approach and assigns intness or floatness to everything it thinks can possibly contain an int or a float. Some values will be labeled as both int and float and some will be labeled as neither and it is up to the caller to decide what to do with this information. However, for a "nice" shader where the original source contained no bit-casts and no implicit bit-casts were introduced by optimizations, there shouldn't be any overlap in the two sets save for the odd CSEd zero constant. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-04 03:52:05 +00:00
Kenneth Graunke	694d1a08d3	iris: Delete bucketing allocators These add a lot of complexity, and I currently can't measure any performance benefit from having them. In the past, I seem to recall seeing a benefit in drawoverhead scores, but currently it looks like dropping them is either a wash or 1-2% faster. Drop them to simplify allocations.	2019-05-03 19:50:26 -07:00
Kenneth Graunke	bd4b18d255	iris: Force VMA alignment to be a multiple of the page size. This should happen regardless, but let's be paranoid.	2019-05-03 19:48:37 -07:00
Kenneth Graunke	068a700195	iris: leave the top 4Gb of the high heap VMA unused This ports commit `9e7b0988d6` from anv to iris. Thanks to Lionel for noticing that it was missing!	2019-05-03 19:48:37 -07:00
Kenneth Graunke	21062e21d9	iris: Fix 4GB memory zone heap sizes. The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages, and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB. So we can't use the top page.	2019-05-03 19:48:37 -07:00
Julien Isorce	8cd71f399e	st/va: check resource_get_info nullity in vlVaDeriveImage This pipe_screen function is not implemented by all backends. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-05-03 16:11:55 -07:00
Jason Ekstrand	30fa15e36b	anv,i965: Stop warning about incomplete gen11 support Both drivers are feature-complete and should be running more-or-less at perf at this point. Drop the warning. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-05-03 22:57:35 +00:00
Connor Abbott	d0ea9877b8	nir/algebraic: Don't emit empty initializers for MSVC Just don't emit the transform array at all if there are no transforms v2: - Don't use len(array) > 0 (Dylan) - Keep using ARRAY_SIZE to make the generated C code easier to read (Jason).	2019-05-04 00:13:21 +02:00
Kenneth Graunke	8987152ac1	iris: Resolve textures used by the program, not merely bound textures st/mesa's PBO upload path binds a vertex shader that doesn't use any textures, but leaves the existing sampler views bound in place. This was tricking us into thinking the PBO destination might be bound for texturing in some cases. In Civilization VI, this fixes a false self- dependency issue that was preventing CCS_E compression on upload. Fixing this slightly improves frame times.	2019-05-03 13:03:22 -07:00
Dylan Baker	c613861b23	meson: Don't build glsl cache_test when shader cache is disabled v2: - Use new with_shader_cache variable instead of host_machine.system() == 'windows' Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:58:31 -07:00
Dylan Baker	a216aea7af	tests/vma: fix build with MSVC Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:58:27 -07:00
Dylan Baker	5eb0f33e4f	glsl/tests: define ssize_t on windows Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:58:24 -07:00
Dylan Baker	76338933e9	util/tests: Use define instead of VLA To allow this test to be built with MSVC, which doesn't support VLAs. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:58:17 -07:00
Dylan Baker	ff9bf223c2	meson: make nm binary optional This makes nm not required, but used if found. In general I imagine that this means that on windows nm won't be found, and on other platforms it will. v2: - fix gbm and egl symbols check tests to only be run if nm is found - reword commit message to reflect the code change Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:58:05 -07:00
Dylan Baker	f5eafc2dc6	meson: Make shader-cache a trillean instead of boolean So that it can be implicitly disabled on windows, where it doesn't compile. v2: - Use an auto-option rather than automagic. - fix shader_cache check (== -> !=) v4: - Use new with_shader_cache instead of get_option('shader-cache') elsewhere in the meson build Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:57:36 -07:00
Dylan Baker	ddc15fba2b	meson: switch gles1 and gles2 to auto options This allows them to default to false on windows, but default to true elsewhere. As a side effect turning off shared-glapi now automatically turns off gles. Shared glapi remains a boolean defaulting to true. v5: - new in this version Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:57:19 -07:00
Dylan Baker	113bb8d448	glsl: fix general_ir_test with mingw Somewhere down in the depths of the mingw headers 'interface' is defined, change it to iface like a similar patch did. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:57:17 -07:00
Dylan Baker	f1d5f2aff3	meson: always define libglapi This allows the identifier to be used even if shared-glapi isn't built, which simplifies a bunch of things. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-03 10:57:10 -07:00
Chuck Atkins	a381dbf253	meson: Fix missing glproto dependency for gallium-glx Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Cc: mesa-stable <mesa-stable@lists.freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-03 13:36:25 -04:00
Samuel Pitoiset	4f18c43d1d	radv: apply the indexing workaround for atomic buffer operations on GFX9 Because the new raw/struct intrinsics are buggy with LLVM 8 (they weren't marked as source of divergence), we fall back to the old intrinsics for atomic buffer operations only. This means we need to apply the indexing workaround for GFX9. The load/store operations still use the new LLVM 8 intrinsics. The fact that we need another workaround is painful but we should be able to clean that up a bit once LLVM 7 support is dropped. This fixes a GPU hang with AC Odyssey and some rendering problems with Nioh. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573 Fixes: `31164cf5f7` ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-03 17:59:12 +02:00
Alyssa Ross	e340d7beef	get_reviewer.pl: improve portability Not all package managers / users will install perl into /usr/bin, but /usr/bin/env /should/ always be present. Using /usr/bin/env means that we can't give the -w argument to Perl, so I added `use warnings' in the script. Reviewed-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-03 14:32:44 +01:00
Lionel Landwerlin	80dc78407d	anv: fix crash when application does not provide push constants Found while running Talos Principle. As far as I can tell running a draw call with a pipeline having push constants without the application having called vkCmdPushConstants gives undefined push constant values. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2019-05-03 10:21:40 +01:00
Samuel Pitoiset	e68d7bec67	radv: fix radv_get_aspect_format() for D+S formats This restores the previous behaviour before YCBCR landed. For D+S formats, it returns the depth format. This fixes an assertion with Thrones of Britannia. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110540 Fixes: `66507cc656` ("radv: Add single plane image views & meta operations") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-03 09:01:10 +02:00
Caio Marcelo de Oliveira Filho	aa675cef5e	intel/fs: Assert when brw_fs_nir sees a nir_deref_instr Since `09f1de97a7` "anv,i965: Lower away image derefs in the driver" the backend compiler is not expected to handle any derefs, so let's assert on it. This helps identifying problems when a deref is not lowered and "leaks" into the backend compiler. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-02 23:25:30 -07:00
Julien Isorce	a77512635e	r600: implement resource_get_info Factoring code with resource_get_handle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-03 05:54:28 +00:00
Dave Airlie	512a31a412	util/bitset: fix bitset range mask calculations. The MASK macro is used in the RANGE macro, and it should return the pre-bitset word mask for the (b) value. i.e. BITSET_MASK(0) should be undefined since it's meaningless. BITSET_MASK(31) should give 0x7fffffff BITSET_MASK(32) should give 0xffffffff BITSET_MASK(33) should give 0x00000001 BITSET_MASK(64) should give 0xffffffff However then BITSET_RANGE ends up broken for cases where its (b) value is the 0,32,64 value as in that case the lower mask would be 0 not 0xffffffff. This fixes the unit tests that I've added, and my code that uses bitsets. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `bb38cadb1c` "More GLSL code" Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-03 15:23:04 +10:00
Dave Airlie	18973a450e	util/tests: add basic unit tests for bitset The last test here currently fails as there is a bug in bitset.h Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-03 15:23:04 +10:00
Dave Airlie	6fd6246d92	nir: fix lower vars to ssa for larger vector sizes. This has a couple of hardcoded vec4 limits in it, change them to the proper sizing to avoid future issues. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-03 15:23:00 +10:00
Dave Airlie	2774d39366	spirv: fix SpvOpBitSize return value. The spir-v spec says this returns a bool. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-03 15:22:57 +10:00
Kenneth Graunke	5ff5d0a895	iris: Disable dual source blending when shader doesn't handle it This is a port of Danylo's `eca4a6548d` which fixed the hang on i965. It fixes GPU hangs in his new Piglit test, arb_blend_func_extended-dual-src-blending-discard-without-src1. I avoided my own review feedback here, and decided to simply adjust 3DSTATE_PS_BLEND rather than BLEND_STATE_ENTRY[0]. It has never been clear to me which the hardware uses in every case. However, whacking the enable in 3DSTATE_PS_BLEND seems to be sufficient to fix the hang, and that packet is already dynamic, so it's easy to handle. I'd rather avoid making BLEND_STATE_ENTRY[0] dynamic unless I have to.	2019-05-02 21:14:49 -07:00
Jason Ekstrand	be7e9870d6	anv: Stop including POS in FS input limits It is an input but it comes in as part of the shader payload and doesn't count towards the limits. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-02 18:56:51 -05:00
Rob Clark	b73dd91f60	nir: fix nir tex print harder Fixes: `691d5a825a` nir: rework tex instruction printing Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-02 15:06:01 -07:00
Erik Faye-Lund	96924aa92e	docs: fixup mistake in contents During a rebase, it seems I accidentally broke the contents-menu, leading to a duplicate link to freedesktop.org. This was obviously not intended. Let's fix this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `7eee13c467` ("docs: use dl/dd instead of blockquote for freedesktop link") Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-05-02 23:23:15 +02:00
Erico Nunes	568e8fc736	lima/ppir: support nir_op_ftrunc Support nir_op_ftrunc by turning it into a mov with a round to integer output modifier. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-05-02 20:55:56 +00:00
Eric Engestrom	1291c68c9c	gitlab-ci: merge meson-glvnd into meson-swr There's no need to have a whole build just for that flag, we can add it to any build. v2: Add a note about why we put glvnd where we did (by anholt). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2) Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-02 20:39:12 +00:00
Eric Engestrom	043b54a35d	gitlab-ci: simplify meson job names Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-02 20:39:12 +00:00
Eric Engestrom	43f1546420	gitlab-ci: meson-gallium-radeonsi was a subset of meson-gallium-clover-llvm Let's just drop it. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-02 20:39:12 +00:00
Eric Engestrom	41407c602c	gitlab-ci: merge several meson jobs Merge the following into `meson-main`/`meson-loader-classic-dri`/ `meson-gallium-swr`: - meson-vulkan - meson-gallium-drivers-other - meson-gallium-st-other Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> [ Michel Dänzer ] * Rebase and fix up commit log. * Don't set VULKAN_DRIVERS in meson-loader-classic-dri. * Remove extraneous whitespace. * Squash in follow-up fixes. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> [ anholt] * Add a note why nine and swrast landed where they did. * Switch from s/meson-vulkan/meson-main/ to s/meson-loader-classic-dri/meson-main/ which I think was the original intent Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Eric Engestrom <eric.engestrom@intel.com> (anholt changes) Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-02 20:39:12 +00:00
Heinrich	9b80322532	gbm: Improve documentation of BO import - Add GBM_BO_IMPORT_FD_MODIFIER to documentation of supported foreign object types - Add newline before documentation block - Improve language Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-05-02 20:36:38 +00:00
Samuel Pitoiset	62001f3dff	radv: only need to force emit the TCS regs on Vega10 and Raven1 Other GFX9 chips aren't affected. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 22:29:01 +02:00
Marek Olšák	b3a26d4628	glsl: fix and clean up NV_compute_shader_derivatives support - make sure compute shader derivatives are exposed for all extensions - unify duplicated code Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-02 16:09:24 -04:00
Marek Olšák	20909284f2	st/dri: decrease input lag by syncing sooner in SwapBuffers It's done by: - decrease the number of frames in flight by 1 - flush before throttling in SwapBuffers (instead of wait-then-flush, do flush-then-wait) The improvement is apparent with Unigine Heaven. Previously: draw frame 2 wait frame 0 flush frame 2 present frame 2 The input lag is 2 frames. Now: draw frame 2 flush frame 2 wait frame 1 present frame 2 The input lag is 1 frame. Flushing is done before waiting, because otherwise the device would be idle after waiting. Nine is affected because it also uses the pipe cap.	2019-05-02 16:09:24 -04:00
Erik Faye-Lund	d30ce03bc0	meson: add build-summary This roughly mirrors what we get from autotools. There are a few differences, though: 1. The "exec_prefix" output has been dropped. Meson doesn't support this, so it makes no sense here. 2. The "llvm-config" output has been dropped. Meson abstracts dependency discovery a bit more than our autotools build-system does, so it's not easy to get this information as-is. 3. HUD extra stats, SWR archs, Shared/Static libs and CFLAGS / CXXFLAGS / LDFLAGS have been dropped. These can be inspected by "meson configure". 4. How we set defines works quite differently in our Meson build-system, and the result isn't quite the same. In particular, the DEFINES output has been dropped, to avoid having to refactor the code too much. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109326 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-02 18:30:29 +00:00
Erik Faye-Lund	2127403439	meson: give dri- and gallium-drivers separate vars Variables are cheap, and there's little reason for the dri and gallium drivers to work on the same variable for the driver list. So let's split these in two separate lists instead. This makes it easier to inspect these after-the fact, for instance for generating a summary of build-settings. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-02 18:30:29 +00:00
Erik Faye-Lund	28f18915b8	meson: lift driver-collection out into parent build-file This way we can mark the dri_drivers and dri_link arrays as temporary, as all knowledge about them is contained in a single build-file with a clearly visible, limited life-span. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-02 18:30:29 +00:00
Rob Clark	c14b13d0ff	docs: mark KHR_blend_equation_advanced done on a6xx Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-02 11:19:22 -07:00
Rob Clark	8c77e669a8	freedreno/a6xx: smaller hammer for fb barrier We just need to do a sequence of commands to flush the cache. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Rob Clark	6fa8a6d60f	freedreno/a6xx: KHR_blend_equation_advanced support Wire up support to sample from the fb (and force GMEM rendering when we have fb reads). The existing GLSL IR lowering for blend_equation_advanced does the rest. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Rob Clark	650246523b	freedreno/ir3: fb read support Lower load_output to txf_ms_fb and add support for the new texture fetch instruction. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Rob Clark	0704ddb2e5	freedreno/drm: expose GMEM_BASE address Needed for sampling from tile buffer (GMEM). Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Rob Clark	a99c360a46	nir: add pass to lower fb reads Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Rob Clark	a2c89a85f4	nir: fix lower_wpos_ytransform in load_frag_coord case Apparently we never hit this path. Or at least haven't for a rather long time. But in either case (load_deref or load_frag_coord), we can just directly use the intrinsic's ssa dest. So stop passing the nir_variable (which would be NULL in the load_frag_coord case) around and instead just use &intr->dest.ssa. (This ofc means we need to setup the cursor to insert after the instruction, which seems to be another bug of the original implementation.) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Rob Clark	691d5a825a	nir: rework tex instruction printing The extra comma at the end was annoying me. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Rob Clark	ca3eb5db66	freedreno/ir3: add some ubo range related asserts And a comment.. since we are mixing units of bytes/dwords/vec4, hopefully this will avoid some unit confusion. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-02 11:19:22 -07:00
Rob Clark	e941faf3e8	freedreno/ir3: add IR3_SHADER_DEBUG flag to disable ubo lowering It isn't quite as simple as not running the pass, since with packed varyings we get load_ubo for block==0 (ie. the "real" uniforms). So instead run the pass normally but decline to lower anything in block > 0 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-02 11:19:22 -07:00
Rob Clark	f697f61590	freedreno/ir3: fix lowered ubo region alignment Since we emit UBO regions INDIRECTly (ie. not copied into cmdstream but emit by EXT_SRC_ADDR) we need to keep them 4*vec4 aligned. Which the code already mostly did, except for aligning the first UBO region itself (ie. the one after block==0 which is the "real" uniforms). Fixes: `893425a607` freedreno/ir3: Push UBOs to constant file Fixes: `3c8779af32` freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-02 11:19:22 -07:00
Rob Clark	32925f4072	freedreno/ir3: fix shader variants vs UBO analysis Otherwise we zero out the state again, but all the UBO loads that we could lower are already lowered. End result is that we didn't emit the uniforms for lowered UBO access in any case where multiple shader variants are used. Fixes: `893425a607` freedreno/ir3: Push UBOs to constant file Fixes: `3c8779af32` freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-02 11:19:22 -07:00
Lionel Landwerlin	ff4168c418	vulkan/overlay: add TODO list Keen on having other people contribute. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 17:02:57 +01:00
Lionel Landwerlin	99cb2d325f	vulkan/overlay: make overridden functions static And fix the unused CmdDrawIndirect. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:57 +01:00
Lionel Landwerlin	f2afd6bd76	vulkan/overlay: make overlay size configurable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:55 +01:00
Lionel Landwerlin	7d908038ad	vulkan/overlay: add a frame counter option This is useful to normalize the numbers written into the output file as those numbers are accumulated over a period of time and number of frames. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:35 +01:00
Lionel Landwerlin	81fd6ba7cc	vulkan/overlay: record all select metrics into output file The output looks something like this (csv style) : fps, frame, frame_timing(us), submit, draw_indexed, pipeline_graphics, acquire_timing(us), vert_invocations, frag_invocations, gpu_timing(ns) 480.55, 242, 501512, 247, 1444, 1204, 714, 5827272, 113043296, 121424174 467.80, 234, 500214, 234, 1412, 1176, 648, 5635680, 109436188, 117743760 424.37, 213, 501923, 213, 2130, 1704, 623, 5132448, 99657292, 105474683 472.15, 237, 501962, 237, 2370, 1896, 667, 5710752, 110924644, 122226004 411.32, 206, 500826, 206, 2060, 1648, 709, 4963776, 96491764, 95333273 458.87, 230, 501228, 230, 2300, 1840, 634, 5542080, 107758204, 123112090 475.01, 238, 501044, 238, 2380, 1904, 631, 5734848, 111477480, 122087426 471.08, 236, 500972, 236, 2360, 1888, 655, 5686656, 110498496, 114816162 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:34 +01:00
Lionel Landwerlin	74a9fdd8a2	vulkan/overlay: add a margin to the size of the window Looks a bit better. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:07 +01:00
Lionel Landwerlin	7ba50d8040	vulkan/overlay: add no display option In case you're just interested in data being recorded to the output file. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:07 +01:00
Lionel Landwerlin	ea7a6fa980	vulkan/overlay: add pipeline statistic & timestamps support v2: switch to VkBase{In,Out}Structure v3: Add timestamps at begin/end of primary command buffers to estimate gpu time spent per submission (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)	2019-05-02 17:02:06 +01:00
Lionel Landwerlin	4438188f49	vulkan/overlay: record stats in command buffers and accumulate on exec/submit This significantly reworks how numbers displayed are computed. We accumulate operations written into command buffers and add those to the device when submitted to a queue. These collected values are then used to compute per frame overlay data. We also accumulate the data over the sampling fps period to produce numbers for that period of time. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:06 +01:00
Lionel Landwerlin	9eddceef44	vulkan/overlay: update help printout Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 17:02:06 +01:00
Lionel Landwerlin	a1e6b5e9be	vulkan/util: generate a helper function to return pNext struct sizes This will be used to copy chains of structures so that we can alter some of them. v2: Drop vk_util.h include (Eric) Use VkBaseInStructure directly (Eric) v3: Drop --platforms= param to generator script, instead produce a file with #ifdef based on what platforms are compiled. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 17:02:02 +01:00
Tomeu Vizoso	ad7c9ba0ec	panfrost/midgard: Skip liveness analysis for instructions without dest [Alyssa: Add comment explanation] Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-02 15:29:48 +00:00
Tomeu Vizoso	a5dddc2d42	panfrost/midgard: Skip register allocation if there's no work to do Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-02 15:29:41 +00:00
Eric Engestrom	7c15a87aea	gitlab-ci: add scons windows build using mingw Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 15:10:59 +00:00
Eric Engestrom	a34ee4dec7	egl: hard-code destroy function instead of passing it around as a pointer Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-02 14:44:16 +00:00
Connor Abbott	6ec4ed48fc	nir/search: Add debugging code to dump the pattern matched This was useful while debugging the previous commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-02 16:14:06 +02:00
Connor Abbott	7ce86e6938	nir/search: Add automaton-based pre-searching nir_opt_algebraic is currently one of the most expensive NIR passes, because of the many different patterns we've added over the years. Even though patterns are already sorted by opcode, there are still way too many patterns for common opcodes like bcsel and fadd, which means that many patterns are tried but only a few actually match. One way to fix this is to add a pre-pass over the code that scans it using an automaton constructed beforehand, similar to the automatons produced by lex and yacc for parsing source code. This automaton has to walk the SSA graph and recognize possible pattern matches. It turns out that the theory to do this is quite mature already, having been developed for instruction selection as well as other non-compiler things. I followed the presentation in the dissertation cited in the code, "Tree algorithms: Two Taxonomies and a Toolkit," trying to keep the naming similar. To create the automaton, we have to perform something like the classical NFA to DFA subset construction used by lex, but it turns out that actually computing the transition table for all possible states would be way too expensive, with the dissertation reporting times of almost half an hour for an example of size similar to nir_opt_algebraic. Instead, we adopt one of the "filter" approaches explained in the dissertation, which trade much faster table generation and table size for a few more table lookups per instruction at runtime. I chose the filter which resulted in the fastest table generation time, with medium table size. Right now, the table generation takes around .5 seconds, despite being implemented in pure Python, which I think is good enough. Based on the numbers in the dissertation, the other choice might make table compilation time 25x slower to get 4x smaller table size, but I don't think that's worth it. 
As of now, we get the following binary size before and after this patch: text data bss dec hex filename 11979455 464720 730864 13175039 c908ff before i965_dri.so text data bss dec hex filename 12037835 616244 791792 13445871 cd2aef after i965_dri.so There are a number of places where I've simplified the automaton by getting rid of details in the LHS patterns rather than complicate things to deal with them. For example, right now the automaton doesn't distinguish between constants with different values. This means that it isn't as precise as it could be, but the decrease in compile time is still worth it -- these are the compilation time numbers for a shader-db run with my (admittedly old) database on Intel skylake: Difference at 95.0% confidence -42.3485 +/- 1.375 -7.20383% +/- 0.229926% (Student's t, pooled s = 1.69843) We can always experiment with making it more precise later. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-02 16:14:06 +02:00
Samuel Pitoiset	08be23bfde	radv: set WD_SWITCH_ON_EOP=1 when drawing primitives from a stream output buffer According to RadeonSI, this seems to be required by the hardware to avoid GPU hangs. I think I just forgot to set that bit when I implemented VK_EXT_transform_feedback. This fixes a GPU hang with Space Engineers and DXVK. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110291 Fixes: `b4eb029062` ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 15:55:46 +02:00
Brian Paul	48107b5a2b	glsl: fix typo in #warning message Trivial. Spotted by Eric Engestrom.	2019-05-02 06:32:57 -06:00
Brian Paul	f0f7c3b03a	svga: add SVGA_NO_LOGGING env var (v2) valgrind crashes when we try to initialize host logging. This env var can be used to disable logging. v2: rebase onto "svga: move host logging to winsys". Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-02 06:09:35 -06:00
Charmaine Lee	9c5f407b0b	svga: move host logging to winsys This patch adds a host_log interface to svga_winsys and moves the host logging code to the winsys layer. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-02 06:09:35 -06:00
Eric Engestrom	da8d9e2d88	wsi/wayland: document lack of vkAcquireNextImageKHR timeout support Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:51:03 +00:00
Daniel Stone	9826e04eca	vulkan/wsi/wayland: Respect non-blocking AcquireNextImage If the client has requested that AcquireNextImage not block at all, with a timeout of 0, then don't make any blocking calls. This will still potentially block infinitely given a non-infinite timeout, but the fix for that is much more involved. Signed-off-by: Daniel Stone <daniels@collabora.com> Cc: mesa-stable@lists.freedesktop.org Cc: Chad Versace <chadversary@chromium.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108540 Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:51:03 +00:00
Erik Faye-Lund	8a67e4d30a	docs: reorder heading and notice All other pages have the heading as the first thing in the article. Let's clean this up for consistency. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	561c2b9bfa	docs: drop centered heading for faq The FAQ is the only article we have that uses a centered heading, which makes it look odd compared to the other articles. Let's drop the centering for consistency. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	da4994f252	docs: turn faq-index into an ordered list HTML already has a way of doing automatically ordered lists, so let's use that instead of open-coding one. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	afda72dc10	docs: replace empty list with a none-paragraph Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	a4ee15d5fe	docs: fix closing of list-items Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	b9eaeffaba	docs: fixup list-item tags The list items need to contain everything that is part of the item, not just the first paragraph. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	830821aaa4	docs: fix closing of paragraphs Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	02a5698017	docs: add missing lists Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	767c517816	docs: fixup bad paragraphing This markup seems to assume paragraphs survive across block-elements, which isn't the case. Let's rectify that. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	b877722d75	docs: remove stray list-start Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	c61e9aef76	docs: don't pointlessly close and re-start definition lists Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	0ea4ef2473	docs: fix incorrectly closed paragraph Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	d69c790c22	docs: drop paragraph around preformatted text Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	8ef86c9240	docs: start paragraph before closing it Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	41573d486f	docs: close paragraphs before preformatted text It's illegal to nest block-level elements such as <pre> inside <p> in HTML. This means that when the paragraphs get closed after a <pre>-tag, we end up closing a non-existent tag, so the browser inserts a dummy <p>-tag. This is entirely pointless, so let's just close these tags before the <pre>-tag instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:16 +00:00
Erik Faye-Lund	5630540a27	docs: remove stray paragraph-close This isn't matching any paragraph-open tags, so let's get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	3bda82b2e5	docs: close lists These lists never got closed. Let's fix that to avoid issues with bad parsers. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	92917e82e8	docs: close paragraphs before lists Paragraphs can't contain lists, and attempting to close them after the list just causes an extra, empty paragraph to be created. We don't want that, so let's close the paragraphs before the list instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	0c3bab7761	docs: open list-item before closing it A list-item must be opened before it can be closed. So let's replace this closing tag with an opening tag. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	7eee13c467	docs: use dl/dd instead of blockquote for freedesktop link The blockquote happens to match the indentation of the other lists for most browsers, but this isn't a guarantee. Let's instead use a definition-list, which is more strongly connected to a list, so it's more likely to have the same indentation. This also makes sure that we don't have similar padding on the right-hand side, in case we change the text-size. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	3f0568d7e5	docs: use h2 instead of b-tag for headings <b>-tags aren't allowed in the root of <body>, so let's replace these with <h2>-tags with some CSS to make them appear as bold. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	e81b6aa311	docs: remove stray paragraph-close This tag tries to close a non-existent paragraph. Let's get rid of it! Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	5b2a7062ff	docs: properly escape ampersand Even in preformatted blocks, ampersands should be escaped. Let's correct this, in case of strict parsers. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Erik Faye-Lund	13b990000f	docs: properly escape '>' The '>'-symbol should usually be escaped to avoid confusing strict parsers. While it's very unlikely to cause issues as-is, let's quote it for good measure. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:09:15 +00:00
Rhys Perry	13c423629e	radv: fix set_output_usage_mask() with composite and 64-bit types It previously used var->type instead of deref_instr->type and didn't handle 64-bit outputs. This fixes lots of transform feedback CTS tests involving transform feedback and geometry shaders (mostly dEQP-VK.transform_feedback.fuzz.random_geometry.*) v2: fix writemask widening when comp != 0 v3: fix 64-bit variables when comp != 0, again Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 10:24:20 +01:00
Erik Faye-Lund	8194d3887e	docs: do not hard-code header-height It's generally nicer to do this in terms of em units, as that scales better with text-sizes, if we ever decide to change them. The result is slightly larger than before, but only by a couple of pixels. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	5ffe4879b6	docs: simplify css-centering With "display: flex;" we can make this a bit more automatic, not requiring a bunch of values to be of specific values to get the right centering. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	130400b904	docs: use multiple background-images for header This is a bit tidier than to set a background on the h1-text, requiring it to be full height and all. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	cb0123e37a	docs: remove spurious newline Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	3eec974143	docs: avoid repeating the color The color attribute is inherited in CSS, so there's no point in repeating this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	86e38330d3	docs: avoid repeating the font The font attribute is inherited in CSS, so there's no point in repeating this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	755c118a4f	docs: add missing semicolon While it's legal to omit the last semicolon in a CSS block, it's generally not considered good style, as it makes it harder to add new lines. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	3085eb90a0	docs: remove long commented out css These attributes have been commented out since 2005; I don't think there's a big chance of them making a return as-is. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	b6321d2f67	docs: remove non-existent css attribute There's no CSS-attribute named "link", so let's remove it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Erik Faye-Lund	a2b0000d3c	docs: normalize css-indent style Tabs have been around as the indentation style of this file since it was created. Some newer CSS has added double-spaces, but let's keep it consistent. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 08:45:57 +00:00
Thomas Hellstrom	20b7839392	winsys/svga: Don't abort on EBUSY errors from execbuffer This error code typically indicated that a buffer object that was referenced by the command stream was being used for CPU access by another client. The correct action here is to retry after a while. Use usleep() until we have proper kernel support for this wait. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:51:15 +02:00
Thomas Hellstrom	c69557c4a2	winsys/svga: Update the drm interface file The file vmwgfx_drm.h was a bit outdated. Update to a recent version, including defines supporting coherent memory. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:51:07 +02:00
Thomas Hellstrom	978d66e4d5	svga: Avoid bouncing buffer data in malloced buffers Some constant- and texture upload buffer data may bounce in malloced buffers before being transferred to hardware buffers. In the case of texture upload buffers this seems to be an oversight. In the case of constant buffers, code comments indicate that we want to avoid mapping hardware buffers for reading when copying out of buffers that need modification before being passed to hardware. In this case we avoid data bouncing for upload manager buffers but make sure buffers that we read out from stay in malloced memory. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:51:00 +02:00
Thomas Hellstrom	5961189f4e	winsys/svga: Enable the transfer_from_buffer GPU command for vgpu10 We didn't have the path using this command enabled as typically we take an alternate path using DMA uploads. Enable it so that we can exercise that code-path by turning off the DMA path. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:50:52 +02:00
Thomas Hellstrom	50e58966fa	winsys/svga: Add an environment variable to force host-backed operation The vmwgfx kernel module has a compatibility mode for user-space that is not guest-backed resource aware. Add an environment variable to facilitate testing of this mode on guest-backed aware kernels: if the environment variable SVGA_FORCE_HOST_BACKED is defined, the driver will use host-backed operation. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:50:22 +02:00
Samuel Pitoiset	492e828848	ac: tidy up ac_build_llvm8_tbuffer_{load,store} For consistency with ac_build_llvm8_buffer_{load,store}_common helpers and that will help a bit for removing the vec3 restriction. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Samuel Pitoiset	6ac10e07c2	radv: implement a workaround for VK_EXT_conditional_rendering Per the Vulkan spec 1.1.107, the predicate is a 32-bit value. Though the AMD hardware treats it as a 64-bit value which means it might fail to discard. I don't know why this extension has been drafted like that but this definitely does not fit with AMD. The hardware doesn't seem to support a 32-bit value for the predicate, so we need to implement a workaround. This fixes an issue when DXVK enables conditional rendering with RADV, this also fixes the Sasha conditionalrender demo. Fixes: `e45ba51ea4` ("radv: add support for VK_EXT_conditional_rendering") Reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Samuel Pitoiset	e03e7c510f	radv: fix color conversions for normalized uint/sint formats The hardware actually rounds before conversion. This now matches what values are used when performing fast clears vs slow clears. This fixes a rendering issue with Far Cry 3&4. This also fixes a bunch of CTS tests that use a 8-bit UNORM format (only when the 512*512 image size hint is manually disabled). Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Samuel Pitoiset	6162543999	radv: do not need to force emit the TCS regs on Vega20 This chip doesn't need the fixup. This fixes a bunch of dEQP-VK.tessellation tests and avoids random GPU hangs. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Jason Ekstrand	bf774b56be	util/bitset: Return an actual bool from test macros I want to be able to do BITSET_TEST() != BITSET_TEST() and this isn't currently possible because BITSET_TEST() returns a random bit. Compare to zero to get an actual Boolean. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-02 03:12:54 +00:00
Brian Paul	413e55b5b9	glsl: work around MinGW 7.x compiler bug I'm not sure what triggered this, but building with scons platform=windows toolchain=crossmingw machine=x86 build=profile with MinGW g++ 7.3 or 7.4 causes an internal compiler error. We can work around it by forcing -O1 optimization. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-01 20:06:54 -06:00
Brian Paul	96540e4f0a	llvmpipe: init some vars to NULL to silence MinGW compiler warnings Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-01 20:06:54 -06:00
Marek Olšák	2d48a6959f	radeonsi: set sampler state and view functions for compute-only contexts	2019-05-01 21:16:13 -04:00
Marek Olšák	bfd3d50487	radeonsi: use new atomic LLVM helpers This depends on "ac,ac/nir: use a better sync scope for shared atomics"	2019-05-01 21:16:13 -04:00
Marek Olšák	181dcf0792	st/mesa: don't flush the front buffer if it's a pbuffer This is the best guess I can make here. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-01 21:15:33 -04:00
Marek Olšák	35294f2eca	mesa: fix pbuffers because internally they are front buffers This fixes the egl_ext_device_base piglit test, which uses EGL pbuffers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-01 21:15:33 -04:00
Marek Olšák	f753f913f5	mesa: rework error handling in glDrawBuffers It's needed by the next pbuffer fix, which changes the behavior of draw_buffer_enum_to_bitmask, so it can't be used to help with error checking. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-01 21:15:33 -04:00
Bas Nieuwenhuizen	0c99b5ace8	radv: Restrict YUVY formats to 1 layer. Fixes: `8bb3cec7c9` "radv: Expose VK_EXT_ycbcr_image_arrays." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 02:29:51 +02:00
Bas Nieuwenhuizen	aab201635e	radv: Set is_array in lowered ycbcr tex instructions. Fixes array tests. Fixes: `91702374d5` "radv: Add ycbcr lowering pass." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 02:29:51 +02:00
Bas Nieuwenhuizen	2c57d3361a	radv: Fix hang with YCBCR array textures. Forgot to apply the width/height divisor for CB writes resulting in the CB using larger than expected slice sizes. Fixes: `42d159f276` "radv: Add multiple planes to images." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110530 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110526 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 02:29:51 +02:00
Erico Nunes	257a9b0a94	lima/gpir: add limit of max 512 instructions It has been noted that the lima GP has a limit of 512 instructions, after which the shaders don't work and fail silently. This commit adds a check to make the shader compilation abort when the shader exceeds this limit, so that we get a clear reason for why the program will not work. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-05-02 00:02:58 +00:00
Alyssa Rosenzweig	09c669260f	panfrost: Fix blend shader upload Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-01 23:20:51 +00:00
Alyssa Rosenzweig	910608b29a	panfrost/decode: Hit MRT blend shader enable bits Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-01 23:20:50 +00:00
Alyssa Rosenzweig	b304b30f2c	panfrost: Remove shader dump Redundant via the midgard shader dump. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-01 23:20:48 +00:00
David Riley	dec68e32ea	virgl: Re-use and extend queue transfers for intersecting buffer subdatas. Small buffer subdatas which are essentially doing a memcpy were getting bogged down by all the overhead of creating new transfers. Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-01 15:48:51 -07:00
David Riley	a54c231b56	virgl: Allow transfer queue entries to be found and extended. Intersecting transfer queue entries allow for the possibility of extending an existing transfer instead of creating a new one (and all the associated mapping/unmapping). Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-01 15:48:46 -07:00
David Riley	e94a9a7f38	virgl: Store mapped hw resource with transfer object. Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-01 15:48:28 -07:00
Kenneth Graunke	ebbb05b3c9	iris: Fix imageBuffer and PBO download. Recently we added checks to try and deny multisampled shader images. Unfortunately, this messed up imageBuffers, which have sample_count = 0, which are also used in PBO download, causing us to hit CPU map fallbacks. Fixes: `b15f5cfd20` iris: Do not advertise multisampled image load/store. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-01 14:37:46 -07:00
Dave Airlie	e2fecf57e3	r600: reset tex array override even when no view bound If no view is bound we still should reset the override to 0 and array mode. This should fix misrendering in firefox WebRender since the pbo sampler was removed. Fixes: `1250383e36` (st/mesa: remove sampler associated with buffer texture in pbo logic)	2019-05-02 07:34:32 +10:00
Ian Romanick	85e6865ff6	nir: Saturating integer arithmetic is not associative In 8-bits, iadd_sat(iadd_sat(0x7f, 0x7f), -1) = iadd_sat(0x7f, -1) = 0x7e but, iadd_sat(0x7f, iadd_sat(0x7f, -1)) = iadd_sat(0x7f, 0x7e) = 0x7f Fixes: `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-01 09:07:47 -07:00
Eric Engestrom	70da00ffd6	util: move #include out of #if linux This #include is needed for `NULL`, which is used on all OSes, not just Linux. Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com> Fixes: `316964709e` "util: add os_read_file() helper" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-05-01 15:45:47 +00:00
Alok Hota	a44420d9cc	swr/rast: Add general SWTag statistics Update Archrast parser to use stats, used with an internal tool Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-05-01 15:11:30 +00:00
Alok Hota	b8adb540a0	swr/rast: Add string handling to AR event framework For use by an internal tool Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-05-01 15:11:30 +00:00
Alok Hota	f355f03388	swr/rast: Add initial SWTag proto definitions Update gen_archrast.py to properly generate event IDs Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-05-01 15:11:30 +00:00
Alok Hota	396831adf8	swr/rast: Cleanup and generalize gen_archrast - Update meson.build - Includes current_build_dir() fix meson/swr: replace hard-coded path with current_build_dir() Fixes: `93cd9905c8` "swr/rast: Cleanup and generalize gen_archrast" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alok Hota <alok.hota@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> - Clean up meson.build (remove foreach loop, replace with single call) - Update SConscript - use `$SOURCES` to call `CodeGenerate` with multiple source files Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-05-01 15:11:30 +00:00
Eric Engestrom	47f419d0b3	gitlab-ci: build vulkan drivers in clang build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-05-01 14:37:31 +00:00
Erik Faye-Lund	f753ac355e	softpipe: setup pixel_offset for all primitive types If we don't update this for all primitive-types, we end up rendering slightly offset points and lines up until the point where the first triangle gets drawn. This is obviously not correct, and violates OpenGL's repeatability rule. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `ca9c413647` ("softpipe: Respect gl_rasterization_rules in primitive setup.") Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-05-01 13:53:02 +00:00
Jonathan Marek	0c6702cfa5	nir: improve convert_yuv_to_rgb Use a different arrangement of constants to allow more ffma. A vec4 backend will now use 3 fma for yuv_to_rgb. On freedreno/ir3, it is down from 10 to 7 alu (4 fma, 3 mul, 3 add to 7 fma). Other backends shouldn't be hurt. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-01 04:13:36 -07:00
Gert Wollny	becd192801	doc: Update feature matrix Since softpipe doesn't truly support multisample, I've not added softpipe to the "Enhanced per-sample shading" even though with the advertised GLSL level ARB_gpu_shader5 is advertised. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:40:46 +02:00
Gert Wollny	6162ce6c60	softpipe: Increase the GLSL feature level This will enable calls to the interpolateAt* functions, but also a bunch of other features. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:40:39 +02:00
Gert Wollny	338017c58a	softpipe: Add support for TGSI_OPCODE_INTERP_CENTROID Like with interpolateAtSample this is also not really implementing the corresponding sampling and will only work correctly for pixels that are fully covered, but since softpipe only supports one sample this is good enough for now. v2: Correct spelling (Roland Scheidegger) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:40:20 +02:00
Gert Wollny	c3df4e0601	softpipe: Add support for TGSI_OPCODE_INTERP_OFFSET Since for this opcode the offsets are given manually the function should actually also work for non-zero offsets, but the related piglits only ever test with offset 0. Accordingly the patch satisfies "fs-interpolateatoffset-*". Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:40:16 +02:00
Gert Wollny	27bfd57bc7	softpipe: Add (fake) support for TGSI_OPCODE_INTERP_SAMPLE Softpipe doesn't support more than one sample, so this function implements the interpolation at sample 0 and adds a stub to make it possible to interpolate at other samples. As it is this makes the piglits "fs-interpolateatsample-*" pass, but they only ever test sample 0 anyway. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:40:10 +02:00
Gert Wollny	e405e32d36	softpipe: Add a per-input array for interpolator correctors to machine This adds entry points for correcting the interpolation values if the interpolation is done by using one of the interpolateAt* functions. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:40:06 +02:00
Gert Wollny	5f0959f8df	softpipe: Factor out evaluation of the source indices We will need these for per sample interpolation as well Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:39:58 +02:00
Gert Wollny	7d5c8d3589	softpipe: evaluate the cube faces on a per sample basis Now that the LOD is evaluated up front the cube faces can also be evaluated on a per sample basis instead of using the quad. This fixes a large number of deqp gles 3 and 31 cube texture tests. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:23:23 +02:00
Gert Wollny	aacdce2879	softpipe: keep input lod for explicit derivatives This only affects anisotropic interpolation. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:23:19 +02:00
Gert Wollny	d4b6ae223f	softpipe: tie in new code path for lod evaluation This enables the use of explicit gradients. Also remove an unused parameter when changing the interfaces. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:23:07 +02:00
Gert Wollny	9e26a0ed8f	softpipe: Move selection of shadow values up and clean parameter list The shadow evaluation compare parameter is stored in different locations, depending on the texture type. Move the values to a common location to free the lod storage and to be able to reduce the number of parameters. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:23:02 +02:00
Gert Wollny	41dc16b928	softpipe: Pipe gather_comp through from st_tgsi_get_samples The value is stored in the lod components and this will be overwritten when switching to the new code path. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:22:56 +02:00
Gert Wollny	724a73509e	softpipe: Prepare handling explicit gradients This only adds code that is not yet enabled. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:22:47 +02:00
Gert Wollny	7c004d093a	softpipe: Factor gradient evaluation out of the lambda evaluation this is useful when we want to use explicit gradients. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-01 08:22:28 +02:00
Andrii Simiklit	5c581b3dd6	egl: return correct error code for a case req ver < 3 with forward-compatible The EGL_KHR_create_context spec says: "If an OpenGL context is requested and the values for attributes EGL_CONTEXT_MAJOR_VERSION_KHR and EGL_CONTEXT_MINOR_VERSION_KHR, when considered together with the value for attribute EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR, specify an OpenGL version and feature set that are not defined, than an EGL_BAD_MATCH error is generated." This case is already correctly handled a bit below in the same source file. The correct handling was added by commit: `63beb3df` Reported-by: Ian Romanick <idr@freedesktop.org> Here: https://bugzilla.freedesktop.org/show_bug.cgi?id=92552#c9 Fixes: `11cabc45b7` "egl: rework handling EGL_CONTEXT_FLAGS" Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-05-01 00:14:00 +00:00
Timothy Arceri	90f3bf7437	radeonsi/nir: call radeonsi nir opts before the scan pass Some of the opts are not called in the general optimisation loop in the state tracker's glsl -> nir conversion. We need to call the radeonsi specific optimisation once before scanning over the nir otherwise we can end up gathering info on code that is later removed. Fixes an assert in the piglit test: ./bin/varying-struct-centroid_gles3 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-01 09:41:07 +10:00
Timothy Arceri	a004e95dd7	radeonsi/nir: create si_nir_opts() helper We will make use of this in the following commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-01 09:41:07 +10:00
Alok Hota	4c68acba37	swr/rast: early exit on empty triangle mask Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-30 19:48:12 +00:00
Alok Hota	e7f381e9ca	swr/rast: add guards for cpuid on Linux Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-30 19:48:12 +00:00
Alok Hota	ae436203d9	swr/rast: add flat shading Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-30 19:48:12 +00:00
Alok Hota	9d01f4d631	swr/rast: add SWR_STATIC_ASSERT() macro Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-30 19:48:12 +00:00
Alok Hota	3851c6c9bf	swr/rast: update guardband rects at draw setup It's dependent on other state fields Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-30 19:48:12 +00:00
Alok Hota	2729d847ce	swr/rast: add more llvm intrinsics Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-30 19:48:12 +00:00
Julien Isorce	0e3a348bec	st/va: properly set stride and offset in vlVaDeriveImage Using the new resource_get_info function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-04-30 17:53:12 +00:00
Julien Isorce	1cec049d4d	radeonsi: implement resource_get_info Re-use existing si_texture_get_offset. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-30 17:53:12 +00:00
Julien Isorce	a3c202de0a	gallium: add resource_get_info to pipe_screen Generic plumbing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-30 17:53:12 +00:00
Rob Clark	ec6c229763	freedreno/ir3: fixes for half reg in/out Needs to update max_half_reg, or be remapped to full reg and update max_reg accordingly, depending on generation.. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-30 10:39:24 -07:00
Axel Davy	ce57f4f7c4	st/nine: Check discard_delayed_release is set before allocating more When discard_delayed_release is set (default), we allocate more buffers and use a different buffer wait path. Check if it is set, and use the old paths if not (the alternative buffer wait path could still be used, but there is no advantage to using it in this case). Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	b71c300c70	st/nine: Throttle rendering similarly for thread_submit thread_submit's throttling depended on the number of internal back buffers, and wasn't affected by the driver requested throttling value. Now it is. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	562f5a35c8	st/nine: Optimize a bit writeonly buffers Optimize writeonly by passing PIPE_TRANSFER_WRITE for these buffers instead of the safer PIPE_TRANSFER_READ_WRITE. This seems to improve the performance of d3d8 games using d3d8to9. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	92117c989c	st/nine: Use TGSI_SEMANTIC_GENERIC for fog We used TGSI_SEMANTIC_FOG for fog, however on vs/ps 3, fog is allowed to have 4 components (even on the ff pipeline according to a wine test). Since gallium's TGSI_SEMANTIC_FOG has only one component, use TGSI_SEMANTIC_GENERIC instead. Fixes: https://github.com/iXit/Mesa-3D/issues/346 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	bade3bf615	st/nine: Enable computing const_ranges All the pieces for constant compact are ready, thus enable the path. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	5c67db6889	st/nine: Handle const_ranges in nine_state Handle slot mapping if there is one. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	9942ba2ea3	st/nine: Cache constant buffer size The shader constant buffer size with the constant compaction code can vary depending on the shader variant compiled (for example if fog constants are required, etc). Thus instead of using fixed size for the shader, add in the variant cache the size required, pass it to the context, and use this value. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	a3cdc466e7	st/nine: Propagate const_range to context As with the constant compaction we map the constant slots to new slots, we need to pass that information to the context which is in charge of uploading the constants. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	7761cda686	st/nine: Prepare constant compaction in nine_shader When indirect addressing is not used, we know exactly which constants are accessed, and thus can have them located in consecutive slots. We thus parse again the shader with a slot map for compaction. The path contains the work inside nine_shader.c for this path, but it needs some other commits to work, and thus is not enabled yet by this commit. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:52 +02:00
Axel Davy	db404507b4	st/nine: Refactor counting of constants Track the number of slots used Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	737df40a63	st/nine: Track constant slots used This tracking will be useful for constant compaction Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	d2cab4562c	st/nine: Refactor ct_ctor The refactoring will make it easier to parse the shader twice for the constant compaction path. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	6f3da226e6	st/nine: Make swvp_on imply IS_VS swvp cannot happen with ps, thus it makes sense to force it to false with ps. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	d57d1436d3	st/nine: Refactor shader constants ureg_src computation Put the shader constant code in one place to better change that code in future commits. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	6d86292f8a	st/nine: Manually upload vs and ps constants In future commits we will introduce more fine-grained uploads Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	1ddeb43537	st/nine: use helper ureg_DECL_sampler everywhere Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	3717ec4157	st/nine: Compact pixel shader key Compact the shader key to make room for new elements. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	2acbd977d7	st/nine: Compact nine_ff_get_projected_key Only the first four sampler slots can be used by ff ps < 0x14, thus the size of the key can be reduced. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	a92a43d41d	st/nine: Refactor param->rel Refactor param->rel to enable different paths for constants and inputs relative addressing. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	5974401a4a	st/nine: Regroup param->rel tests Regroup all the param->rel assertions into one assertion for better clarity and better covering. param->rel on an input can only happen with float constants for vs, or with inputs on vs/ps 3.0. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	12654a2fda	st/nine: Control shader constant inlining with drirc Until we use async shader compilation for constant inlining, don't enable it unless user asks for it. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	95f25bef54	st/nine: Recompile optimized shaders based on b/i consts Boolean and Integer constants are used in d3d9 for flow control. Boolean are used for if/then/else and Integer constants for loops. The compilers can generate better code if these values are known at compilation. I haven't met so far a game that would change the values of these constants frequently (and when they do, they set to the values used for the previous draw call, and thus the changes get filtered out). Thus it makes sense to inline these constants and recompile the shaders. The commit sets a bound to the number of variants for a given shader to avoid too many shaders to be generated. One drawback is it means more shader compilations. It would probably make sense to compile these shaders asynchronously or let the user control the behaviour with an env var, but this is not done here. The games I tested hit very few shader variants, and the performance impact was negligible, but it could help for games with uber shaders. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	e57267a09e	drirc: Add Gallium nine workaround for Rayman Legends The game requires it to display many textures properly. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	c097ff3617	st/nine: Add drirc option to use data_internal for dynamic textures dynamic textures seem to have predictable stride. This stride should be the same as for a ram buffer. It seems some games don't check the actual stride value, assuming it to be the expected one. Thus this workaround (protected by drirc option) is to use an intermediate ram buffer. Fixes Rayman Legends texture issues when enabled. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	7dcc85b46e	st/nine: Support internal compressed format for volumes Reuse the generic path to support compressed formats. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	1b0a7d0557	st/nine: Support internal compressed format for surfaces Reuse the generic path to support compressed formats. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	22c41d2d81	st/nine: Refactor volume GetSystemMemPointer It will make it easier to reuse in another place. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:51 +02:00
Axel Davy	85c9d92067	st/nine: Refactor surface GetSystemMemPointer It will make it easier to reuse in another place. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	8ba4f73911	st/nine: rename _conversion to _internal Rename these variables to a new name which will fit new usages introduced in later commits. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	4a51a7c1da	st/nine: Optimize volume upload with conversion Use nine_context_box_upload instead of locking the pipe for volume upload with format conversion. nine_context_box_upload already handles format conversion. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	fac3f99377	st/nine: Optimize surface upload with conversion Use nine_context_box_upload instead of locking the pipe for surface upload with format conversion. nine_context_box_upload already handles format conversion. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	4ca6b1dfd1	st/nine: Fix SINCOS input SINCOS takes an input with replicated swizzle. the swizzle can be on any component, not just x. Enable it to read from any component, but also use a temporary register to avoid dst/src aliasing. No known game is fixed by this change as it seems the input swizzle is commonly on x for this instruction, and src and dst don't alias. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	f4ae483c44	st/nine: Ignore nooverwrite for systemmem Systemmem has a specific behaviour we don't mimic exactly. That makes Halo feel free to use nooverwrite with it all the time, even when reading again at the same location. Ignore nooverwrite to have proper synchronization. Fixes: https://github.com/iXit/Mesa-3D/issues/348 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	fd3a870401	st/nine: Enable modifiers on ps 1.X texcoords For many ps 1.X instructions, we were reading the texcoords directly, instead of through tx_src_param, resulting in modifiers getting ignored. Use tx_src_param for all these instructions. Fixes: https://github.com/iXit/Mesa-3D/issues/337 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	1fc0714039	st/nine: Always return OK on SetSoftwareVertexProcessing This would need more tests to know exactly if INVALIDCALL can be returned in some situations. It seems some games expect D3D_OK, even when noop and illegal. Fixes: https://github.com/iXit/Mesa-3D/issues/302 https://github.com/iXit/Mesa-3D/issues/338 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	d9a4025fa3	st/nine: Finish if nooverwrite after normal mapping d3d's nooverwrite and gallium's unsynchronized have different semantics. Indeed nooverwrite says the applications won't write to locations needed by previous draws, which is less strong than unsynchronized which won't synchronize previous writes. Thus in case app is locking without discard/nooverwrite, then using nooverwrite, we need to add a synchronization. Fixes: https://github.com/iXit/wine-nine-standalone/issues/29 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	e502c4d892	st/nine: Fix buffer/texture unbinding in nine_state_clear Previously nine_state_clear was not using NineBindBufferToDevice and NineBindTextureToDevice to unbind buffers and textures (but used nine_bind) This was resulting in an incorrect bind count for these resources. Combined with `0ec4e5f630` Some buffers were scheduled to be uploaded directly after they were locked (because the bind count incorrectly assumed they were needed for the next draw call), which resulted in uploads before the data was written. To simplify a bit the code (and because I needed to add a pointer to device), remove the stateblock usage from nine_state_clear and rename to nine_device_state_clear. Fixes: https://github.com/iXit/Mesa-3D/issues/345 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	bb3b8f8e01	st/nine: Upload managed buffers only at draw using them When a draw call is emitted, buffers in the device->update_buffers list are uploaded. This patch removes buffers from the list if they are not bound anymore. Behaviour found studying: https://github.com/iXit/Mesa-3D/issues/345 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	5df96995ef	st/nine: Upload managed textures only at draw using them When a draw call is emitted, textures in the device->update_textures list are uploaded. This patch removes textures from the list if they are not bound anymore. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:50 +02:00
Axel Davy	394420ebb3	st/nine: Use FLT_MAX/2 for RCP clamping This seems to fix Rayman (which adds things to the RCP result, and thus gets an Inf), while not having regressions. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:49 +02:00
Axel Davy	64a45ba7f8	st/nine: Fix D3DWindowBuffer_release for old wine nine support No-one reported bugs for that, but it seems `c442dd7890` and previous commits used APIs not defined until nine minor version 3. This patch should prevent crash in this case. Also turn off the resize feature in this case, as we won't prevent a buffer leak anymore. Cc: "19.0" mesa-stable@lists.freedesktop.org Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-04-30 19:18:49 +02:00
Eric Engestrom	0cff98c8a0	turnip: update to use the new features struct names These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times. See also: `90108deb27` "anv: Update to use the new features struct names" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-04-30 16:55:38 +01:00
Eric Engestrom	941b2f4dcd	radv: update to use the new features struct names These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times. See also: `90108deb27` "anv: Update to use the new features struct names" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-30 16:55:18 +01:00
Eric Engestrom	b80930a6fe	anv: add support for VK_EXT_memory_budget Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-30 15:40:33 +00:00
Eric Engestrom	316964709e	util: add os_read_file() helper readN() taken from igt. os_read_file() inspired by igt_sysfs_get() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-30 15:40:33 +00:00
Rafael Antognolli	2fae99bcbd	iris: Enable fast clear colors on gen11. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-30 08:31:44 -07:00
Rafael Antognolli	cf3cadacdf	iris: Update the surface state clear color address when available. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-30 08:31:44 -07:00
Rafael Antognolli	91bcbfc351	iris: Use the linear version of the surface format during fast clears. Newer gens (> 9) will start doing the linear -> sRGB conversion of the clear color for us, if we use a sRGB surface format. So let's make sure that doesn't happen and keep the same semantics as before. Even though the hardware could convert the clear color for us during fast clear, that converted color is only used for sampling. For resolve, the original color would be used (without the conversion). So we convert it ourselves and the same converted color gets used for both sampling and resolving, simplifying the whole logic. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-30 08:31:44 -07:00
Rafael Antognolli	56927a8cf5	iris: Support sRGB fast clears even if the colorspaces differ. We were disabling fast clears if the view format had a different colorspace than the resource format (sRGB vs linear or vice-versa). But we actually support them if we use the view format to decide if we should encode the clear color into sRGB colorspace. Also add a missing linear -> sRGB surface format conversion (we don't want the clear color to be encoded to sRGB again during resolve). v2: Do not track sRGB colorspace during fast clears (Nanley). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-30 08:31:44 -07:00
Eric Engestrom	abb2c7c9d3	egl: fixup autotools-specific wording Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-30 15:25:40 +00:00
Eric Engestrom	fe73c74691	docs: haiku can be built using meson Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-30 15:25:40 +00:00
Eric Engestrom	88ed5f611d	docs: use past tense when talking about autotools Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-30 15:25:40 +00:00
Eric Engestrom	46d6883a13	docs: replace autotools instructions with meson equivalent Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-30 15:25:40 +00:00
Eric Engestrom	1936bad9ec	docs: drop autotools python information Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-30 15:25:40 +00:00
Eric Engestrom	8c7b8fcd0c	docs: remove unsupported GL function name mangling This was only supported in autotools, which has since been deleted. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-30 15:25:40 +00:00
Ian Romanick	bfc6486819	mesa: Add missing display list support for GL_FOG_COORDINATE_SOURCE Fixes: `fe5d67d95f` ("Implement EXT_fog_coord and EXT_secondary_color.") Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Cc: Brian Paul <brianp@vmware.com>	2019-04-30 07:52:59 -07:00
Alejandro Piñeiro	9b6a00e66e	docs: document MESA_GLSL=errors keyword Added with commit `0161691f35`, still checked on shaderapi.c _mesa_get_shader_flag method. Fixes: `0161691f35` "mesa: add GLSL_REPORT_ERRORS debug flag" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-30 15:45:33 +01:00
Khem Raj	da84d071a6	winsys/svga/drm: Include sys/types.h vmw_screen.h uses dev_t, which is defined in sys/types.h, so this header must be included to get the dev_t definition. This issue happens on the musl C library; it is hidden on glibc since sys/types.h is included through other system headers. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-30 13:50:25 +01:00
Ross Burton	1c1efa4ca9	Revert "meson: drop GLESv1 .so version back to 1.0.0" This patch claimed that the autotools build generates libGLESv1_CM.so.1.0.0, but it doesn't: es1api_libGLESv1_CM_la_LDFLAGS = \ -no-undefined \ -version-number 1:1 \ $(GC_SECTIONS) \ $(LD_NO_UNDEFINED) Revert commit `cc15460e18` to ensure that the autotools and meson builds produce the same libraries. Fixes: `cc15460e18` "meson: drop GLESv1 .so version back to 1.0.0" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-30 13:49:20 +01:00
Juan A. Suarez Romero	8d621e8ff7	anv: enable descriptor indexing capabilities This enables the remaining capabilities in SPV_EXT_descriptor_indexing. Fixes: `6e230d7607` "anv: Implement VK_EXT_descriptor_indexing" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-30 09:23:46 +02:00
Juan A. Suarez Romero	06c9d7f9f9	radv: enable descriptor indexing capabilities This enables the remaining capabilities in SPV_EXT_descriptor_indexing. Fixes: `0e10790558` "radv: Enable VK_EXT_descriptor_indexing." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-30 09:23:23 +02:00
Juan A. Suarez Romero	bbbe00a101	spirv: add missing SPV_EXT_descriptor_indexing capabilities Add ShaderNonUniformEXT, UniformBufferArrayNonUniformIndexingEXT, SampledImageArrayNonUniformIndexingEXT, StorageBufferArrayNonUniformIndexingEXT, StorageImageArrayNonUniformIndexingEXT, InputAttachmentArrayNonUniformIndexingEXT, UniformTexelBufferArrayNonUniformIndexingEXT and StorageTexelBufferArrayNonUniformIndexingEXT capabilities. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-30 09:22:45 +02:00
Caio Marcelo de Oliveira Filho	1fb6630636	spirv: Properly handle SpvOpAtomicCompareExchangeWeak The code was handling the Weak variant in some cases, but missing others, e.g. the get_deref_nir_atomic_op. Add all the missing cases with the same behavior of the non-Weak SpvOpAtomicCompareExchange. Note that the Weak variant is basically an alias, as SPIR-V 1.3, Revision 7 says "OpAtomicCompareExchangeWeak Deprecated (use OpAtomicCompareExchange). Has the same semantics as OpAtomicCompareExchange." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-29 19:02:44 -07:00
Tomeu Vizoso	cc6bbf6397	panfrost/ci: Initial commit These files implement running almost all of deqp-gles2 on Chromebooks of the rk3399-gru-kevin type in Collabora's LAVA lab. The approach follows what is currently being used for virglrenderer, but scheduling the actual test jobs via LAVA. We start by building a container in Docker that contains a suitable rootfs and kernel for the DUT, deqp and all dependencies for building Mesa itself. Then Mesa is built and the rootfs, deqp and Mesa are combined in a cpio ramdisk. A LAVA job is generated, submitted to LAVA and the results are processed by simply comparing them to the expectations that are stored in git. Any code that changes the expectations (hopefully tests are fixed) needs to also update the expectations file. The next step is adding support for other devices, possibly in other LAVA labs. In order to use this, the repository has to be configured to run the gitlab-ci.yaml file from the panfrost/ci dir, and a LAVA token needs to be set up. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-30 01:22:43 +00:00
Rafael Antognolli	b15f5cfd20	iris: Do not advertise multisampled image load/store. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-29 17:04:04 -07:00
Rob Clark	9cb8037e54	freedreno/a6xx: pre-bake UBWC flags in texture-view Small cleanup. No need to defer this to emit time. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-29 17:01:01 -07:00
Rob Clark	8506ebfb95	freedreno/a6xx: small texture emit cleanup Prep work for fb_read (blend_equation_advanced) Switch to using 'enum pipe_shader_type' everywhere, and (optional, in non-cache / slowpath case) pass ctx instead of image/ssbo state. In the fb_read case we also need to access the framebuffer state, so having the ctx simplifies things. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-29 17:01:01 -07:00
Rob Clark	da327afb2a	freedreno/ir3: switch fragcoord to sysval Because who are we kidding... it is a sysval. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-29 17:01:01 -07:00
Plamena Manolova	11518384c4	i965: Re-enable fast color clears for GEN11. This patch re-enables fast color clears for GEN11. It also ensures that we use linear color formats for sRGB surfaces during fast clears. Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-29 21:19:59 +00:00
Rafael Antognolli	9175c7058e	intel/blorp: Make blorp update the clear color in gen11. Hardware docs say that Gen11 requires the use of two MI_ATOMICs of size QWORD when updating the clear color. The second MI_ATOMIC also needs CS Stall and Return Data Control set. v2: Remove include of srgb header (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-29 21:19:59 +00:00
Rafael Antognolli	f8c3f408a6	intel/genxml: Update MI_ATOMIC genxml definition. Change some of the single bit fields to booleans, and add an enum with the definition of the ATOMIC_OPCODE. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-29 21:19:59 +00:00
Jordan Justen	38ffd7ce79	intel/genxml: Support base-16 in value & start fields in gen_sort_tags.py With python's int(), if the optional second parameter is 0, then python will support the 0x prefix for hex numbers. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-29 21:19:58 +00:00
Plamena Manolova	232c0f6489	isl: Set ClearColorConversionEnable. The ClearColorConversionEnable bit needs to be set for GEN11 when indirect clear colors are used. Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-04-29 21:19:58 +00:00
Eric Engestrom	1587586182	delete autotools input files Leftovers from when autotools was deleted. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-29 21:17:19 +00:00
Eric Engestrom	7ca8ba199f	delete autotools .gitignore files One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-29 21:17:19 +00:00
Kenneth Graunke	f3bdffc33d	iris: Only enable GL_AMD_depth_clamp_separate on Gen9+ The hardware feature is new as of Gen9+. I accidentally enabled it on Gen8.	2019-04-29 13:25:12 -07:00
Kenneth Graunke	dcfca0af7c	iris: Set XY Clipping correctly. I was setting it based off a pipe_rasterizer_state field that appears to be entirely dead outside of the draw module respecting it. I should be setting it when the primitive type reaching the SF is neither points nor lines. This is, unfortunately, rather dirty, as we have to look at the rasterizer state, the geometry shader state, the tessellation evaluation shader state, and the primitive type...	2019-04-29 10:53:23 -07:00
Rhys Perry	bd4c661ad0	ac,ac/nir: use a better sync scope for shared atomics https://reviews.llvm.org/rL356946 (present in LLVM 9 and later) changed the meaning of the "system" sync scope, making it no longer restricted to the memory operation's address space. So a single address space sync scope is needed for shared atomic operations (such as "system-one-as" or "workgroup-one-as") otherwise buffer_wbinvl1 and s_waitcnt instructions can be created at each shared atomic operation. This mostly reimplements LLVMBuildAtomicRMW and LLVMBuildAtomicCmpXchg to allow for more sync scopes and uses the new functions in ac->nir with the "workgroup-one-as" or "workgroup" sync scopes. F1 2017 (4K, Ultra High settings, TAA), avg FPS : 59 -> 59.67 (+1.14%) Strange Brigade (4K, ~highest settings), avg FPS : 51.5 -> 51.6 (+0.19%) RotTR/mountain (4K, VeryHigh settings, FXAA), avg FPS : 57.2 -> 57.2 (+0.0%) RotTR/tomb (4K, VeryHigh settings, FXAA), avg FPS : 42.5 -> 43.0 (+1.17%) RotTR/valley (4K, VeryHigh settings, FXAA), avg FPS : 40.7 -> 41.6 (+2.21%) Warhammer II/fallen, avg FPS : 31.63 -> 31.83 (+0.63%) Warhammer II/skaven, avg FPS : 37.77 -> 38.07 (+0.79%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-29 18:20:44 +01:00
Hal Gentz	e91ee763c3	glx: Fix synthetic error generation in __glXSendError To quote Uli Schlachter, who understands this stuff more than I do: > The function __glXSendError() in mesa's src/glx/glx_error.c invents an X11 > protocol error out of thin air. For the sequence number it uses dpy->request. > This is the sequence number of the last request that was sent. _XError() will > then update dpy->last_request_read based on the sequence number of the error > that just "came in". > > If now another something comes in with a sequence number less than > dpy->last_request_read, since sequence numbers are monotonically increasing, > widen() will incorrectly add 1<<32 to the sequence number and things might go > downhill afterwards. `__glXSendErrorForXcb` was also patched, as that's the function that `glXCreateContextAttribsARB` actually uses. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99781 Cc: mesa-stable@lists.freedesktop.org Fixes: `ad503c41` 'apple: Initial import of libGL for OSX from AppleSGLX svn repository' Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Hal Gentz <zegentzy@protonmail.com>	2019-04-29 12:52:48 -04:00
Lionel Landwerlin	9628631a38	Revert "anv: limit URB reconfigurations when using blorp" In commit 0d46e404 ("anv: limit URB reconfigurations when using blorp") we tried to limit the number of URB reconfiguration by checking if the last allocation is large enough to fit the blorp dispatch. We used the last bound pipeline to compare the allocation. The problem with this is that the pipeline is bound but its commands might not have been emitted into the command buffer yet. Let's just revert commit `0d46e40467` since it didn't seem to yield any performance improvement. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 0d46e404 ("anv: limit URB reconfigurations when using blorp") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110535 Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-29 11:41:27 +00:00
Erik Faye-Lund	cc5b8a938a	mesa/st: remove always-false state This code is essentially dead now. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-29 10:28:38 +00:00
Erik Faye-Lund	be110ba2e4	mesa/st: accept NULL and empty buffer objects It's perfectly legal and well-defined to render using a non-existing or empty buffer object. The data coming out of the buffer object isn't well defined unless we have the robustness flag set on the context, but that's a different matter, and up to the shader hardware; it's the same as out-of-bounds reads. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-29 10:28:38 +00:00
Erik Faye-Lund	ef13691e0c	swr: support NULL-resources It's legal for a buffer-object to have a NULL-resource, but let's just skip over it, as there's nothing to do. This patch switches the order of the conditionals in swr_update_derived, so the logic becomes a bit more straight forward: if (is_user_buffer) ... else if (resource) ... else ... ...instead of this: if (!is_user_buffer) if (resource) ... else ... else ... Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-04-29 10:28:38 +00:00
Erik Faye-Lund	04b0c6e9df	nouveau: support NULL-resources It's legal for a buffer-object to have a NULL-resource, but let's just skip over it, as there's nothing to do. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-04-29 10:28:38 +00:00
Erik Faye-Lund	a11945d179	i915: support NULL-resources It's legal for a buffer-object to have a NULL-resource, but let's just skip over it, as there's nothing to do. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-04-29 10:28:38 +00:00
Erik Faye-Lund	a8e8204b18	gallium/u_vbuf: support NULL-resources It's legal for a buffer-object to have a NULL-resource, but let's just skip over it, as there's nothing to do. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-29 10:28:38 +00:00
Erik Faye-Lund	0607ceb655	mesa/st: remove impossible error-check st_setup_current never sets this flag, and it's already checked against right before. So let's remove this pointless check. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-29 10:28:38 +00:00
Andres Gomez	c81fbb42d9	glsl/linker: check for xfb_offset aliasing From page 76 (page 80 of the PDF) of the GLSL 4.60 v.5 spec: " No aliasing in output buffers is allowed: It is a compile-time or link-time error to specify variables with overlapping transform feedback offsets." Currently, this is expected to fail, but it succeeds: " ... layout (xfb_offset = 0) out vec2 a; layout (xfb_offset = 0) out vec4 b; ... " Fixes the following piglit test: tests/spec/arb_enhanced_layouts/compiler/transform-feedback-layout-qualifiers/xfb_offset/invalid-overlap.vert Fixes the following test: KHR-GL44.enhanced_layouts.xfb_output_overlapping v2: - Use a data structure to track the used components instead of a nested loop (Ilia). v3: - Take the BITSET_WORD array out from the gl_transform_feedback_buffer struct and make it local to the validation process (Timothy). - Do not use a nested scope for the validation (Timothy). v4: - Add reference to the fixed piglit test in the commit log. - Add reference to the fixed VK-GL-CTS test in the commit log (Tapani). - Empty initialize the BITSET_WORD pointers array (Tapani). Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-04-29 12:13:29 +02:00
Patrick Lerda	812288bf0f	lima/ppir: fix pointer referenced after a free Issue detected by valgrind. Fixes: `92d7ca4b1c` ("gallium: add lima driver") Signed-off-by: Patrick Lerda <patrick9876@free.fr> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-29 10:43:51 +02:00
Eleni Maria Stea	bb953de96c	radv: consider MESA_VK_VERSION_OVERRIDE when setting the api version Before setting the physical device API version, we should check if the MESA_VK_VERSION_OVERRIDE environment variable is set and take it into account. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-29 09:00:51 +02:00
Kenneth Graunke	9dcf90d7ba	intel/fs: Don't emit empty ELSE blocks. While we can clean this up later, it's trivial to not generate the stupid code in the first place, which saves some optimization work. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-28 22:36:09 -07:00
Kenneth Graunke	2b44b27dbe	nir: Add a new nir_cf_list_is_empty_block() helper. Helper and name suggested by Eric Anholt. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-28 22:36:08 -07:00
Kenneth Graunke	08dc93c67c	glsl/list: Add an exec_list_is_singular() helper. Similar to list_is_singular() in util/list.h. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-28 22:35:42 -07:00
Tapani Pälli	376c3e8f87	anv: expose VK_EXT_queue_family_foreign on Android VK_ANDROID_external_memory_android_hardware_buffer requires this extension. It is safe to enable it since currently aux usage is disabled for ahw buffers. Fixes following dEQP extension dependency test on Android: dEQP-VK.api.info.device#extensions Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-29 07:31:02 +03:00
Andreas Baierl	c960323a81	lima/ppir: Add gl_FragCoord handling Treat gl_FragCoord variable as a system value and lower the w component with a nir pass. Add the necessary bits for correct codegen. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-29 02:46:44 +00:00
Andreas Baierl	b82de2b4d7	nir: add rcp(w) lowering for gl_FragCoord On some hardware (e.g. Mali400) the shader needs to apply some transformations for correct gl_FragCoord handling. The lowering actions look like the following in pseudocode: gl_FragCoord.xyz = gl_FragCoord_orig.xyz gl_FragCoord.w = 1.0 / gl_FragCoord_orig.w Add this lowering as a nir pass in preparation for using it in the driver. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-29 02:46:44 +00:00
Romain Failliot	7050eccd77	docs: changed "Done" to "DONE" in features.txt Mesamatrix.net expects uppercase. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-29 09:32:01 +10:00
Alyssa Rosenzweig	ec65e1b763	panfrost: Workaround -bshadow regression I have no idea what's happening here, but let's not regress an app that used to work in the mean time while we're figuring it out.. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:39:20 +00:00
Alyssa Rosenzweig	3978614d88	panfrost/midgard: Safety check immediate precision degradations Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:34:32 +00:00
Alyssa Rosenzweig	0ebf1047a4	panfrost: Use fp32 (not fp16) varyings In a perfect world, we'd use fp16 varyings for mediump and fp32 for highp, allowing us to get a performance win without sacrificing conformance. Unfortunately, we're not there (yet), so it's better we assume always fp32 than always fp16 to avoid artefacts / breaking a lot of deqp. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:34:32 +00:00
Alyssa Rosenzweig	a81267f228	panfrost/midgard: imov workaround Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:34:32 +00:00
Alyssa Rosenzweig	53d6e11393	panfrost/midgard: Fix tex propagation Unbreaks mpv. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:34:32 +00:00
Alyssa Rosenzweig	68a1508dc9	panfrost/midgard: Fix regressions in -bjellyfish Two fixes here, one is that we tried to copyprop non-strictly-SSA values which was bound to fly in our face. The other was peeling back the imov workaround.. Turns out we still need that. More research is needed still, but let's not regress real apps. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:34:32 +00:00
Alyssa Rosenzweig	bdaa23b32b	panfrost/midgard: Only copyprop without an outmod With an outmod, we would need to propagate that through, which is for future work. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:34:32 +00:00
Alyssa Rosenzweig	a3d6a3dfc4	Revert "panfrost/midgard: Extend copy propagation pass" Fixes: commit `b53b4573c3`. Optimization gone wrong. In the future, we should try this again (it's a net win if implemented right), but at the moment this just regresses. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-28 21:34:32 +00:00
Samuel Pitoiset	07745f9494	radv: add missing VEGA20 chip in radv_get_device_name() Otherwise it returns "AMD RADV unknown". Cc: 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-27 12:16:23 +02:00
Kenneth Graunke	6bd4cb920e	iris: Fix zeroing of transform feedback offsets in strange cases. Some of the dEQP.functional.transform_feedback tests end up doing the following sequence of operations: 1. BeginTransformFeedback 2. PauseTransformFeedback 3. Draw 4. ResumeTransformFeedback At step 1, we'd pack 3DSTATE_SO_BUFFER commands saying to zero the SO_WRITE_OFFSET registers. At step 2, we disable streamout, so step 3 doesn't bother emitting those commands. Then, step 4 re-packs new 3DSTATE_SO_BUFFER commands with offset = 0xFFFFFFFF, saying to continue appending at the existing offset. This loads the value from the BO as the offsets - but we never actually zeroed it. So, just maintain a flag saying "we actually emitted the commands", and stomp offset back to zero until we emit some.	2019-04-27 01:07:14 -07:00
Eric Anholt	edb04953c8	vc4: Fall back to renderonly if the vc4 driver doesn't have v3d. I have a platform with vc4 display but V3D 4.x. We can fall back on kmsro's probing to bring up the v3d gallium driver. Acked-by: Rob Clark <robdclark@chromium.org>	2019-04-26 15:02:03 -07:00
Eric Anholt	7e069832a0	kmsro: Add support for V3D. Like vc4, we expect to have SOCs with various displays that have a single V3D instance for rendering. v2: Add v3d to the list of drivers that make enabling kmsro valid. Acked-by: Rob Clark <robdclark@chromium.org>	2019-04-26 14:59:32 -07:00
Marek Olšák	a8a0e5c03c	radeonsi: don't ignore PIPE_FLUSH_ASYNC Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-04-26 15:44:39 -04:00
Eric Anholt	fb0611df3d	v3d: Fix detection of TMU write sequences in register spilling. We can't use the QPU functions to detect this until register allocation is done and we've moved inst->dst into inst->qpu. Fixes bad TMU sequences from register spilling in KHR-GLES31.core.compute_shader.shared-max.	2019-04-26 12:42:30 -07:00
Eric Anholt	18894a5e5a	v3d: Fix detection of the last ldtmu before a new TMU op. We were looking at the start instruction, instead of scanning through the list of following instructions to find any more ldtmus.	2019-04-26 12:42:30 -07:00
Eric Anholt	575caab895	v3d: Re-add support for memory_barrier_shared. Looks like I lost it in a rebase conflict resolution. We'd hit the unknown intrinsic assertion in KHR-GLES31.core.compute_shader.shared-struct. Fixes: `6b1c659825` ("v3d: Add Compute Shader compilation support.")	2019-04-26 12:42:30 -07:00
Eric Anholt	971a13d805	Revert "v3d: Disable PIPE_CAP_BLIT_BASED_TEXTURE_TRANSFER." This reverts commit `ccce940947`, leaving a note as to why we had to (corruption in chromium, breaking some GLES3.1 tests).	2019-04-26 12:42:30 -07:00
Eric Anholt	49071b2e3f	v3d: Don't try to update the shadow texture for separate stencil. There are two cases where v3d's sampler view's resource doesn't match the base's: shadow textures for sampling from raster, and pointing at the separate depth texture for z32f_s8x24. We only want to update shadow for the first case. Fixes dEQP-GLES31.functional.stencil_texturing.render.depth32f_stencil8_draw when run after the previous testcase.	2019-04-26 12:42:30 -07:00
Eric Anholt	4358904c06	v3d: Add a note about i/o indirection for future performance work.	2019-04-26 12:42:30 -07:00
Eric Anholt	c74d0e7f62	vc4: Use _mesa_hash_table_remove_key() where appropriate.	2019-04-26 12:42:30 -07:00
Eric Anholt	d8486c2ad7	v3d: Use _mesa_hash_table_remove_key() where appropriate.	2019-04-26 12:42:30 -07:00
Eric Anholt	24587ae8ae	v3d: Assert that we do request the normal texturing return data. An unused tex should be DCEed, but if it wasn't we'd run into trouble with not doing a TMUWT.	2019-04-26 12:42:30 -07:00
Eric Anholt	42210a4351	v3d: Apply the GFXH-930 workaround to the case where the VS loads attrs. We were emitting a dummy load for when the VS doesn't load any attributes, but we also need to emit a dummy load for when the render VS loads attributes but the binner VS doesn't. Fixes simulator assertion failures and GPU hangs on KHR-GLES31.core.texture_gather.\*	2019-04-26 12:42:30 -07:00
Eric Anholt	448fc3ea42	v3d: Fill in the ignored segment size fields to appease new simulator. We are assured that the input segment size field is ignored for !separate_segs mode, and now the simulator wants an in-range value set regardless of whether it's functionally ignored or not.	2019-04-26 12:40:31 -07:00
Tapani Pälli	af06963d24	glsl: use empty brace initializer fixes following warning with clang: warning: suggest braces around initialization of subobject Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-04-26 12:24:41 -07:00
coypu	976004d0e7	gbm: don't return void Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-26 12:04:26 -07:00
Tapani Pälli	7a7f182dac	nir: use braces around subobject in initializer Used same syntax as elsewhere with Mesa sources, verified result against MSVC with godbolt.org. fixes following warning with clang: warning: suggest braces around initialization of subobject v2: empty braces -> braces around subobject (Caio, Kristian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-04-26 12:01:22 -07:00
Kristian H. Kristensen	a7c70bb2a1	freedreno/drm: Quiet pointer to u64 conversion warning	2019-04-26 11:58:44 -07:00
Alok Hota	8bfb34fd0a	swr/rast: enforce use of tile offsets Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-26 13:00:45 -05:00
Alok Hota	0e49963212	swr/rast: AVX512 support compiled in by default - Emulation of AVX512 built into SIMDLIB - Remove associated macros - Remove knobs controlling AVX512 and let emulation handle it - Refactor variable names for SIMD16 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-26 13:00:38 -05:00
Alok Hota	0bf1df2bb6	swr/rast: Remove deprecated 4x2 backend code - Use 8x2 tiling by default - Remove associated macros - Use SIMDLIB emulation for SIMD16 on SIMD8 hardware - Remove code rot in Load/StoreTile Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-04-26 13:00:24 -05:00
Tomasz Figa	e8bf4efceb	llvmpipe: Always return some fence in flush (v2) If there is no last fence, due to no rendering happening yet, just create a new signaled fence and return it, to match the expectations of the EGL sync fence API. Fixes random "Could not create sync fence 0x3003" assertion failures from Skia on Android, coming from the following code: https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427 Reproducible especially with thread count >= 4. One could make the driver always keep the reference to the last fence, but: - the driver seems to explicitly destroy the fence whenever a rendering pass completes and changing that would require a significant functional change to the code. (Specifically, in lp_scene_end_rasterization().) - it still wouldn't solve the problem of an EGL sync fence being created and waited on without any rendering happening at all, which is also likely to happen with Android code pointed to in the commit. Therefore, the simple approach of always creating a fence is taken, similarly to other drivers, such as radeonsi. Tested with piglit llvmpipe suite with no regressions and following tests fixed: egl_khr_fence_sync conformance eglclientwaitsynckhr_flag_sync_flush eglclientwaitsynckhr_nonzero_timeout eglclientwaitsynckhr_zero_timeout eglcreatesynckhr_default_attributes eglgetsyncattribkhr_invalid_attrib eglgetsyncattribkhr_sync_status v2: - remove the useless lp_fence_reference() dance (Nicolai), - explain why creating the dummy fence is the right approach. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-26 11:26:33 +01:00
Emil Velikov	591955d82d	llvmpipe: correctly handle waiting in llvmpipe_fence_finish Currently if the timeout differs from 0, we'll end up with infinite wait... even if the user is perfectly clear they don't want that. Use the new lp_fence_timedwait() helper guarding both waits in an !lp_fence_signalled block like the rest of llvmpipe. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-26 11:26:33 +01:00
Emil Velikov	5b284fe6bc	llvmpipe: add lp_fence_timedwait() helper The function is analogous to lp_fence_wait() while taking a timeout (ns) parameter, as needed for EGL fence/sync. v2: - use absolute UTC time, as per spec (Gustaw) - bail out on cnd_timedwait() failure (Gustaw) v3: - check count/rank under mutex (Gustaw) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>	2019-04-26 11:26:33 +01:00
Emil Velikov	bd0c4e360d	vulkan/wsi: don't use DUMB_CLOSE for normal GEM handles Currently we get normal GEM handles from PrimeFDToHandle, yet we close them with DUMB_CLOSE. Use GEM_CLOSE instead. Fixes: `da997ebec9` ("vulkan: Add KHR_display extension using DRM [v10]") Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Keith Packard <keithp@keithp.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-26 11:26:33 +01:00
Emil Velikov	c962a78f18	vulkan/wsi: check if the display_fd given is master As effectively required by the extension, we need to ensure we're master. Currently drivers employ vendor-specific solutions, which check if the device behind the fd is capable, yet none of them do the master check. In the radv case, if acceleration is available. Instead of duplicating the check in each driver, keep it where it's needed and used. Note this copies libdrm's drmIsMaster() to avoid depending on a bleeding-edge version of the library. v2: set the fd to -1 if not master (Bas) Fixes: `da997ebec9` ("vulkan: Add KHR_display extension using DRM [v10]") Cc: Andres Rodriguez <andresx7@gmail.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Keith Packard <keithp@keithp.com> Reported-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-26 11:26:33 +01:00
Emil Velikov	1a9367c134	turnip: drop dead close(master_fd) The fd is -1, thus the block of if (fd != -1) close(fd) is dead code. Cc: Chad Versace <chadversary@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-26 11:26:33 +01:00
Jason Ekstrand	00d4e78ea9	nir/algebraic: Optimize integer cast-of-cast These have been popping up more and more with the OpenCL work and other bits causing extra conversions to/from 64-bit. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-26 04:26:08 -05:00
Jason Ekstrand	934f178341	anv/descriptor_set: Don't fully destroy sets in pool destroy/reset In `105002bd2d`, we fixed a memory leak bug where we weren't properly destroying descriptor sets when destroying/resetting a descriptor pool. However, the only real leak that happened was that we take a reference to the descriptor set layout in the descriptor set and we weren't dropping our reference. Everything else in the descriptor set is tied to the pool itself and doesn't need to be freed on a per-set basis. This commit changes the destroy/reset functions to only bother walking the list of sets to unref the layouts and otherwise we just assume that the whole-pool destroy/reset takes care of the rest. Now that we're doing more non-trivial things with descriptor sets such as allocating things with util_vma_heap, per-set destruction is starting to show up on perf traces. This takes reset back to where it's supposed to be as a cheap whole-pool operation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-26 05:40:28 +00:00
Jason Ekstrand	baf4802e3e	anv: Better handle 32-byte alignment of descriptor set buffers In `c520f4dec9`, we chose to align the sizes of descriptor set buffers to 32 bytes. We have to align the descriptor set buffer to 32B so that it's valid for using with push constants. We align the size as well so we don't leave lots of holes with util_vma_heap_alloc. Unfortunately, we were only aligning it for alloc and not for free so we were still creating piles of holes when we delete descriptor sets. This causes terrible perf for the allocator once we've deleted piles of descriptor sets. This commit reworks the code so that we align the descriptor set buffer size to 32B for both alloc and free. The result is that it takes the new crucible vkResetDescriptorPool from 104.567719 to 2.898354 seconds. Fixes: `c520f4dec9` "anv: Add a concept of a descriptor buffer" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110497 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-26 05:40:28 +00:00
Dave Airlie	d946cbe9f5	nir: fix bit_size in lower indirect derefs. This fixes a case where we are expecting 64-bit but generate 32-bit consts and validate gets angry. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-26 12:59:43 +10:00
Kenneth Graunke	529ace7887	iris: Silence unused function warning	2019-04-25 17:33:56 -07:00
Marek Olšák	c5f65bfe6c	glsl: fix shader_storage_blocks_write_access for SSBO block arrays (v2) This fixes KHR-GL45.compute_shader.resources-max on radeonsi. Fixes: `4e1e8f684b` "glsl: remember which SSBOs are not read-only and pass it to gallium" v2: use is_interface_array, protect against assertion failures in u_bit_consecutive Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-04-25 18:57:38 -04:00
Rob Clark	a6ab27dcab	docs/features: update GL too Forgot to update corresponding entries for desktop GL.. kinda wish we didn't have to update both GLES and GL tables. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 15:48:19 -07:00
Rob Clark	7a57cfbed6	freedreno/a6xx: sample-shading support Enables: OES_sample_shading OES_sample_variables OES_shader_multisample_interpolation Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	ee2e3a07bb	freedreno/ir3: sample-shading support The compiler support for: OES_sample_shading OES_sample_variables OES_shader_multisample_interpolation Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	85949c52b4	freedreno: wire up core sample-shading support Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	c8e825aaac	freedreno/ir3: fix load_interpolated_input slot The so->inputs[] table is in units of vec4 Fixes: `7ff6705b8d` freedreno/ir3: convert to "new style" frag inputs Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	49f922d96c	freedreno/a6xx: add VALIDREG/CONDREG helper macros There are a few places that we check if a shader stage input reg is used/valid (ie. not r63.x).. and there are about to be a bunch more. So add some helper macros for less open-coding. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	f4b4d6cf23	freedreno/ir3: rename frag_vcoord -> ij_pixel Since this is what the value actually is. Cleanup the name before adding more different i,j related values for sample-shading. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	5be415fc2b	freedreno/ir3: remove bogus assert tex instruction can actually return 16b values. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	2f0b9d2249	freedreno/ir3: lower load_barycentric_at_offset Calculates i,j at specified offset within a pixel. A new load_size_ir3 intrinsic is used in conjunction with fddx/fddy to translate the offset into primitive space and adjust the i,j from load_barycentric_pixel accordingly. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	c4f423aa36	freedreno/ir3: lower load_barycentric_at_sample This lowers load_barycentric_at_sample to load_sample_pos_from_id plus load_barycentric_at_offset. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	4e3ce224a7	freedreno: update generated headers Pull in updates for sample shading. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	6d6ec2d4d2	freedreno/ir3: cleanup instruction builder macros De-duplicate the "normal" and "flags" versions of the macros, and while at it go ahead and add "flags" versions for all the remaining macros, since we'll at least need INSTR1F in a following commit. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	77b3b96a3b	freedreno/ir3: more emit-cat5 fixes Couple more opcodes which don't take a sampler id as first arg. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	9032f0690c	freedreno/ir3: fix rgetpos decoding It takes an argument. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	4d08c1b595	compiler: rename SYSTEM_VALUE_VARYING_COORD And add corresponding enums for different sorts of varying interpolation. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	96d2e4ab8a	freedreno: add robustness support Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	6503918689	freedreno/drm: update for robustness Update UABI header and add FD_PP_PGTABLE and FD_NR_FAULTS params. Robustness can be supported by a kernel which provides the new ABI if it also indicates that per-process pagetables are in use. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:07 -07:00
Alyssa Rosenzweig	77d091d0c5	panfrost/midgard: Add new bitwise ops These fused NOT-ops could maybe help somehow...? Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-25 20:37:46 +00:00
Alyssa Rosenzweig	bcabcfe3ad	panfrost/midgard: Identify inand This was previously thought to be inot, but it's actually a bit more general than that! :) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-25 20:37:45 +00:00
Alyssa Rosenzweig	5f942db190	panfrost/midgard: Copy prop for texture registers We'll want to unify this with main copy prop (and extend to varyings), but that'll take more care to handle some special cases, so leave it as a stub pass for now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-25 20:37:45 +00:00
Alyssa Rosenzweig	4d821a1101	panfrost/midgard: Optimize csel involving 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-25 20:37:45 +00:00
Alyssa Rosenzweig	b53b4573c3	panfrost/midgard: Extend copy propagation pass This extends copy propagation to respect output modifiers for ALU instructions, as well as potentially fixing some bugs related to looping (all dEQP loop tests pass). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-25 20:37:45 +00:00
Alyssa Rosenzweig	7bc91b487b	panfrost/midgard: Reduce fmax(a, 0.0) to fmov.pos This will allow us to copyprop away the move and eliminate the instruction entirely. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-25 20:37:45 +00:00
Bas Nieuwenhuizen	295536d47a	radv: Expose Vulkan 1.1 for Android. We have the YCBCR feature now. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	8bb3cec7c9	radv: Expose VK_EXT_ycbcr_image_arrays. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	fc9248e13e	radv: Enable YCBCR conversion feature. This enabled the basic YCBCR features. We support basic multiplane formats using 8-bit and 16-bit unorms, as well as YUV2 formats. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	379b82dace	radv: Add ycbcr subsampled & multiplane formats to csv. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	52c1adda21	radv: Add ycbcr format features. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	b769a549ee	radv: Add hashing for the ycbcr samplers. Otherwise caching gets very confused. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	5c3467e74a	radv: Run the new ycbcr lowering pass. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	91702374d5	radv: Add ycbcr lowering pass. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	5564c38212	radv: Update descriptor sets for multiple planes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	7f6732ac69	radv: Add ycbcr samplers in descriptor set layouts. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	427024bf2e	ac/nir: Add support for planes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	dc917c8073	radv: Allow mixed src/dst aspects in copies. e.g. COLOR + PLANE_2, as well as COLOR + COLOR for multiplane images. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	b2cfa231d0	radv: Add support for image views with multiple planes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	65c4f612aa	radv: Add ycbcr conversion structs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	a837768857	radv: Support different source & dest aspects for planar images in blit2d. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	66507cc656	radv: Add single plane image views & meta operations. Copies & clear of multiplane images is not allowed so we do not have to handle that case. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	42d159f276	radv: Add multiple planes to images. No functional changes. This temporarily uses plane 0 for everything. Long term plan is that only single plane images get to use metadata like htile/dcc/cmask/fmask. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	d3225e533f	radv: Add logic for multisample format descriptions. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Bas Nieuwenhuizen	09c4a911e5	radv: Add logic for subsampled format descriptions. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Caio Marcelo de Oliveira Filho	055f6281d4	intel/fs: Don't handle texop_tex for shaders without implicit LOD These will be lowered by nir_lower_tex() with the lower_tex_when_implicit_lod_not_supported, so don't need the extra handling here. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-25 12:13:06 -07:00
Caio Marcelo de Oliveira Filho	d5ac5d6e83	nir: Add option to lower tex to txl when shaders don't support implicit LOD We already add the LOD src, so go ahead and update the texop as well when this option is set. v2: Make it an option. (Rob Clark) v3: Use a more concise name suggested by Jason. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-25 12:13:06 -07:00
Topi Pohjolainen	ff642fb0e6	intel/compiler/fs/icl: Use dummy masked urb write for tess eval One cannot write the URB arbitrarily and therefore the message has to be carefully constructed. The clever tricks originate from Kenneth and Jason, I'm just writing the patch. Fixes GPU hangs on ICL with Vulkan CTS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2019-04-25 22:00:43 +03:00
Andrii Simiklit	4e9592c5fa	iris: make the TFB result visible to others OpenGL 4.6 Spec: "5.3.3 Rules ....... Note: “Updates” via rendering or transform feedback are treated consistently with updates via GL commands. Once EndTransformFeedback has been issued, any subsequent command in the same context that uses the results of the transform feedback operation will see the results." v2: removed a wrong comment ( Kenneth Graunke <kenneth@whitecape.org> ) v3: - flush+dirty depends on buffers usage history - removed an old hack ( Kenneth Graunke <kenneth@whitecape.org> ) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110404 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-25 11:48:04 -07:00
Kenneth Graunke	aa7306b4cf	iris: Some tidying for preemption support Just enable it during init_render_context on Gen10+, and move the Gen9 state tracking into iris_genx_state so it only exists on Gen9. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-25 11:26:24 -07:00
Marek Olšák	383f406591	radeonsi: remove dirty slot masks from scissor and viewport states All registers in the array need to be updated if any of them is changed. Only apps writing gl_ViewportIndex were affected by this bug.	2019-04-25 11:49:38 -04:00
Marek Olšák	440135e5a0	radeonsi/gfx9: rework the gfx9 scissor bug workaround (v2) Needed to track context rolls caused by streamout and ACQUIRE_MEM. ACQUIRE_MEM can occur outside of draw calls. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110355 v2: squashed patches and done more rework Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-04-25 11:49:38 -04:00
Marek Olšák	bc0d924507	radeonsi/gfx9: set that window_rectangles always roll the context Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-04-25 11:49:38 -04:00
Jon Turney	5d310015c5	meson: Force '.so' extension for DRI drivers DRI driver loadable modules are always installed with install_megadriver.py with names ending with '.so', irrespective of platform. Force the name the loadable module is built with to match, so install_megadriver.py doesn't spin trying to remove non-existent symlinks. Fixes: `c77acc3c` "meson: remove meson-created megadrivers symlinks"	2019-04-25 12:40:16 +01:00
Nicolai Hähnle	9445a4ab43	radeonsi: add radeonsi_sync_compile option Force the driver thread to sync immediately with a compiler thread (but compilation still happens in a separate thread). This can be useful to simplify debugging compiler issues. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:35:29 +02:00
Nicolai Hähnle	ca95adf8ff	radeonsi: add radeonsi_aux_debug option for aux context debug dumps Enabling this option will create ddebug-style dumps for the aux context, except that instead of intercepting the pipe_context layer we just dump the IB contents on flush. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:35:27 +02:00
Nicolai Hähnle	fea3dcb844	ddebug: expose some helper functions as non-inline Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:35:24 +02:00
Nicolai Hähnle	ac0b60fa47	ddebug: dump driver state into a separate file Due to asynchronous execution, it's not clear which of the draws the state may refer to. This also works around an issue encountered with radeonsi where dumping the driver state itself caused a hang. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:35:21 +02:00
Nicolai Hähnle	b7fab7b02d	ddebug: log calls to pipe->flush This can be useful when internal draws lead to a hang. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:35:19 +02:00
Nicolai Hähnle	fe0d2b3d37	ddebug: set thread name For better debuggability. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:35:16 +02:00
Nicolai Hähnle	563faa3903	util/u_log: flush auto loggers before starting a new page Without this, command stream dumps of radeonsi may misleadingly end up in a later page. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:35:09 +02:00
Nicolai Hähnle	8bef4df196	radeonsi: add si_debug_options for convenient adding/removing of options Move the definition of radeonsi_clear_db_cache_before_clear there, as well as radeonsi_enable_nir. This removes the AMD_DEBUG=nir option. We currently still have two places for options: the driconf machinery and AMD_DEBUG/R600_DEBUG. If we are to have a single place for options, then the driconf machinery should be preferred since it's more flexible. The only downside of the driconf machinery was that adding new options was quite inconvenient. With this change, a simple boolean option can be added with a single line of code, same as for AMD_DEBUG. One technical limitation of this particular implementation is that while almost all driconf features are available, the translation machinery doesn't pick up the description strings for options added in si_debug_options. In practice, translations haven't been provided anyway, and this is intended for developer options, so I'm not too worried. It could always be added later if anybody really cares. v2: - use bool instead of uint8_t for options - si_debug_options.inc -> si_debug_options.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:31:02 +02:00
Michel Dänzer	5078d66a86	gitlab-ci: Use meson buildtype debug instead of default debugoptimized This can save a lot of time for some of the meson CI jobs. Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-25 10:51:41 +02:00
Juan A. Suarez Romero	b06ae53606	Revert "intel/compiler: split is_partial_write() into two variants" This reverts commit `40b3abb4d1`. It is not clear that this commit was entirely correct, and unfortunately it was pushed by error. CC: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-25 09:19:10 +02:00
Timothy Arceri	b155f74d7b	nir: fix nir_remove_unused_varyings() We were only setting the used mask for the first component of a varying. Since the linking opts split vectors into scalars this has mostly worked ok. However this causes an issue where for example if we split a struct on one side of the interface but not the other, then we can possibly end up removing the first components on the side that was split and then incorrectly remove the whole struct on the other side of the varying. With this change we simply mark all 4 components for each slot used by a struct. We could possibly make this more fine-grained but that would require a more complex change. This fixes a bug in Strange Brigade on RADV when tessellation is enabled, all credit goes to Samuel Pitoiset for tracking down the cause of the bug. Fixes: `f1eb5e6399` ("nir: add component level support to remove_unused_io_vars()") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 16:37:36 +10:00
Lionel Landwerlin	f15409ee55	i965: fix icelake performance query enabling This was a rebase issue which lost a change to a file moved from i965 to src/intel/perf. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `134e750e16` ("i965: extract performance query metrics") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-25 01:11:54 +00:00
Marek Olšák	36cfe5fd62	radeonsi: add BOs after need_cs_space need_cs_space may clear the buffer list. Fixes: `951d60f8cd` "radeonsi: delay adding BOs at the beginning of IBs until the first draw" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-24 20:59:07 -04:00
Marek Olšák	45ca7798dc	glsl: handle interactions between EXT_gpu_shader4 and texture extensions also, EXT_texture_buffer_object has to be enabled separately. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	e71936a731	st/mesa: expose EXT_gpu_shader4 if GLSL 1.40 is supported Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	503f94b43f	mesa: only allow EXT_gpu_shader4 in the compatibility profile Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	ba265d1144	mesa: expose EXT_texture_buffer_object This is needed for exposing the samplerBuffer functions under EXT_gpu_shader4. v2: - expose it in the compat profile only - make it an alias of EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	825c35999c	glsl: allow "varying out" for fragment shader outputs with EXT_gpu_shader4 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	4ff3b8e18a	glsl: add texture builtin functions for EXT_gpu_shader4 v2: some fixes to texture functions thanks to piglit tests Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	8dbe23c8c6	glsl: add arithmetic builtin functions for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	7004114102	glsl: add builtin variables for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	1a973aa5e1	glsl: apply some 1.30 and other rules to EXT_gpu_shader4 as well Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Chris Forbes	85fefd1913	glsl: enable types for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	a7f38e7fbd	glsl: add `unsigned int` type for EXT_GPU_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Chris Forbes	2d8f4fff49	glsl: enable noperspective\|flat\|centroid for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Chris Forbes	8740726e46	glsl: add scaffolding for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	1faf833949	mesa: enable glGet for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Eric Anholt	d23b47fda5	v3d: Disable SSBOs and atomic counters on vertex shaders. The CTS fails on dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.*vertex when they are enabled, due to the VS being run for both bin and render. I think this behavior is expected to be valid, but I can't find text in atomic counters or SSBO specs saying so (the closest I found was in shader_image_load_store). Just disable it for now, since the closed source driver doesn't expose vertex atomic counters/SSBOs either.	2019-04-24 17:24:11 -07:00
Eric Anholt	97316d3783	st/mesa: Don't set atomic counter size != 0 if MAX_SHADER_BUFFERS == 0. This is just asking for tests to get confused about the HW supporting atomics in this shader stage or not, such as dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.const_expression_vertex. v2: Rebase on the other atomic cleanups that have happened since posting. v3: Commit message tweak by Marek. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-24 17:24:11 -07:00
Kenneth Graunke	2812ef2a26	iris: Advertise EXT_texture_sRGB_R8 support Using the luminance format, like both brw and anv do.	2019-04-24 16:49:13 -07:00
Kenneth Graunke	59aa7c924d	iris: Enable GL_AMD_depth_clamp_separate We support this, we just forgot to turn it on.	2019-04-24 16:49:13 -07:00
Marek Olšák	131d56edfb	util: fix a compile failure in u_compute.c on windows	2019-04-24 19:04:20 -04:00
Mike Blumenkrantz	c7c59f75e5	iris: enable preemption support for gen10 this automatically enables preemption on gen10 where it is disabled by default but still available Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-24 14:47:47 -07:00
Mike Blumenkrantz	7315882023	iris: add preemption support on gen9 this is basically just porting the following two commits to gallium: `d8b50e152a` `5c454661c6` resolves kwg/mesa#49 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-24 14:47:08 -07:00
Kenneth Graunke	21688a306b	iris: Split iris_flush_and_dirty_for_history into two helpers. We create two new helpers, iris_flush_bits_for_history, and iris_dirty_for_history, then use them in the existing function. The first accumulates flush bits based on res->bind_history, but doesn't actually perform a flush. This allows us to accumulate flush bits by looping over multiple resources, but ultimately emit a single flush for all of them. The latter flags dirty bits without flushing, which again allows us to handle multiple resources, but also is more convenient when writing from the CPU where we don't need a flush (as in commit `4d12236072`).	2019-04-24 13:31:32 -07:00
Dave Airlie	3323cf08f0	intel/compiler: fix uninit non-static variable. (v2) Pointed out by coverity. v2: init nir_locals also. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-25 06:06:57 +10:00
Dave Airlie	ce17e413de	virgl/drm: insert correct handles into the table. (v3) This inserts a handle for the flink name and a handle the correct gem handle for the bo. v2: fix handles/names confusion (Lepton Wu) v3: set flink name correctly (Lepton Wu) Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-04-25 06:05:43 +10:00
Dave Airlie	8a39f83fb2	virgl/drm: handle flink name better. This realigns this code with code from radeon. Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-04-25 06:05:43 +10:00
Dave Airlie	92ef4cf9f0	virgl/drm: cleanup buffer from handle creation (v2) This cleans up and realigns this code with what is in radeon v2: fix names->handles (Lepton Wu) Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-04-25 06:05:43 +10:00
Kenneth Graunke	19b246257d	iris: Actually put Mesa in GL_RENDERER string I constructed the right thing and then returned the other one.	2019-04-24 12:54:27 -07:00
Jiang, Sonny	69430d7e59	va: use a compute shader for the blit Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-24 15:47:41 -04:00
Marek Olšák	7fc3d21646	gallium: add PIPE_CAP_PREFER_COMPUTE_BLIT_FOR_MULTIMEDIA	2019-04-24 15:47:41 -04:00
Dylan Baker	5aedf48713	docs: update calendar, and news item and link release notes for 19.0.3	2019-04-24 10:53:04 -07:00
Dylan Baker	6bd7d4f19e	docs: Add SHA256 sums for mesa 19.0.3	2019-04-24 10:50:39 -07:00
Dylan Baker	7cb9043879	docs: add relnotes for 19.0.3	2019-04-24 10:50:37 -07:00
Marek Olšák	09e4771af9	gallium: set PIPE_CAP_MAX_FRAMES_IN_FLIGHT to 2 for all drivers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-24 10:41:04 -04:00
Rafael Antognolli	f2041d2a92	intel/isl: Resize clear color buffer to full cacheline Fixes MCS fast clear gpu hangs with Vulkan CTS on ICL in CI. v2 (Nanley): In the title s/Align/Resize/ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Tested-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-24 08:56:42 +03:00
Jason Ekstrand	45957c05b0	anv/descriptor_set: Properly align descriptor buffer to a page Instead of aligning and then taking inline uniforms into account, we need to take inline uniforms into account and then align to a page. Otherwise, we may not be aligned to a page and allocation may fail. Fixes: `43f40dc7cb` "anv: Implement VK_EXT_inline_uniform_block" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-24 05:40:27 +00:00
Jason Ekstrand	3d33c13eca	anv/descriptor_set: Only vma_heap_finish if we have a descriptor buffer Fixes: `7bb34ecff9` "anv: release memory allocated by bo_heap when..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-24 05:40:27 +00:00
Jason Ekstrand	0bc1942c9d	anv/descriptor_set: Destroy sets before pool finalization Fixes: `105002bd2d` "anv: destroy descriptor sets when pool gets..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-24 05:40:27 +00:00
Jason Ekstrand	6be603edf7	anv/descriptor_set: Unlink sets from the pool in set_destroy anv_descriptor_pool_free_set is called on the clean-up path of anv_descriptor_set_create and the set may not have been added to the pool's list of sets yet. While we're here, we move adding it to that list into set_create for symmetry. Fixes: `105002bd2d` "anv: destroy descriptor sets when pool gets..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-24 05:40:27 +00:00
Tapani Pälli	4add3c6880	android/iris: fix driinfo header filename Fixes iris driver Android build. Fixes: `faa52e328e` "iris: Add mechanism for iris-specific driconf options" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 22:25:17 -07:00
Ian Romanick	21223acf7d	intel/fs: Fix D to W conversion in opt_combine_constants Found by GCC warning: src/intel/compiler/brw_fs_combine_constants.cpp: In function ‘bool needs_negate(const fs_reg, const imm)’: src/intel/compiler/brw_fs_combine_constants.cpp:306:34: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] return ((reg->d & 0xffffu) < 0) != (imm->w < 0); ~~~~~~~~~~~~~~~~~~~^~~ The result of the bit-and is a 32-bit value with the top bits all zero. This will never be < 0. Instead of masking off the bits, just cast to int16_t and let the compiler handle the actual conversion. Fixes: `e64be391dd` ("intel/compiler: generalize the combine constants pass") Cc: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-23 19:48:33 -07:00
Alyssa Rosenzweig	e4ec814c39	panfrost/midgard: Remove assembler This code is outdated and unused; now that the compiler is mature, there's no point keeping it around in-tree (or at all). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:44:00 +00:00
Ryan Houdek	2cd1aa3429	panfrost: Adds Bifrost shader disassembler utility This code is stable and can live upstream independently while the rest of the Bifrost stack comes up. v2: Added a verbose flag to hide away some of the more verbose features that nobody really needs [The Bifrost disassembler is written by Connor Abbott, Lyude Paul, and Ryan Houdek.] Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:39:01 +00:00
Alyssa Rosenzweig	bb1aff3007	panfrost/midgard: Add "op commutes?" property Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	1f345bc7d6	panfrost/midgard: Refactor opcode tables We create an all-encompassing opcode table for handling name and properties, removing a number of ad hoc opcode tables which became brittle and quickly out of date. While we're at it, we fix some incorrect opcodes relating to ball/bany, and move a small function out to midgard_compile.c. Together these changes should allow compilation without warnings, along with helping the codebase health considerably. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	4d995e0da8	panfrost/midgard: Optimize MIR in progress loop Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	e9f84f1447	panfrost/midgard: Implement copy propagation Most copy prop should occur at the NIR level, but we generate a fair number of moves implicitly ourselves, etc... long story short, it's a net win to also do simple copy prop + DCE on the MIR. As a bonus, this fixes the weird imov precision bug once and for good, I think. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	fcdfb67711	panfrost/midgard: Set integer mods Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	422aceb407	panfrost/midgard: Document sign-extension/zero-extension bits (vector) For floating point ops, these bits determine the "negate?" and "abs?" modifiers. For integer ops, it turns out they control how sign/zero extension work, useful for mixing types. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	b453c877d9	panfrost/midgard: Update integer op list In the future, we might want to switch to a table-based approach, but for now, at least have it current. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	0b380a7868	panfrost/midgard: Remove unused mir_next_block Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:32 +00:00
Alyssa Rosenzweig	879ff866b6	panfrost/midgard: Fix off-by-one in successor analysis This reduces register pressure substantially since we get smaller liveness ranges. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	521ac6e5b1	panfrost/midgard: Track loop depth This fixes nested loops. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	84f09ff433	panfrost/midgard: Dead code eliminate MIR We reshuffle the existing "dead move elimination" pass into a generic dead code elimination layer, fixing bugs incurred with looping in the process. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	328a5ef598	panfrost: Use actual imov instruction The bug this worked around is no longer applicable, it seems -- remove the hack that breaks more than it fixes. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	12cd89da81	panfrost: Disable indirect outputs for now The hardware needs this lowered anyway; for now, might as well use mesa's default lowering for pure conformance reasons. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	9db5816e02	panfrost/midgard: imul can only run on mul This restriction makes sense logically. Not sure why it wasn't obeyed before. In conjunction with previous commit's disclaimer, fixes dEQP-GLES2.functional.shaders.loop.for_dynamic_iterations. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	a1aaf72915	panfrost/midgard: Don't try to inline constants on branches Along with a corresponding fix to the move elimination pass (not included here yet -- I just have it disabled for now), this will fix dEQP-GLES2.functional.shaders.loops.for_uniform_iterations.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	c0fb2605dc	panfrost: Respect backwards branches in RA Fixes a bunch of issues with looping. Honestly, I'm not sure why loops worked at all before. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	7d45bd9c91	panfrost/midgard: Remove useless MIR dump Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	8b15f8a343	panfrost/midgard: Respect component of bcsel condition Fixes a bunch of non-vec4 indexing.varying_array tests. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	6a466c0a06	panfrost/midgard: Implement indirect loads of varyings/UBOs This adds preliminary support for indirect loads of varying arrays and uniform arrays, bringing a few new tests in shader.indexing.* to passing, although there remains a number of cases still missing. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	1f7b3884c9	panfrost/midgard: Pipe through varying arrays Varying arrays sometimes are lowered to a series of directly accessed varyings (which we handled okay), but when indirectly accessed, they appear as a single array; we need to handle this as well. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Alyssa Rosenzweig	042d0bb5c3	panfrost/mdg/disasm: Print raw varying_parameters The semantics of this field are not well understood; it is better to print it unconditionally along with the other unknown state, rather than silently eat the value. Without this change, some critical state was being lost in some shaders (notably, the offset for load/store scratchpad instructions found in shaders that spill registers.) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-24 02:22:31 +00:00
Kenneth Graunke	864873dea9	iris: Prefer staging blits when destination supports CCS_E. Otherwise our textures don't get color compression. Thanks to Eero Tamminen for noticing this was missing! Improves performance of GLB27_FillTestC24Z16 on my Apollolake laptop with single channel RAM by 2.3x. Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>	2019-04-23 18:59:27 -07:00
Marek Olšák	d8b296d3ad	gallium: replace drm_driver_descriptor::configuration with driconf_xml PIPE_CAPs are better. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 21:20:26 -04:00
Marek Olšák	8ae50e6004	gallium: replace DRM_CONF_SHARE_FD with PIPE_CAP_DMABUF Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 21:20:26 -04:00
Marek Olšák	e3841368f3	gallium: replace DRM_CONF_THROTTLE with PIPE_CAP_MAX_FRAMES_IN_FLIGHT Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 21:20:24 -04:00
Marek Olšák	a20800f49d	st/dri: simplify throttling code Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 21:19:48 -04:00
Marek Olšák	d9838f653a	gallium: document conservative rasterization flags Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 21:19:48 -04:00
Ian Romanick	26391cceaa	intel/compiler: Lower ffma on Gen4 and Gen5 flrp32 is also a 3-source instruction, but there is another pending series that handles that for Gen4 and Gen5. v2: Rebase on "intel/compiler: Don't have separate, per-Gen nir_options" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-23 17:50:28 -07:00
Ian Romanick	fd1fa9afc7	intel/compiler: Don't have separate, per-Gen nir_options Instead, just have separate scalar vs. vector nir_options and do per-Gen "fix ups". Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-23 17:50:16 -07:00
Ian Romanick	3b087f668f	glsl: Silence many unused parameter warnings in glsl/ir.h Every file that included glsl/ir.h had a warning like: src/compiler/glsl/ir.h: In member function ‘virtual bool ir_rvalue::is_lvalue(const _mesa_glsl_parse_state) const’: src/compiler/glsl/ir.h:236:64: warning: unused parameter ‘state’ [-Wunused-parameter] virtual bool is_lvalue(const struct _mesa_glsl_parse_state state = NULL) const ^ Cc: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `fa4ebf6b8d` ("glsl: add _mesa_glsl_parse_state object to is_lvalue()") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-23 17:49:19 -07:00
Timothy Arceri	a6b7068ff5	st/mesa/radeonsi: fix race between destruction of types and shader compilation Commit `624789e370` moved the destruction of types out of atexit() and made use of a ref count instead. This is useful for avoiding a crash where drivers such as radeonsi are still compiling in a thread when the app exits and has not called MakeCurrent to change from the current context. While the above scenario is technically an app bug, we shouldn't crash. However, that change caused another race condition between the shader compilation thread in radeonsi and context teardown functions. This patch makes two changes to fix this new problem: First, we explicitly call _mesa_destroy_shader_compiler_types() when destroying the st context rather than calling it indirectly via _mesa_free_context_data(). We do this as we must call it after st_destroy_context_priv() so that we don't destroy the glsl types before the compilation threads finish. Next, we wait for the shader threads to finish in si_destroy_context(); this also means we need to call context destroy before destroying the queues in si_destroy_screen(). Fixes: `624789e370` ("compiler/glsl: handle case where we have multiple users for types") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-24 10:23:10 +10:00
Bas Nieuwenhuizen	3844ed8d44	radv: Add adaptive_sync driconfig option and enable it by default. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-23 23:49:39 +00:00
Bas Nieuwenhuizen	f2e0f5c3c4	vulkan/wsi: Add X11 adaptive sync support based on dri options. The dri options are optional. When the dri options are not provided the WSI will not use adaptive sync. FWIW I think for xf86-video-amdgpu this still requires an X11 config option, so only people who opt in can get possible regressions from this. So then the remaining question is: why do this in the WSI? It has been suggested in another MR that the application sets this. However, I disagree with that as I don't think we'll ever get a reasonable set of applications setting it. The next question is whether this can be a layer. It definitely can be as implemented now. However, I think this generally fits well with the function of the WSI. Furthermore, for e.g. the DISPLAY WSI this is much harder to do in a layer. Of course, most of the WSI could almost be a layer, but I think this still fits best in the WSI. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-23 23:49:39 +00:00
Bas Nieuwenhuizen	3c2e8267d0	radv: Add support for driconf. This includes 0 options. The cache parsing is located at a position where we can easily add config filtering by VkApplicationInfo. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-23 23:49:39 +00:00
Mike Blumenkrantz	b53d256db8	iris: add support for INTEL_conservative_rasterization this hooks up the iris gallium driver to existing mesa bits which handle the implementation resolves kwg/mesa#8 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 16:36:30 -07:00
Mike Blumenkrantz	e00f6a0605	st/mesa: indicate intel extension support for inner_coverage based on cap if the driver (iris) indicates support for the inner_coverage pipe cap, this will set the necessary states in the driver flags and rasterizer structs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 16:36:16 -07:00
Mike Blumenkrantz	1b9041c76a	gallium: add pipe cap for inner_coverage conservative raster mode this can be used by drivers which support the extension to indicate support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 16:36:00 -07:00
Kenneth Graunke	2208d5a683	iris: Fix DrawTransformFeedback math when there's a buffer offset We need to subtract the starting offset from the final offset before dividing by the stride. See src/intel/vulkan/genX_cmd_buffer.c:3142. Not known to fix anything.	2019-04-23 15:57:07 -07:00
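The offset math described above can be sketched as follows. This is a minimal illustration, not the iris code: the function name and the assumption that all three quantities are byte counts held in 32-bit integers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of vertices written to a transform feedback
 * buffer. The starting offset is subtracted from the final write offset
 * before dividing by the per-vertex stride. */
static uint32_t xfb_vertex_count(uint32_t final_offset,
                                 uint32_t start_offset,
                                 uint32_t stride)
{
    assert(stride > 0);
    assert(final_offset >= start_offset);
    return (final_offset - start_offset) / stride;
}
```

Dividing the final offset alone by the stride would overcount whenever the buffer is bound at a non-zero offset; for a buffer bound at byte 256 with a 16-byte stride and three vertices written, the final offset is 304 and the count is (304 - 256) / 16.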
Kenneth Graunke	38db20245b	iris: Make some offset math helpers take a const isl_surf pointer	2019-04-23 15:47:10 -07:00
Caio Marcelo de Oliveira Filho	7e2684ce01	spirv: Handle SpvOpDecorateId This operation decorates with an Id instead of a Literal or String. It is used by HlslCounterBufferGOOGLE (provided by SPV_GOOGLE_hlsl_functionality1). Even if we don't do anything with that decoration, we must be able to parse SPIR-V that uses it. Fixes: `891886da2f` "spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-23 14:58:01 -07:00
Caio Marcelo de Oliveira Filho	7b66d584a3	spirv: Rename vtn_decoration literals to operands Decorations (and ExecutionModes) can have not only literals, but also Ids associated with them. So rename the field to the more general name "Operand" used by the spec. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-23 14:58:01 -07:00
Lionel Landwerlin	0fb0058f18	anv: fix argument name for vkCmdEndQuery Doesn't fix anything but it's not the right function prototype. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `673f33c77d` ("anv: Implement CmdBegin/EndQueryIndexed") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-04-24 04:33:26 +08:00
Chia-I Wu	cc53815ae1	virgl: skip empty cmdbufs Several empty cmdbufs are submitted by app/xserver per frame, from glamor_block_handler for example. Let's skip them. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-23 19:07:48 +00:00
Eric Anholt	ec686a66db	gallium: Remove the malloc pipebuffer manager. This has been unused since r600 stopped using it in 2010. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>	2019-04-23 10:36:07 -07:00
Eric Anholt	6345dfc8f3	gallium: Remove the "alt" pipebuffer manager interface. This one would allocate from two underlying pools, but has never been used. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>	2019-04-23 10:36:07 -07:00
Eric Anholt	8e31a4f27f	gallium: Remove the ondemand pipebuffer manager. I couldn't find any uses in the tree since its introduction. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>	2019-04-23 10:36:07 -07:00
Eric Anholt	f5c08d9818	gallium: Remove the pool pipebuffer manager. Noticed while trying to decide if pipebuffer was of any use to me, and found that nothing has used it in the last 10 years at least. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Kristian Høgsberg <hoegsberg@gmail.com>	2019-04-23 10:36:07 -07:00
Jonathan Marek	d133f55a99	freedreno: a2xx: same gmem2mem sequence for all tiles Set REG_A2XX_RB_COPY_DEST_OFFSET in the tile init as it won't get touched by the draw batch. Then gmem2mem is the same for all tiles. Similar to what is done in a6xx, but only for gmem2mem. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-23 17:13:32 +00:00
Jonathan Marek	4107e0678a	freedreno: a2xx: enable batch reordering Batch reordering on a2xx is now tested and functional. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-23 17:13:32 +00:00
Jonathan Marek	7f670ca5fd	freedreno: a2xx: use nir_lower_io for TGSI shaders Allows removing the load_deref/store_deref code in the compiler. tgsi_to_nir now uses screen instead of options so we can simplify that too. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-23 17:13:32 +00:00
Jonathan Marek	bce4f11dbc	freedreno: a2xx: disable PIPE_CAP_PACKED_UNIFORMS a2xx driver is currently broken when PIPE_CAP_PACKED_UNIFORMS is enabled, disable it for now. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-23 17:13:32 +00:00
Jonathan Marek	418c3d9a4f	freedreno: a2xx: fix builtin blit program compilation tgsi_to_nir now requires a screen pointer and is used by fd2_prog_init. fd2_prog_init is used before fd_context_init so set the pointer manually. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-23 17:13:32 +00:00
Jonathan Marek	33cafb41a2	svga: add new ATC formats to the format conversion table Fixes the static assertion error. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-23 17:11:56 +00:00
Jonathan Marek	0e696416f9	freedreno: a2xx: add GL_AMD_compressed_ATC_texture support Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-23 17:11:56 +00:00
Jonathan Marek	734409096b	freedreno: a3xx: add GL_AMD_compressed_ATC_texture support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-23 17:11:56 +00:00
Jonathan Marek	0719a5f646	st/mesa: add ATC support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-23 17:11:56 +00:00
Jonathan Marek	bfa72e4d52	llvmpipe, softpipe: no support for ATC textures Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-23 17:11:56 +00:00
Jonathan Marek	ea254fcd3c	gallium: add ATC format support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-23 17:11:56 +00:00
Jonathan Marek	73c1d7e8c9	mesa: add GL_AMD_compressed_ATC_texture support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-23 17:11:56 +00:00
Marek Olšák	951d60f8cd	radeonsi: delay adding BOs at the beginning of IBs until the first draw so that bound compute shader resources won't be added when they are not needed and same for graphics. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:36 -04:00
Marek Olšák	09bb8c8557	radeonsi: add helper si_get_minimum_num_gfx_cs_dwords Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:34 -04:00
Marek Olšák	c59d238bb0	radeonsi: add si_cp_copy_data Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:33 -04:00
Marek Olšák	694e320643	winsys/amdgpu: clean up and remove nonsensical assertion The assertion considers max_dw from the current IB in the chain, but big_ib_buffer is a buffer for the next IB, which can be smaller. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:31 -04:00
Marek Olšák	1807f6cfe9	winsys/amdgpu: enable chaining for compute IBs Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:06 -04:00
Marek Olšák	b99bed6246	winsys/amdgpu: reorder chunks, make BO_HANDLES first, IB and FENCE last Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	437d032b7d	winsys/amdgpu: make IBs writable and expose their address Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	2313176817	ac: add REWIND and GDS registers to register headers Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	35cd57df2e	ac: add ac_get_i1_sgpr_mask Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	bfb9287599	ac: add radeon_info::is_pro_graphics Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	64d6cc982d	ac: add radeon_info::marketing_name, replacing the winsys callback Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	9b33465481	tgsi/scan: add uses_drawid Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Kenneth Graunke	77449d7c41	iris: Track valid data range and infer unsynchronized mappings. Applications frequently call glBufferSubData() to consecutive regions of a VBO to append new vertex data. If no data exists there yet, we can promote these to unsynchronized writes, even if the buffer is busy, since the GPU can't be doing anything useful with undefined content. This can avoid a bunch of unnecessary blitting on the GPU. u_threaded_context would do this for us, and in fact prohibits us from doing so (see TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED). But we haven't hooked that up yet, and it may be useful to disable u_threaded_context when debugging...at which point we'd still want this optimization. At the very least, it would let us measure the benefit of threading independently from this optimization. And it's not a lot of code. Removes most stall avoidance blits in "Total War: WARHAMMER." On my Skylake GT4e at 1920x1080, this appears to improve performance in games by the following (but I did not do many runs for proper statistics gathering): DiRT Rally +2% (avg), +2% (max); Bioshock Infinite +3% (avg), +9% (max); Shadow of Mordor +7% (avg), +20% (max).	2019-04-23 00:24:08 -07:00
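The valid-range idea above can be sketched roughly as below. The struct and function names are hypothetical and do not mirror the iris implementation; the sketch assumes a single half-open byte range per buffer that only ever grows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker: bytes in [start, end) have defined content.
 * The range is empty whenever start >= end. */
struct valid_range {
    uint32_t start;
    uint32_t end;
};

static void valid_range_init(struct valid_range *r)
{
    r->start = UINT32_MAX;
    r->end = 0;
}

/* Grow the range to cover a newly written region [start, end). */
static void valid_range_add(struct valid_range *r, uint32_t start, uint32_t end)
{
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/* A write that touches no previously defined bytes cannot race with
 * anything useful on the GPU, so it may skip synchronization. */
static bool can_map_unsynchronized(const struct valid_range *r,
                                   uint32_t offset, uint32_t size)
{
    uint32_t end = offset + size;
    bool intersects = offset < r->end && end > r->start;
    return !intersects;
}
```

Appending with consecutive glBufferSubData() calls maps onto this as a sequence of writes that each start where the previous one ended, so every write after the first still lands outside the tracked range.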
Kenneth Graunke	768b17a7ad	iris: Make a resource_is_busy() helper This checks both "is it busy" and "do we have work queued up for it"?	2019-04-23 00:24:08 -07:00
Kenneth Graunke	5ad0c88dbe	iris: Replace buffer backing storage and rebind to update addresses. This implements PIPE_CAP_INVALIDATE_BUFFER and invalidate_resource(), as well as the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag. When either of these happen, we swap out the backing storage of the buffer for a new idle BO, allowing us to write to it immediately without stalling or queueing a blit. On my Skylake GT4e at 1920x1080, this improves performance in games: DiRT Rally +25% (avg), +17% (max); Bioshock Infinite +22% (avg), +11% (max); Shadow of Mordor +27% (avg), +83% (max).	2019-04-23 00:24:08 -07:00
Kenneth Graunke	0a082b6560	iris: Make memzone_for_address non-static I want to use this in iris_resource.c.	2019-04-23 00:24:08 -07:00
Kenneth Graunke	72277044e2	iris: Make a gl_shader_stage -> pipe_shader_stage helper function This is probably not the best place for it, but I don't feel like moving the one out of the TGSI translator today, and we already have the other direction here, so...shrug	2019-04-23 00:24:08 -07:00
Kenneth Graunke	b45dff1da8	iris: Rework image views to store pipe_image_view. This will be useful when rebinding images.	2019-04-23 00:24:08 -07:00
Kenneth Graunke	2f60850a3f	iris: Rework UBOs and SSBOs to use pipe_shader_buffer This unifies a bunch of the UBO and SSBO code to use common structures. Beyond iris_state_ref, pipe_shader_buffer also gives us a buffer size, which can be useful when filling out the surface state.	2019-04-23 00:24:08 -07:00
Kenneth Graunke	00d4019676	iris: Track bound constant buffers This helps avoid having to iterate over [0, PIPE_MAX_CONSTANT_BUFFERS) looking to see if any resources are bound.	2019-04-23 00:24:08 -07:00
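One common way to track bound slots is a bitmask with one bit per slot, so that only set bits need to be visited. The sketch below assumes that approach and at most 32 slots; the function name is hypothetical and the commit does not state how iris stores this information.

```c
#include <stdint.h>

/* Hypothetical helper: write the indices of all set bits in bound_mask to
 * slots_out (which must have room for 32 entries) and return how many
 * slots are bound. Only bound slots are visited. */
static unsigned collect_bound_slots(uint32_t bound_mask, unsigned *slots_out)
{
    unsigned count = 0;

    while (bound_mask != 0) {
        unsigned slot = 0;

        /* Find the lowest set bit; the loop ends because the mask is non-zero. */
        while ((bound_mask & (1u << slot)) == 0)
            slot++;

        slots_out[count++] = slot;
        bound_mask &= bound_mask - 1; /* clear the lowest set bit */
    }

    return count;
}
```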
Kenneth Graunke	4d12236072	iris: Mark constants dirty on transfer unmap even if no flushes occur I have various conditions in place to try and avoid unnecessary PIPE_CONTROL flushes, especially to batches which may have never used the buffer being mapped. But if we do a CPU map to a bound constant buffer, we still need to mark push constants dirty, even if there's nothing happening in batches that would warrant a flush. Fixes obvious misrendering in the "XCOM 2: War of the Chosen" menus (lots of rainbow colored triangles). Fixes lots of blinking elements in "Shadow of Mordor". Fixes missing crowd rendering in "DiRT Rally".	2019-04-23 00:24:08 -07:00
Lionel Landwerlin	b1ba7ffdbd	intel: workaround VS fixed function issue on Gen9 GT1 parts The issue is noticeable in the dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d test where a triangle goes missing when we use the maximum number of URB entries as specified by the documentation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107505 Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-23 13:41:20 +08:00
Matt Turner	4ec258ac3c	intel/compiler: Improve fix_3src_operand() Allow ATTR and IMM sources unconditionally (ATTR are just GRFs, IMM will be handled by opt_combine_constants()). Both are already allowed by opt_copy_propagation(). Also allow FIXED_GRF if the regioning is 8,8,1. Could also allow other stride=1 regions (e.g., 4,4,1) and scalar regions but I don't think those occur. This is sufficient to allow a pass added in a future commit (fs_visitor::lower_linterp) to avoid emitting extra MOV instructions. I removed the 'src.stride > 1' case because it seems wrong: 3-src instructions on Gen6-9 are align16-only and can only do stride=1 or stride=0. A run through Jenkins with an assert(src.stride <= 1) never triggers, so it seems that it was dead code. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-22 16:54:31 -07:00
Matt Turner	8aae7a3998	intel/compiler: Add unit tests for sat prop for different exec sizes The two new unit tests verify that propagating a saturate between instructions of different exec sizes does not happen. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-22 16:54:21 -07:00
Matt Turner	54d4d34b96	intel/compiler: Use SIMD16 instructions in fs saturate prop unit test Will allow us to test that propagation between instructions of different exec sizes does not happen (in the next commit). The stray-looking change in intervening_dest_write is to adjust the size of the texture result to keep the test functioning identically when the instructions' exec sizes are doubled. Without the change, the texture does not overwrite the destination fully as the unit test intends. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-04-22 16:54:17 -07:00
Rafael Antognolli	70e03e220c	intel/fs: Remove fs_generator::generate_linterp from gen11+. We now have a lowering pass that will do this at the fs_visitor level, so we can remove this code from gen11+. v2: Reduce size of the "i" array from 4 to 2 (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-22 16:54:00 -07:00
Rafael Antognolli	9ea90aae1e	intel/fs: Add a lowering pass for linear interpolation. On gen11, instead of using a PLN instruction, we convert FS_OPCODE_LINTERP to 2 or 4 multiply adds. That is done in the fs_generator code. This patch adds a lowering pass that does the same thing at the fs_visitor. It also drops the usage of NF types, since we don't need the extra precision and it lets us skip the accumulator. With all that, some optimizations will still be run on the generated code, and we should get better scheduling. v2: Update comment about saturation and conditional mod (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-22 16:54:00 -07:00
Rafael Antognolli	c0504569ea	intel/fs: Move the scalar-region conversion to the generator. Move the scalar-region conversion from the IR to the generator, so it doesn't affect the Gen11 path. We need the non-scalar regioning for a later lowering pass that we are adding. v2: Better commit message (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-22 16:54:00 -07:00
Rafael Antognolli	0778748eba	intel/fs: Only propagate saturation if exec_size is the same. Otherwise it could propagate the saturation from a SIMD16 instruction into a SIMD8 instruction. With that, only part of the destination register, which is the source of the move with saturation, would have been updated. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-22 16:53:55 -07:00
Kenneth Graunke	087f92c59a	i965: Tidy bogus indentation left by previous commit I left code indented one level too far in the previous commit to make the diff easier to review. Drop that extra level now. Fixes: `6981069fc8` i965: Ignore uniform storage for samplers or images, use binding info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-22 15:41:56 -07:00
Kenneth Graunke	6981069fc8	i965: Ignore uniform storage for samplers or images, use binding info gl_nir_lower_samplers_as_deref creates new top level sampler and image uniforms which have been split from structure uniforms. i965 assumed that it could walk through gl_uniform_storage slots by starting at var->data.location and walking forward based on a simple slot count. This assumed that structure types were walked in a particular order. With samplers and images split out of structures, it becomes impossible to assign meaningful locations. Consider: struct S { sampler2D a; sampler2D b; } s[2]; The gl_uniform_storage locations for these follow this map: 0 => a[0], 1 => b[0], 2 => a[1], 3 => b[1]. But the new split variables look like: sampler2D lowered_a[2]; sampler2D lowered_b[2]; and there is no way to know that there's effectively a stride to get to the location for successive elements of a[] or b[]. So, working with location becomes effectively impossible. Ultimately, the point of looking at uniform storage was to pull out the bindings from the opaque index fields. gl_nir_lower_samplers_as_derefs can obtain this information while doing the splitting, however, and sets up var->data.binding to have the desired values. We move gl_nir_lower_samplers before brw_nir_lower_image_load_store so gl_nir_lower_samplers_as_derefs has the opportunity to set proper image bindings. Then, we make the uniform handling code skip sampler(-array) variables, and handle image param setup based on var->data.binding. Fixes Piglit tests/spec/glsl-1.10/execution/samplers/uniform-struct, this time without regressing dEQP-GLES2.functional.uniform_api.random.3. Fixes: `f003859f97` nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-22 15:39:55 -07:00
Kenneth Graunke	47303b466c	Revert "glsl: Set location on structure-split sampler uniform variables" This reverts commit `9e0c744f07`, which regressed dEQP-GLES2.functional.uniform_api.random.3. It turns out that the newly produced location is meaningless and impossible to consume by drivers that want to look at gl_uniform_storage, so it's probably better to leave it unset (0) than a number that looks usable. Leave a tombstone^Wcomment to discourage the next person from making the obvious looking fix. See the next commit for a longer description of the problem. This breaks tests/spec/glsl-1.10/execution/samplers/uniform-struct on i965, which was originally fixed by the revert. The next commit will fix it again. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-22 15:39:55 -07:00
Marek Olšák	b58e5fb6f3	radeonsi: use CP DMA for the null const buffer clear on CIK This is a workaround for a thread deadlock that I have no idea why it occurs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108879 Fixes: `9b331e462e` Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-22 16:05:52 -04:00
Danylo Piliaiev	f280c36c08	drirc: Add workaround for Epic Games Launcher Epic Games Launcher could be launched in opengl mode with "-opengl" option. It creates 4.4 opengl core context however it uses deprecated functionality e.g. default vertex buffer object. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110462 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-22 16:04:19 -04:00
Kenneth Graunke	1566054459	iris: Track bound and writable SSBOs Marek recently extended pipe->set_shader_buffers() to take an extra writable_bitmask parameter, indicating which SSBOs are writable (some may be bound read-only). We can use this to decide whether to set EXEC_OBJECT_WRITE when pinning. Avoiding the write flag can save us some cross-batch flushing if the SSBO is used for reading in both the render and compute engines.	2019-04-22 11:31:14 -07:00
Chia-I Wu	e9c5e13344	virgl: clear vertex_array_dirty Clear vertex_array_dirty after the state is emitted. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-22 10:19:47 -07:00
Lubomir Rintel	e983a975c6	gallivm: disable NEON instructions if they are not supported The LLVM project made some questionable decisions about defaults for armv7 (e.g. they enable NEON that is not there on NVIDIA and Marvell platforms). On top of that, getHostCPUFeatures() doesn't disable missing machine attributes. Finally, -neon alone is not sufficient to disable emission of NEON instructions. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-22 09:47:49 -07:00
Lubomir Rintel	bc6bfc861f	gallivm: guess CPU features also on ARM getHostCPUFeatures() is also available on ARM, for even longer time than for x86. Use it -- it potentially enables instructions that may speed things up. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Cc: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/518 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-22 09:47:39 -07:00
Kenneth Graunke	36478b9f77	iris: Enable the dual_color_blend_by_location driconf option. This fixes rendering in Unigine Valley 1.0 and Heaven 4.0.	2019-04-22 09:36:36 -07:00
Kenneth Graunke	faa52e328e	iris: Add mechanism for iris-specific driconf options Based on Nicolai's `0f8c5de869`. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-22 09:35:36 -07:00
Jason Ekstrand	ccb25aaeaf	nir: Use the NIR_SRC_AS_ macro to define nir_src_as_deref We have a macro for this now; no reason to hand-roll it for derefs. While we're here, move the NIR_DEFINE_CAST for derefs down to where all the other ones are. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-22 15:23:24 +00:00
Jason Ekstrand	2314db10bf	anv,radv: Update release notes for newly implemented extensions A lot has happened in those two drivers since the 19.0 release and we keep forgetting to update release notes. Time to bring everything up to date again before 19.1 gets released. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-22 14:47:23 +00:00
Samuel Pitoiset	b3e3440c87	radv: add VK_NV_compute_shader_derivatives support Only computeDerivativeGroupLinear is supported for now. All crucible tests pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-22 14:51:57 +02:00
Ian Romanick	a6ccc4c0c8	intel/fs: Add support for float16 to the fsign optimizations Commit `ad98fbc217` ("intel/fs: Refactor code generation for nir_op_fsign to its own function") criss-crossed with `c2b8fb9a81` ("anv/device: expose VK_KHR_shader_float16_int8 in gen8+"), and I was not paying enough attention when I rebased. This adds back the float16 changes and enables the optimization. v2: Incorporate more changes from `19cd2f5deb` and `a8d8b1a139` that I missed in the previous version. Fixes: `ad98fbc217` ("intel/fs: Refactor code generation for nir_op_fsign to its own function") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110474 Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2019-04-20 20:49:34 -07:00
Icenowy Zheng	3e91c7d544	lima: add Android build Currently only meson build support is added for lima driver. Add Android build support for lima. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Acked-by: Qiang Yu <yuq825@gmail.com>	2019-04-21 01:05:19 +00:00
Andre Heider	8b13aac966	st/nine: skip position checks in SetCursorPosition() For HW cursors, "cursor.pos" doesn't hold the current position of the pointer, just the position of the last call to SetCursorPosition(). Skip the check against stale values and bump the d3dadapter9 drm version to expose this change of behaviour. Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2019-04-20 13:06:29 +02:00
Jason Ekstrand	828ec41154	anv: Rework the descriptor set layout create loop Previously, we were storing the per-binding create info pointer in the immutable_samplers field temporarily so that we can switch the order in which we walk the loop. However, now that we have multiple arrays of structs to walk, it makes more sense to store an index of some sort. Because we want to leave immutable_samplers as NULL for undefined bindings, we store index + 1 and then subtract one later. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 23:26:41 +00:00
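The "store index + 1, subtract one later" trick described in 828ec41154 can be sketched as follows. This is a minimal illustration under stated assumptions, not the anv code: the helper names `encode_binding_index` and `decode_binding_index` are hypothetical, and the pointer field stands in for `immutable_samplers`. The point is that index 0 must not collide with NULL, which marks an undefined binding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stash a create-info index in a pointer-sized field.
 * Storing i + 1 keeps NULL free to mean "binding not defined". */
static void *
encode_binding_index(uint32_t i)
{
    assert(i < UINT32_MAX);            /* i + 1 must not wrap to 0 */
    return (void *)(uintptr_t)(i + 1);
}

/* Hypothetical helper: recover the index; the caller checks for NULL first. */
static uint32_t
decode_binding_index(void *field)
{
    assert(field != NULL);             /* NULL means undefined binding */
    return (uint32_t)(uintptr_t)field - 1;
}
```

A caller would test the field against NULL before decoding, so undefined bindings are skipped rather than mistaken for index 0.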
Jason Ekstrand	2b388c3d04	anv: Ignore descriptor binding flags if bindingCount == 0 I missed this on the first go round. The bindingCount field of VkDescriptorSetLayoutBindingFlagsCreateInfoEXT is allowed to be zero which means the flags array is ignored. Fixes: `d6c9bd6e01` "anv: Put binding flags in descriptor set layouts" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 23:26:41 +00:00
Alyssa Rosenzweig	648cda258b	panfrost/mdg: Use shared fsign lowering Fixes failures in shaders.operator.common_functions.sign.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-19 23:15:57 +00:00
Alyssa Rosenzweig	31d9caa239	panfrost: Fixup vertex offsets to prevent shadow copy Mali attribute buffers have to be 64-byte aligned. However, Gallium enforces no such requirement; for unaligned buffers, we were previously forced to create a shadow copy (slow!). To prevent this, we instead use the offseted buffer's address with the lower bits masked off, and then add those masked off bits to the src_offset. Proof of correctness included, possibly for the opportunity to say "QED" unironically. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-19 22:50:20 +00:00
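The offset fixup described in 31d9caa239 can be sketched in a few lines of C. This is an illustration only, assuming the 64-byte alignment stated in the message; the struct and function names (`aligned_attr`, `align_attribute`) are hypothetical and do not come from the panfrost source. The invariant is that base plus offset addresses the same byte before and after the fixup.

```c
#include <assert.h>
#include <stdint.h>

#define ATTR_ALIGN 64u  /* alignment required for Mali attribute buffers */

/* Hypothetical result type: an aligned base plus a compensated offset. */
struct aligned_attr {
    uint64_t base;        /* buffer address with the low bits masked off */
    uint32_t src_offset;  /* original offset plus the masked-off bits */
};

static struct aligned_attr
align_attribute(uint64_t addr, uint32_t src_offset)
{
    struct aligned_attr out;
    uint64_t low = addr & (ATTR_ALIGN - 1);  /* bits that break alignment */

    out.base = addr - low;                   /* same as addr & ~63 */
    out.src_offset = src_offset + (uint32_t)low;

    /* The effective address is unchanged, so no shadow copy is needed. */
    assert(out.base + out.src_offset == addr + src_offset);
    return out;
}
```

For example, an address of 0x1234 with a source offset of 8 becomes base 0x1200 with offset 0x3c.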
Alyssa Rosenzweig	e008d4f011	panfrost: Track BO lifetime with jobs and reference counts This (fairly large) patch continues work surrounding the panfrost_job abstraction to improve job lifetime management. In particular, we add infrastructure to track which BOs are used by a particular job (currently limited to the vertex buffer BOs), to reference count these BOs, and to automatically manage the BOs memory based on the reference count. This set of changes serves as a code cleanup, as a way of future proofing for allowing flushing BOs, and immediately as a bugfix to workaround the missing reference counting for vertex buffer BOs. Meanwhile, there are a few cleanups to vertex buffer handling code itself, so in the short-term, this allows us to remove the costly VBO staging workaround, since this patch addresses the underlying causes. v2: Use pipe_reference for BO reference counting, rather than managing it ourselves. Don't duplicate hash-table key removal. Fix vertex buffer counting. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-19 22:50:20 +00:00
Andres Gomez	a151500dd1	docs/relnotes: add support for VK_KHR_shader_float16_int8 v2: radv also supports it now (Samuel Pitoiset). Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-20 00:29:16 +02:00
Jason Ekstrand	9ce7c29724	anv/nir: Add a central helper for figuring out SSBO address formats Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	470422870a	nir: Add helpers for getting the type of an address format Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	6e230d7607	anv: Implement VK_EXT_descriptor_indexing Now that everything is in place to do bindless for all resource types except input attachments and UBOs, VK_EXT_descriptor_indexing is "trivial". Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	d6c9bd6e01	anv: Put binding flags in descriptor set layouts Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	c0d9926df7	anv: Use bindless handles for images Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	83af92e593	intel/fs: Add support for bindless image load/store/atomic Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	e6803f6b6f	anv: Use bindless textures and samplers This commit changes anv to put bindless handles and sampler pointers into the descriptor buffer and use those instead of bindful when we run out of binding table space. This "spilling" of descriptors allows us to advertise an almost unbounded number of images and samplers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	bf61f057f7	anv: Pass the plane into lower_tex_deref Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	f16fcb9db7	anv: Use write_image_view to initialize immutable samplers Instead of setting it manually, call the helper. When setting descriptor sets becomes more complicated than just setting some struct values, this will keep immutable sampler handling correct. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	e612c3b9bf	anv: Count the number of planes in each descriptor binding Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	843286d324	intel/fs: Add support for bindless texture ops We add two new texture sources for bindless surface and sampler handles. Bindless surface handles are expected to be pre-shifted so that the 20-bit surface state table index is in the top 20 bits of the 32-bit handle. This lets us avoid any extra shifts in the shader. Bindless sampler handles are 32-byte aligned byte offsets from general state base address. We use 32-byte aligned instead of 16-byte aligned to avoid having to use more indirect messages than needed. It means we can't tightly pack samplers but that's probably not a big deal. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
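The handle layout described in 843286d324 can be illustrated with a short C sketch. It assumes only what the message states: a 20-bit surface state table index placed in the top 20 bits of a 32-bit handle, and sampler handles that are 32-byte aligned byte offsets. The function names are hypothetical and are not the driver's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pre-shift a 20-bit surface state index into the
 * top 20 bits of a 32-bit handle, so the shader needs no extra shift. */
static uint32_t
pack_bindless_surface_handle(uint32_t surface_index)
{
    assert(surface_index < (1u << 20));
    return surface_index << 12;
}

/* Hypothetical helper: a bindless sampler handle is a byte offset that
 * must be 32-byte aligned. */
static bool
sampler_handle_is_valid(uint32_t byte_offset)
{
    return (byte_offset & 31u) == 0;
}
```

Index 1 packs to 0x00001000 and the largest index, 0xFFFFF, packs to 0xFFFFF000.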
Jason Ekstrand	2edf29b933	intel,nir: Lower TXD with a bindless sampler When we have a bindless sampler, we need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	bd56ce8ce5	anv: Implement VK_KHR_shader_atomic_int64 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	79fb0d27f3	anv: Implement SSBOs bindings with GPU addresses in the descriptor BO This commit adds a new way for ANV to do SSBO bindings by just passing a GPU address in through the descriptor buffer and using the A64 messages to access the GPU address directly. This means that our variable pointers are now "real" pointers instead of a vec2(BTI, offset) pair. This carries a few advantages: 1. It lets us support a virtually unbounded number of SSBO bindings. 2. It lets us implement VK_KHR_shader_atomic_int64 which we couldn't implement before because those atomic messages are only available in the bindless A64 form. 3. It's way better than messing around with bindless handles for SSBOs which is the only other option for VK_EXT_descriptor_indexing. 4. It's more future looking, maybe? At the least, this is what NVIDIA does (they don't have binding based SSBOs at all). This doesn't a priori mean it's better, it just means it's probably not terrible. The big disadvantage, of course, is that we have to start doing our own bounds checking for robustBufferAccess again and have to push in dynamic offsets. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	3cf78ec2bd	anv: Lower some SSBO operations in apply_pipeline_layout In order to avoid the potential overhead of A64 operations on all SSBO ops, we look for those SSBO ops where we can get to the descriptor set from the SSBO access operation and lower those to a binding-table approach. When robustBufferAccess is enabled, this lets the hardware do the bounds checking for us. It also avoids some potentially expensive 64-bit integer calculations. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	e7a1e8f735	anv: Add a has_a64_buffer_access to anv_physical_device This is more descriptive and a bit nicer than checking for gen >= 8 && use_softpin everywhere. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	b1a633d9fb	intel/nir: Re-run int64 lowering in postprocess_nir We're about to start doing 64-bit pointer calculations in ANV. They will get applied after brw_preprocess_nir which is where we currently do 64-bit integer arithmetic lowering. Because we're adding 64-bit integer arithmetic after the initial lowering has happened, we need to lower again. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	995dc4e5c3	nir/lower_io: Expose some explicit I/O lowering helpers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	146deec9ef	anv/pipeline: Add skeleton support for spilling to bindless If the number of surfaces or samplers exceeds what we can put in a table, we will want to spill out to bindless. There is no bindless support yet but this gets us the basic framework that will be used by later commits. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	a7d4871846	anv/pipeline: Sort bindings by most used first This commit just sorts the bindings by how often they're used vs the array size of the binding. This will let us make more nuanced decisions about what goes in the binding table vs. what to make bindless. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	a5a0dc08f1	anv: Add a #define for the max binding table size This also fixes a bug where we mis-calculate maximum binding table sizes and may return true in vkGetDescriptorSetLayoutSupport even for sets too large to fit in a binding table. Fixes: `ddc4069122` "anv: Implement VK_KHR_maintenance3" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	3b755b52e8	anv: Put image params in the descriptor set buffer on gen8 and earlier This is really where they belong; not push constants. The one downside here is that we can't push them anymore for compute shaders. However, that's a general problem and we should figure out how to push descriptor sets for compute shaders. This lets us bump MAX_IMAGES to 64 on BDW and earlier platforms because we no longer have to worry about push constant overhead limits. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	83b943cc2f	anv: Make all VkDeviceMemory BOs resident permanently We spend a lot of time in the driver adding things to hash sets to track residency. The reality is that a properly built Vulkan app uses large memory objects and sub-allocates from them. In a typical frame, most of if not all of those allocations are going to be resident for the entire frame so we're really not saving ourselves much by tracking fine-grained residency. Just throwing everything in the validation list does make it a little bit more expensive inside the kernel to walk the list and ensure that all our VA is in order. However, without relocations, the overhead of that is pretty small. If we ever do run into a memory pressure situation where the fine- grained residency could even potentially help, we would likely be swapping one page out to make room for another within the draw call and performance is totally lost at that point. We're better off swapping out other apps and just letting ours run a whole frame. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Rob Clark	a9241edfa3	freedreno/ir3: fix const assert Fixes: `fe8c57e859` freedreno/ir3: use nir_src_as_uint in a few places Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-19 12:36:06 -07:00
Kristian H. Kristensen	bcb81b4d48	gallium/auxiliary/vl: Fix a couple of warnings Remove unused functions and mark unhandled default case with unreachable. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-19 16:17:37 +00:00
Kristian H. Kristensen	0719fc4c31	egl/dri2: Mark potentially unused 'display' variable with MAYBE_UNUSED Sometimes there is no X11 platform. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-19 16:17:37 +00:00
Kristian H. Kristensen	b5a3567b51	ralloc: Fully qualify non-virtual destructor call This suppresses warning about calling a non-virtual destructor in a non-final class with virtual functions: src/compiler/glsl/ast.h:53:4: warning: destructor called on non-final 'ast_node' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor] DECLARE_LINEAR_ZALLOC_CXX_OPERATORS(ast_node); Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-19 16:17:37 +00:00
Kristian H. Kristensen	41593f3c37	nir_opcodes.py: Saturate to expression that doesn't overflow Compiler warns about overflow when assigning UINT64_MAX to something smaller than a uint64_t: src/compiler/nir/nir_constant_expressions.c:16909:50: warning: implicit conversion from 'unsigned long long' to 'uint1_t' (aka 'unsigned char') changes value from 18446744073709551615 to 255 [-Wconstant-conversion] uint1_t dst = (src0 + src1) < src0 ? UINT64_MAX : (src0 + src1); ~~~ ^~~~~~~~~~ Shift UINT64_MAX down to the appropriate maximum value for the type being assigned to. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-19 16:17:37 +00:00
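The idea in 41593f3c37 of shifting UINT64_MAX down to the maximum of the destination type can be sketched as below. This is a standalone illustration, not the generated code in nir_constant_expressions.c: the names `type_max` and `uadd_sat` and the explicit `bits` parameter are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Largest unsigned value representable in `bits` bits (1..64), obtained
 * by shifting UINT64_MAX down rather than truncating it implicitly. */
static uint64_t
type_max(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return UINT64_MAX >> (64 - bits);
}

/* Saturating unsigned add for a `bits`-wide type held in a uint64_t. */
static uint64_t
uadd_sat(uint64_t a, uint64_t b, unsigned bits)
{
    uint64_t max = type_max(bits);
    assert(a <= max && b <= max);
    /* b > max - a is the overflow test, written so it cannot wrap. */
    return (b > max - a) ? max : a + b;
}
```

With an 8-bit type, adding 200 and 100 saturates to 255, and no constant wider than the destination type is ever assigned to it.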
Kristian H. Kristensen	15605cc9d4	glsl_to_nir: Initialize debug variable If we want to assert on found == true when the loop exits early, we need to initialize it to false. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-19 16:17:37 +00:00
Kristian H. Kristensen	3ecfe20648	tgsi: Mark tgsi_strings_check() unused It's there to hold the static asserts, don't warn about it being unused. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-04-19 16:17:37 +00:00
Lionel Landwerlin	0d46e40467	anv: limit URB reconfigurations when using blorp If the last graphics pipeline bound to the command buffer has enough space in its VS URB entries for Blorp then avoid reconfiguring the URB partitions. v2: s/0/MESA_SHADER_VERTEX/ (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 16:58:06 +01:00
Lionel Landwerlin	84e70556fb	intel/devinfo: add basic sanity tests on device database v2: #undef NDEBUG (Eric) Use inc_include & inc_src (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-04-19 15:56:21 +00:00
Lionel Landwerlin	773e6aa9fd	intel/devinfo: fix missing num_thread_per_eu on ICL There was an assumption that num_thread_per_eu would be set in the Gen8 features. Since this is mostly the same for all gen8->11 (except GEN9_LP that overwrites it) let's just factor it out. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-04-19 15:56:21 +00:00
Eric Anholt	38c75aff4c	nir: Use the nir_builder _imm helpers in setting up deref offsets. When looking at the dEQP nested_struct_array_dynamic_index_fragment code after lowering, I was horrified at the amount of adding and multiplying by 0 we were doing. The builder _imm helpers handle that for you so that the following optimization passes have less work to do. Plus, it's easier to read. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-19 08:45:14 -07:00
Eric Anholt	9ac5ec2f90	nir: Fix deref offset calculation for structs. We were calculating the offset for the field within the struct, and just dropping it on the floor. Fixes a regression in KHR-GLES3.shaders.struct.local.nested_struct_array_dynamic_index_fragment and a few of its friends since the scratch lowering commit. Fixes: `e8e159e9df` ("nir/deref: Add helpers for getting offsets") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-19 08:45:14 -07:00
Erico Nunes	2288b59ddc	lima: enable nir fsign lowering in ppir The mali utgard pp doesn't support a sign instruction. Use the nir lowering function for fsign to implement fsign in ppir. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-19 15:42:23 +00:00
Erico Nunes	4577eb7b7c	nir/algebraic: add lowering for fsign The mali utgard pp doesn't support a sign instruction. In the ARM offline shader compiler, the sign function is implemented using sub(gt(0.0, a), lt(0.0, a)). This is a generic optimization, so implement it in the nir level when lower_fsign is set, alongside the lowering for isign. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-19 15:42:23 +00:00
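The comparison-based sign lowering mentioned in 4577eb7b7c can be sketched in plain C. This is an illustration of the general technique, not the NIR algebraic rule itself: the helper names are hypothetical, and the operand order below is chosen so that positive inputs give +1, which may not match the operand convention of the `sub(gt(...), lt(...))` form quoted in the message.

```c
#include <assert.h>

/* Convert a comparison result to 0.0 or 1.0, like a bool-to-float op. */
static float
b2f(int cond)
{
    return cond ? 1.0f : 0.0f;
}

/* sign(a) built from two comparisons and a subtraction:
 * +1 for a > 0, -1 for a < 0, and 0 for zero. */
static float
lowered_fsign(float a)
{
    return b2f(a > 0.0f) - b2f(a < 0.0f);
}

/* Small self-check of the three cases. */
void
lowered_fsign_self_check(void)
{
    assert(lowered_fsign(3.5f) == 1.0f);
    assert(lowered_fsign(-2.0f) == -1.0f);
    assert(lowered_fsign(0.0f) == 0.0f);
}
```

Because both comparisons are false for NaN, this form returns 0 for NaN inputs; whether that matches a given target's `fsign` semantics is not something the commit message states.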
Brian Paul	f9c594cdf5	docs: s/Aptril/April/ Found by Manuel Huber. Trivial.	2019-04-19 08:30:27 -06:00
Erico Nunes	56230f0428	lima/ppir: support ppir_op_ceil Add a few missing ppir_op_ceil enum handling entries to implement nir_op_fceil in lima ppir. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-19 10:22:03 +00:00
Bas Nieuwenhuizen	8d2654a419	radv: Support VK_EXT_inline_uniform_block. Basically just reserve the memory in the descriptor sets. On the shader side we construct a buffer descriptor, since AFAIU VGPR indexing on 32-bit pointers in LLVM is still broken. This fully supports update after bind and variable descriptor set sizes. However, the limits are somewhat arbitrary and are mostly about finding a reasonable division of a 2 GiB max memory size over the set. v2: - rebased on top of master (Samuel) - remove the loading resources rework (Samuel) - only load UBO descriptors if it's a pointer (Samuel) - use LLVMBuildPtrToInt to avoid IR failures (Samuel) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2)	2019-04-19 09:21:47 +02:00
Samuel Pitoiset	2b515a8259	ac/nir: use the new raw/struct SSBO atomic intrinsics for comp_swap This is actually fixed now. This change requires LLVM r358579. Make sure to have it in your tree, otherwise the following piglit will hang: tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-19 09:20:15 +02:00
Samuel Pitoiset	895e10d2db	ac/nir: only use the new raw/struct SSBO atomic intrinsics with LLVM 9+ They are buggy with older LLVM versions, see r358579. Fixes: `78c551aca1` ("ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-19 09:20:13 +02:00
Samuel Pitoiset	31164cf5f7	ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+ They are buggy with LLVM 8 because they weren't marked as source of divergence, see r358579. Fixes: `dd0172e865` ("radv: Use structured intrinsics instead of indexing workaround for GFX9.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-19 09:20:09 +02:00
Kenneth Graunke	a913fbf124	iris: Be less aggressive at postdraw work skipping We empty the cache sets when flushing the batch, at which point we need to add any framebuffer related BOs even though the bindings haven't changed. So, we now do the cache set tracking unconditionally. For now, we continue skipping resolve work based on the same conditions in the predraw functions - the thinking is if we didn't trigger resolves, there's nothing to update here. Time will tell if this works. Partly reverts commit `365886ebe1`, and fixes Unigine Valley rendering on Gen9+. Drops drawoverhead scores by about 10-12%. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110353	2019-04-18 18:51:58 -07:00
Jason Ekstrand	cd4ffb376f	intel/fs: Account for live range lengths in spill costs The current register allocator has a concept of "spill benefit" which is based on the number of nodes with which a given node interferes. The idea is that you want to spill stuff with high interference because those are the most likely registers to help when spilling. However, this fails to take into account the length of the live range so the allocator frequently picks "cheap" (not many uses) registers which are actually very short lived and so spilling them doesn't help with the pressure situation. This commit takes into account the length of the live range to make long-lived registers more likely to get spilled than short-lived ones. This encourages the spill chooser to choose slightly larger registers which will affect a larger area of the program and hopefully we have to spill fewer of them to get the same reduction in over-all register pressure. Shader-db results on Kaby Lake: total spills in shared programs: 23664 -> 12050 (-49.08%) spills in affected programs: 19243 -> 7629 (-60.35%) helped: 296 HURT: 8 total fills in shared programs: 32028 -> 25139 (-21.51%) fills in affected programs: 20378 -> 13489 (-33.81%) helped: 295 HURT: 16 Of course, most of that is in Deus Ex... Shader-db results on Kaby Lake (without Deus Ex): total spills in shared programs: 6479 -> 5834 (-9.96%) spills in affected programs: 3231 -> 2586 (-19.96%) helped: 40 HURT: 4 total fills in shared programs: 17165 -> 17099 (-0.38%) fills in affected programs: 6951 -> 6885 (-0.95%) helped: 40 HURT: 7 Even without Deus Ex, the spill help is pretty respectable. The worst hurt shaders were one compute shader in Aztec Ruins and one fragment shader in KSP that were each hurt by around 13% fill 9% spill. 
VkPipeline-db results on Kaby Lake: total spills in shared programs: 9149 -> 8069 (-11.80%) spills in affected programs: 5197 -> 4117 (-20.78%) helped: 27 HURT: 16 total fills in shared programs: 26390 -> 25477 (-3.46%) fills in affected programs: 12662 -> 11749 (-7.21%) helped: 24 HURT: 22 The Vulkan results were decidedly more mixed but we don't have nearly as many apps in that database yet. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 23:04:45 +00:00
Gurchetan Singh	1fd635862f	virgl/vtest: bump up protocol version + support encoded transfers This more accurately reflects what the drm winsys does. Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:39:23 -07:00
Gurchetan Singh	b5698562e4	virgl/vtest: wait after issuing a transfer get Otherwise, there are artifacts when running Unigine Valley with protocol version 2. We can get away with not waiting for most buffers, but let's be conservative. Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:39:18 -07:00
Gurchetan Singh	581ab2bc70	virgl/vtest: modify sending and receiving data for shared memory We need to copy the shared memory region to the display target. Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:39:12 -07:00
Gurchetan Singh	96c3418e06	virgl/vtest: receive and handle shared memory fd The only tricky part is with protocol 0 we can either have a display target or resource backing store. With protocol 2 we can have both. Make the map/unmap functions only deal with the resource backing store. v2: Handle MSAA texture case. v3: spelling v4: Fix dangling else (@prak) v5: mmap --> os_mmap (@prak) + added comments (@gerddie) Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:39:05 -07:00
Gurchetan Singh	9a638bc7c2	virgl/vtest: plumb support for shared memory Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:38:58 -07:00
Gurchetan Singh	9881733e32	virgl/vtest: add utilities for receiving fds v2: recieve --> receive (airlied@) Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:38:52 -07:00
Gurchetan Singh	0dd661777a	virgl/vtest: execute a transfer_get when flushing the front buffer This just moves everything to a helper function -- "flush_front_buffer" will be used later. virgl_vtest_resource_map / virgl_vtest_resource_unmap already take care to map the display target. Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:38:44 -07:00
Gurchetan Singh	599d55371c	virgl: wait after a flush We really need to wait under certain circumstances, or we can end up writing to memory at the same time the host is reading. Partial revert of d6dc68 ("virgl: use uint16_t mask instead of separate booleans"). Test cases: - dEQP-GLES31.functional.texture.texture_buffer.render_modify.as_vertex_array.bufferdata on vtest protocol version 2 - Flickering during Alien Isolation Fixes: d6dc68 ("virgl: use uint16_t mask instead of separate booleans") Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Piotr Rak <p.rak@samsung.com>	2019-04-18 15:38:04 -07:00
Lionel Landwerlin	dfd79079da	anv: fix uninitialized pthread cond clock domain Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `843775bab7` ("anv: Rework fences") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 23:23:03 +01:00
Kristian H. Kristensen	e731f2648d	.gitignore: Remove autotool artifacts Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-04-18 14:12:43 -07:00
Eric Anholt	12f6c34806	v3d: Fix atomic cmpxchg in shaders on hardware. In what might be my first case of finding a divergence between hardware and simpenrose for v3d 4.x, it seems that despite what the spec claims, you actually need specific values in the TYPE field for atomic ops. Fixes dEQP-GLES31.functional..compswap.	2019-04-18 13:24:55 -07:00
Eric Anholt	1ce143ca19	v3d: Fix an invalid reuse of flags generation from before a thrsw. Noticed while debugging the last GLES 3.1 failure, though it doesn't seem to affect that bug.	2019-04-18 13:24:55 -07:00
Jason Ekstrand	db4a70e678	anv: Drop some unneeded ANV_FROM_HANDLE for physical devices Ever since `48ed2a7bb0`, we've had one at the top of the function. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 20:12:57 +00:00
Jason Ekstrand	981209d175	anv: Re-sort the GetPhysicalDeviceFeatures2 switch statement Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 20:12:57 +00:00
Marek Olšák	7bc33a5cd5	radeonsi/gfx9: use the correct condition for the DPBB + QUANT_MODE workaround Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-18 15:58:45 -04:00
Ian Romanick	6b97fa9a99	nir/algebraic: Strength reduce some compares of x and -x Converting the x vs -x comparison to an x vs 0 comparison enables cmod propagation to help. This seems to be a win everywhere except Gen7. Skylake and Broadwell had similar results. (Broadwell shown) total instructions in shared programs: 15566733 -> 15566014 (<.01%) instructions in affected programs: 72617 -> 71898 (-0.99%) helped: 302 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.38 x̃: 2 helped stats (rel) min: 0.15% max: 7.69% x̄: 1.28% x̃: 0.98% 95% mean confidence interval for instructions value: -2.55 -2.21 95% mean confidence interval for instructions %-change: -1.40% -1.16% Instructions are helped. total cycles in shared programs: 413014786 -> 413015475 (<.01%) cycles in affected programs: 707594 -> 708283 (0.10%) helped: 227 HURT: 101 helped stats (abs) min: 1 max: 612 x̄: 36.07 x̃: 20 helped stats (rel) min: 0.04% max: 19.39% x̄: 2.25% x̃: 1.49% HURT stats (abs) min: 2 max: 334 x̄: 87.90 x̃: 45 HURT stats (rel) min: 0.07% max: 14.51% x̄: 4.54% x̃: 3.36% 95% mean confidence interval for cycles value: -8.12 12.32 95% mean confidence interval for cycles %-change: -0.67% 0.34% Inconclusive result (value mean confidence interval includes 0). Haswell and Ivy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13828220 -> 13827881 (<.01%) instructions in affected programs: 60887 -> 60548 (-0.56%) helped: 253 HURT: 6 helped stats (abs) min: 1 max: 5 x̄: 1.36 x̃: 1 helped stats (rel) min: 0.16% max: 3.85% x̄: 0.81% x̃: 0.64% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.26% max: 0.89% x̄: 0.47% x̃: 0.27% 95% mean confidence interval for instructions value: -1.39 -1.23 95% mean confidence interval for instructions %-change: -0.85% -0.70% Instructions are helped. 
total cycles in shared programs: 386870095 -> 386894412 (<.01%) cycles in affected programs: 1537307 -> 1561624 (1.58%) helped: 127 HURT: 188 helped stats (abs) min: 1 max: 381 x̄: 17.89 x̃: 4 helped stats (rel) min: 0.02% max: 14.33% x̄: 1.00% x̃: 0.33% HURT stats (abs) min: 2 max: 5585 x̄: 141.43 x̃: 14 HURT stats (rel) min: 0.03% max: 11.50% x̄: 1.65% x̃: 1.06% 95% mean confidence interval for cycles value: 21.95 132.45 95% mean confidence interval for cycles %-change: 0.32% 0.85% Cycles are HURT. Sandy Bridge total instructions in shared programs: 10896339 -> 10896276 (<.01%) instructions in affected programs: 10757 -> 10694 (-0.59%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.29 x̃: 1 helped stats (rel) min: 0.12% max: 1.85% x̄: 0.87% x̃: 0.89% 95% mean confidence interval for instructions value: -1.42 -1.15 95% mean confidence interval for instructions %-change: -1.03% -0.72% Instructions are helped. total cycles in shared programs: 155091003 -> 155090480 (<.01%) cycles in affected programs: 102761 -> 102238 (-0.51%) helped: 51 HURT: 0 helped stats (abs) min: 1 max: 36 x̄: 10.25 x̃: 4 helped stats (rel) min: 0.02% max: 2.57% x̄: 0.76% x̃: 0.36% 95% mean confidence interval for cycles value: -12.98 -7.53 95% mean confidence interval for cycles %-change: -0.97% -0.56% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8234667 -> 8234652 (<.01%) instructions in affected programs: 2063 -> 2048 (-0.73%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.30% max: 1.56% x̄: 0.82% x̃: 0.81% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.97% -0.67% Instructions are helped. 
total cycles in shared programs: 188700906 -> 188700598 (<.01%) cycles in affected programs: 283480 -> 283172 (-0.11%) helped: 83 HURT: 3 helped stats (abs) min: 2 max: 8 x̄: 3.78 x̃: 4 helped stats (rel) min: 0.04% max: 0.55% x̄: 0.15% x̃: 0.12% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.04% 95% mean confidence interval for cycles value: -3.87 -3.29 95% mean confidence interval for cycles %-change: -0.16% -0.12% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
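The equivalence behind the "x vs -x" to "x vs 0" rewrite can be checked with a small sketch. Which comparison operators the commit actually covers is not stated in the message; `<` and `>=` are used here only as examples, and the function names are made up for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Original form: compare x against its own negation. */
static bool lt_neg_self(float x) { return x < -x; }
static bool ge_neg_self(float x) { return x >= -x; }

/* Strength-reduced form: compare x against zero. */
static bool lt_zero(float x) { return x < 0.0f; }
static bool ge_zero(float x) { return x >= 0.0f; }

/* Check that both forms agree, including signed zeros, infinities and NaN. */
static void
check_compare_equivalence(void)
{
   const float values[] = { -2.5f, -0.0f, 0.0f, 3.0f, INFINITY, -INFINITY, NAN };
   for (unsigned i = 0; i < sizeof(values) / sizeof(values[0]); i++) {
      assert(lt_neg_self(values[i]) == lt_zero(values[i]));
      assert(ge_neg_self(values[i]) == ge_zero(values[i]));
   }
}
```

The comparison against a constant zero is the form that conditional-modifier propagation can fold into the instruction producing x, which is where the instruction savings above would come from.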
Ian Romanick	f3d6df719c	nir/algebraic: Fix some 1-bit Boolean weirdness Skylake, Broadwell, and Haswell had similar results. (Skylake shown) total cycles in shared programs: 372594532 -> 372594460 (<.01%) cycles in affected programs: 46854 -> 46782 (-0.15%) helped: 9 HURT: 0 helped stats (abs) min: 2 max: 22 x̄: 8.00 x̃: 2 helped stats (rel) min: 0.02% max: 0.41% x̄: 0.16% x̃: 0.09% 95% mean confidence interval for cycles value: -14.34 -1.66 95% mean confidence interval for cycles %-change: -0.28% -0.04% Cycles are helped. Ivy Bridge total instructions in shared programs: 12038379 -> 12038373 (<.01%) instructions in affected programs: 1278 -> 1272 (-0.47%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.31% max: 0.77% x̄: 0.54% x̃: 0.55% total cycles in shared programs: 180889027 -> 180888997 (<.01%) cycles in affected programs: 29979 -> 29949 (-0.10%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 5 helped stats (rel) min: 0.02% max: 0.34% x̄: 0.11% x̃: 0.07% 95% mean confidence interval for cycles value: -13.40 1.40 95% mean confidence interval for cycles %-change: -0.27% 0.05% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total cycles in shared programs: 155091021 -> 155091003 (<.01%) cycles in affected programs: 8842 -> 8824 (-0.20%) helped: 2 HURT: 0 No changes on Iron Lake or GM45. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Ian Romanick	403aac7500	nir/algebraic: Replace a pattern where iand with a Boolean is used as a bcsel All of the affected shaders are in Mad Max. I noticed this while looking at some other things. I tried a couple similar patterns, but the effect on cycles was generally negative. It may be worth revisiting this later. v2: Rebase on 1-bit Boolean changes. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15282073 -> 15282053 (<.01%) instructions in affected programs: 1192 -> 1172 (-1.68%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.43 x̃: 1 helped stats (rel) min: 1.16% max: 2.17% x̄: 1.65% x̃: 1.39% 95% mean confidence interval for instructions value: -1.73 -1.13 95% mean confidence interval for instructions %-change: -1.91% -1.38% Instructions are helped. total cycles in shared programs: 372595954 -> 372594532 (<.01%) cycles in affected programs: 11477 -> 10055 (-12.39%) helped: 14 HURT: 0 helped stats (abs) min: 76 max: 122 x̄: 101.57 x̃: 104 helped stats (rel) min: 7.76% max: 15.62% x̄: 12.94% x̃: 14.78% 95% mean confidence interval for cycles value: -111.05 -92.09 95% mean confidence interval for cycles %-change: -14.90% -10.98% Cycles are helped. No changes on any Gen6 or earlier platforms. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Ian Romanick	25bfba3335	nir/algebraic: Recognize open-coded copysign(1.0, a) All of the affected shaders are in Mad Max. The inner part of the pattern is itself an open-coded sign(a). I tried using that as a pattern, but the results were not good. A bunch of shaders were helped for instructions, but overall cycles, spill, and fills were hurt. v2: Rebase on 1-bit Boolean changes. v3: Fix order of copysign() parameters in comments and commit message. Noticed by Matt. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15282141 -> 15282073 (<.01%) instructions in affected programs: 6106 -> 6038 (-1.11%) helped: 17 HURT: 0 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.02% max: 2.20% x̄: 1.15% x̃: 1.06% 95% mean confidence interval for instructions value: -4.00 -4.00 95% mean confidence interval for instructions %-change: -1.30% -1.00% Instructions are helped. total cycles in shared programs: 372597886 -> 372595954 (<.01%) cycles in affected programs: 32701 -> 30769 (-5.91%) helped: 17 HURT: 0 helped stats (abs) min: 6 max: 216 x̄: 113.65 x̃: 118 helped stats (rel) min: 0.40% max: 21.86% x̄: 6.20% x̃: 5.83% 95% mean confidence interval for cycles value: -152.84 -74.45 95% mean confidence interval for cycles %-change: -8.89% -3.51% Cycles are helped. No changes on any Gen6 or earlier platforms. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Ian Romanick	1711bf6cf2	intel/fs: Generate better code for fsign multiplied by a value v2: Rebase on v2 changes in previous two commits. v3: Rebase on `85c35885b3` ("nir: Rework nir_src_as_alu_instr to not take a pointer"). shader-db results: Skylake and Broadwell had similar results. (Skylake shown) total instructions in shared programs: 15297100 -> 15282141 (-0.10%) instructions in affected programs: 956685 -> 941726 (-1.56%) helped: 4527 HURT: 0 helped stats (abs) min: 1 max: 221 x̄: 3.30 x̃: 2 helped stats (rel) min: 0.07% max: 10.53% x̄: 1.85% x̃: 1.37% 95% mean confidence interval for instructions value: -3.48 -3.12 95% mean confidence interval for instructions %-change: -1.88% -1.81% Instructions are helped. total cycles in shared programs: 372809551 -> 372597886 (-0.06%) cycles in affected programs: 13645512 -> 13433847 (-1.55%) helped: 4362 HURT: 125 helped stats (abs) min: 1 max: 2088 x̄: 50.73 x̃: 28 helped stats (rel) min: 0.01% max: 28.20% x̄: 2.77% x̃: 2.39% HURT stats (abs) min: 1 max: 1836 x̄: 76.90 x̃: 28 HURT stats (rel) min: <.01% max: 34.36% x̄: 3.03% x̃: 1.42% 95% mean confidence interval for cycles value: -50.98 -43.37 95% mean confidence interval for cycles %-change: -2.67% -2.55% Cycles are helped. total spills in shared programs: 23465 -> 23463 (<.01%) spills in affected programs: 42 -> 40 (-4.76%) helped: 1 HURT: 0 total fills in shared programs: 31766 -> 31763 (<.01%) fills in affected programs: 69 -> 66 (-4.35%) helped: 1 HURT: 0 Haswell total instructions in shared programs: 13839992 -> 13828311 (-0.08%) instructions in affected programs: 712503 -> 700822 (-1.64%) helped: 3477 HURT: 0 helped stats (abs) min: 1 max: 221 x̄: 3.36 x̃: 2 helped stats (rel) min: 0.07% max: 10.64% x̄: 1.96% x̃: 1.52% 95% mean confidence interval for instructions value: -3.58 -3.14 95% mean confidence interval for instructions %-change: -2.01% -1.92% Instructions are helped. 
total cycles in shared programs: 387026330 -> 386872483 (-0.04%) cycles in affected programs: 11329966 -> 11176119 (-1.36%) helped: 3307 HURT: 139 helped stats (abs) min: 2 max: 1776 x̄: 49.58 x̃: 18 helped stats (rel) min: 0.01% max: 20.38% x̄: 2.27% x̃: 1.79% HURT stats (abs) min: 1 max: 2314 x̄: 72.68 x̃: 20 HURT stats (rel) min: <.01% max: 33.99% x̄: 2.28% x̃: 0.96% 95% mean confidence interval for cycles value: -49.31 -39.98 95% mean confidence interval for cycles %-change: -2.15% -2.01% Cycles are helped. LOST: 1 GAINED: 0 Ivy Bridge total instructions in shared programs: 12045602 -> 12038463 (-0.06%) instructions in affected programs: 623837 -> 616698 (-1.14%) helped: 2498 HURT: 0 helped stats (abs) min: 1 max: 39 x̄: 2.86 x̃: 2 helped stats (rel) min: 0.05% max: 10.00% x̄: 1.30% x̃: 1.05% 95% mean confidence interval for instructions value: -2.96 -2.75 95% mean confidence interval for instructions %-change: -1.34% -1.26% Instructions are helped. total cycles in shared programs: 181025675 -> 180891323 (-0.07%) cycles in affected programs: 11329329 -> 11194977 (-1.19%) helped: 2439 HURT: 47 helped stats (abs) min: 1 max: 1565 x̄: 57.06 x̃: 26 helped stats (rel) min: 0.02% max: 24.56% x̄: 2.02% x̃: 1.64% HURT stats (abs) min: 1 max: 1269 x̄: 102.51 x̃: 43 HURT stats (rel) min: 0.11% max: 52.94% x̄: 4.15% x̃: 1.34% 95% mean confidence interval for cycles value: -59.91 -48.17 95% mean confidence interval for cycles %-change: -1.99% -1.82% Cycles are helped. Sandy Bridge, Iron Lake, and GM45 had similar results. (Sandy Bridge shown) total instructions in shared programs: 10896368 -> 10896339 (<.01%) instructions in affected programs: 3767 -> 3738 (-0.77%) helped: 17 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.71 x̃: 1 helped stats (rel) min: 0.13% max: 9.52% x̄: 3.58% x̃: 2.73% 95% mean confidence interval for instructions value: -2.27 -1.14 95% mean confidence interval for instructions %-change: -5.14% -2.03% Instructions are helped. 
total cycles in shared programs: 155091109 -> 155091021 (<.01%) cycles in affected programs: 47241 -> 47153 (-0.19%) helped: 15 HURT: 8 helped stats (abs) min: 2 max: 81 x̄: 15.73 x̃: 4 helped stats (rel) min: 0.03% max: 10.59% x̄: 1.55% x̃: 0.71% HURT stats (abs) min: 14 max: 32 x̄: 18.50 x̃: 17 HURT stats (rel) min: 0.32% max: 2.79% x̄: 2.43% x̃: 2.71% 95% mean confidence interval for cycles value: -14.59 6.93 95% mean confidence interval for cycles %-change: -1.41% 1.08% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Matt Turner <mattst88@gmail.com> [v2]	2019-04-18 12:38:05 -07:00
Ian Romanick	06d2c11641	intel/fs: Add a scale factor to emit_fsign Normally fsign generates -1, 0, or +1. The new scale factor, S, causes fsign to generate -S, 0, or +S. v2: Rebase on v2 changes in previous commit. v3: Rebase on `85c35885b3` ("nir: Rework nir_src_as_alu_instr to not take a pointer"). Reviewed-by: Matt Turner <mattst88@gmail.com> [v2]	2019-04-18 12:37:48 -07:00
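The effect of the scale factor can be shown at the level of scalar semantics. This is a sketch of the described behavior only, not the instruction sequence the backend emits; the function names are hypothetical and NaN inputs are deliberately not modeled.

```c
#include <assert.h>

/* Plain sign function: -1, 0, or +1 depending on the sign of x. */
static float
ref_fsign(float x)
{
   return (float)((x > 0.0f) - (x < 0.0f));
}

/* Sign function with a scale factor S: -S, 0, or +S. */
static float
ref_fsign_scaled(float x, float s)
{
   if (x > 0.0f)
      return s;
   if (x < 0.0f)
      return -s;
   return 0.0f;
}

/* The scaled form matches fsign(x) * S for ordinary inputs. */
static void
check_fsign_scaled(void)
{
   const float inputs[] = { -3.0f, 0.0f, 7.0f };
   for (unsigned i = 0; i < 3; i++)
      assert(ref_fsign_scaled(inputs[i], 2.5f) == ref_fsign(inputs[i]) * 2.5f);
}
```

Producing the scaled result directly is what lets a multiply that consumes the sign be folded away, as in the "fsign multiplied by a value" commit listed above.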
Ian Romanick	ad98fbc217	intel/fs: Refactor code generation for nir_op_fsign to its own function v2: Call emit_fsign from inside the existing switch statement. Suggested by Matt. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Ian Romanick	90430d0488	intel/fs: Eliminate dead code first This simplifies the later patch "i965/fs: Generate better code for fsign multiplied by a value". shader-db results: Broadwell and Skylake had similar results. (Skylake shown) total cycles in shared programs: 372808735 -> 372809551 (<.01%) cycles in affected programs: 1519520 -> 1520336 (0.05%) helped: 243 HURT: 277 helped stats (abs) min: 1 max: 226 x̄: 34.05 x̃: 5 helped stats (rel) min: 0.01% max: 13.88% x̄: 1.46% x̃: 0.27% HURT stats (abs) min: 1 max: 1810 x̄: 32.82 x̃: 5 HURT stats (rel) min: 0.01% max: 16.03% x̄: 1.56% x̃: 0.29% 95% mean confidence interval for cycles value: -7.18 10.32 95% mean confidence interval for cycles %-change: -0.17% 0.46% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge, Haswell and Ivy Bridge had similar results. (Sandy Bridge shown) total cycles in shared programs: 155091458 -> 155091109 (<.01%) cycles in affected programs: 370797 -> 370448 (-0.09%) helped: 24 HURT: 36 helped stats (abs) min: 1 max: 331 x̄: 103.17 x̃: 41 helped stats (rel) min: 0.02% max: 7.70% x̄: 2.07% x̃: 0.56% HURT stats (abs) min: 1 max: 291 x̄: 59.08 x̃: 10 HURT stats (rel) min: 0.02% max: 5.29% x̄: 1.02% x̃: 0.15% 95% mean confidence interval for cycles value: -37.92 26.28 95% mean confidence interval for cycles %-change: -0.88% 0.45% Inconclusive result (value mean confidence interval includes 0). Iron Lake and GM45 had similar results. 
(GM45 shown) total cycles in shared programs: 129133970 -> 129133978 (<.01%) cycles in affected programs: 111966 -> 111974 (<.01%) helped: 3 HURT: 1 helped stats (abs) min: 2 max: 4 x̄: 2.67 x̃: 2 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16 HURT stats (rel) min: 0.07% max: 0.07% x̄: 0.07% x̃: 0.07% 95% mean confidence interval for cycles value: -12.93 16.93 95% mean confidence interval for cycles %-change: -0.05% 0.08% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Kristian H. Kristensen	a90aa14f5a	freedreno: Fix format string warning Modifiers are uint64_t. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-04-18 11:46:13 -07:00
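A common way to avoid this class of warning is the `PRIx64` macro from `<inttypes.h>`. The sketch below shows that general technique; whether the commit uses this exact macro is an assumption, and `format_modifier` is a hypothetical helper, not a freedreno function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit modifier value portably. A hard-coded "%lx" or "%llx"
 * triggers a format warning on platforms where uint64_t is the other type. */
static int
format_modifier(char *buf, size_t size, uint64_t modifier)
{
   return snprintf(buf, size, "0x%" PRIx64, modifier);
}

static void
check_format_modifier(void)
{
   char buf[32];
   int written = format_modifier(buf, sizeof(buf), (uint64_t)0xff);
   assert(written == 4);
   assert(buf[0] == '0' && buf[1] == 'x' && buf[2] == 'f' && buf[3] == 'f');
   assert(buf[4] == '\0');
}
```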
Kristian H. Kristensen	9c82a55efc	freedreno/a6xx: Add helper for incrementing regid Increments the regid by specified amount unless regid is r63.x (invalid). Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-04-18 11:46:13 -07:00
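The behavior described for the helper can be sketched as follows. The regid encoding (register number in the upper bits, component in the low two bits) and all names here are assumptions made for the sketch, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding for this sketch: register number in the upper bits,
 * component (x, y, z, w) in the low two bits. */
#define REGID(num, comp) ((uint32_t)(((num) << 2) | (comp)))
#define INVALID_REGID    REGID(63, 0)   /* r63.x */

/* Advance regid by `increment`, leaving the invalid r63.x untouched. */
static uint32_t
regid_increment(uint32_t regid, uint32_t increment)
{
   if (regid == INVALID_REGID)
      return INVALID_REGID;
   return regid + increment;
}

static void
check_regid_increment(void)
{
   assert(regid_increment(REGID(1, 0), 2) == REGID(1, 2));      /* r1.x -> r1.z */
   assert(regid_increment(INVALID_REGID, 4) == INVALID_REGID);  /* stays invalid */
}
```

Keeping the invalid value sticky means callers can increment unconditionally without first checking whether the register was assigned.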
Kristian H. Kristensen	6aa211b316	freedreno: Use enum values from matching enum We get a couple of warnings from using mismatched enum values. This fixes that. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-04-18 11:46:13 -07:00
Kristian H. Kristensen	c34b285b38	freedreno/a2xx: Fix redundant if statement We test the condition, declare a few variables, then test the exact same condition again. Let's not do that. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-04-18 11:46:13 -07:00
Kristian H. Kristensen	18ce6ac632	freedreno/ir3: Mark ir3_context_error() as NORETURN Fixes a few warnings. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-04-18 11:46:13 -07:00
Jason Ekstrand	c6463f8ac2	nir: Add a nir_src_as_intrinsic() helper Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Jason Ekstrand	85c35885b3	nir: Rework nir_src_as_alu_instr to not take a pointer Other nir_src_as_* functions just take a nir_src. It's not that much more memory copying and the constness preserving really isn't worth the cognitive dissonance. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Jason Ekstrand	eee994e769	nir: Drop "struct" from some nir_* declarations Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Lionel Landwerlin	db5b372bb9	anv: implement WaEnableStateCacheRedirectToCS This 3d performance workaround was initially put in the kernel but the media driver requires different settings so the register has been whitelisted in i915 [1] and userspace drivers are left initializing it as they wish. [1] : https://patchwork.freedesktop.org/series/59494/ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-04-18 17:43:08 +01:00
Lionel Landwerlin	eaadb62c9e	i965: implement WaEnableStateCacheRedirectToCS This 3d performance workaround was initially put in the kernel but the media driver requires different settings so the register has been whitelisted in i915 [1] and userspace drivers are left initializing it as they wish. [1] : https://patchwork.freedesktop.org/series/59494/ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-04-18 17:43:08 +01:00
Lionel Landwerlin	d1be67db39	iris: implement WaEnableStateCacheRedirectToCS This 3d performance workaround was initially put in the kernel but the media driver requires different settings so the register has been whitelisted in i915 [1] and userspace drivers are left initializing it as they wish. [1] : https://patchwork.freedesktop.org/series/59494/ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-04-18 17:43:08 +01:00
Iago Toral Quiroga	c2b8fb9a81	anv/device: expose VK_KHR_shader_float16_int8 in gen8+ v2 (Jason): - Merge shaderFloat16 and shaderInt8 enablement into a single patch. - Merge extension enable. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2019-04-18 13:23:03 +02:00
Iago Toral Quiroga	5a5d44b713	anv/pipeline: support Float16 and Int8 SPIR-V capabilities in gen8+ v2: - Merge Float16 and Int8 capabilities into a single patch (Jason) - Merged patch that enabled SPIR-V front-end checks for these caps (except for Int8, which was already merged) v3: - Keep capabilities sorted (Jason) v4: - SpvCapabilityFloat16 support already added in master (Juan) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2019-04-18 13:23:03 +02:00
Iago Toral Quiroga	e6ee07a664	compiler/spirv: move the check for Int8 capability So it is right after the checks for the other various Int* capabilities. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 13:23:03 +02:00
Iago Toral Quiroga	8ed6d74c92	intel/compiler: validate region restrictions for mixed float mode v2: - Adapted unit tests to make them consistent with the changes done to the validation of half-float conversions. v3 (Curro): - Check all the accumulators - Constify declarations - Do not check src1 type in single-source instructions. - Check for all instructions that read accumulator (either implicitly or explicitly) - Check restrictions in src1 too. - Merge conditional block - Add invalid test case. v4 (Curro): - Assert on 3-src instructions, as they are not validated. - Get rid of types_are_mixed_float(), as we know instruction is mixed float at that point. - Remove conditions from not verified case. - Fix brackets on conditional. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-04-18 13:22:46 +02:00
Iago Toral Quiroga	58d6417e59	intel/compiler: validate conversions between 64-bit and 8-bit types v2: - Add some tests with UB type too (Jason) v3: - consider implicit conversions from 2src instructions too (Curro). v4: - Do not check src1 type in single-source instructions (Curro). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	7376d57a9c	intel/compiler: validate region restrictions for half-float conversions v2: - Consider implicit conversions in 2-src instructions too (Curro) - For restrictions that involve destination stride requirements only validate them for Align1, since Align16 always requires packed data. - Skip general rule for the dst/execution type size ratio for mixed float instructions on CHV and SKL+, these have their own set of rules that will be validated separately. v3 (Curro): - Do not check src1 type in single-source instructions. - Check restriction on src1. - Remove invalid test. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	6ff52f0628	intel/compiler: also set F execution type for mixed float mode in BDW The section 'Execution Data Types' of 3D Media GPGPU volume, which describes execution types, is exactly the same in BDW and SKL+. Also, this section states that there is a single execution type, so it makes sense that this is the wider of the two floating point types involved in mixed float mode, which is what we do for SKL+ and CHV. v2: - Make sure we also account for the destination type in mixed mode (Curro). Acked-by: Francisco Jerez <currojerez@riseup.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	100debc3c9	intel/compiler: implement SIMD16 restrictions for mixed-float instructions v2: f32to16/f16to32 can use a :W destination (Curro) v3: check destination is packed (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	6d87c651c9	intel/compiler: skip MAD algebraic optimization for half-float or mixed mode It is very likely that this optimization is never useful and we'll probably just end up removing it, so let's not bother adding more cases to it for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	64b93292ac	intel/compiler: remove inexact algebraic optimizations from the backend NIR already has these and correctly considers exact/inexact qualification, whereas the backend doesn't and can apply the optimizations where it shouldn't. This happened to be the case in a handful of Tomb Raider shaders, where NIR would skip the optimizations because of a precise qualification but the backend would then (incorrectly) apply them anyway. Besides this, considering that we are not emitting much math in the backend these days it is unlikely that these optimizations are useful in general. A shader-db run confirms that MAD and LRP optimizations, for example, were only being triggered in cases where NIR would skip them due to precise requirements, so in the near future we might want to remove more of these, but for now we just remove the ones that are not completely correct. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	ddd1706ab3	intel/compiler: fix cmod propagation for non 32-bit types v2: - Do not propagate if the bit-size changes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	66002eeebe	intel/compiler: add a brw_reg_type_is_integer helper v2: - Fixed typo: meant BRW_REGISTER_TYPE_UB instead of BRW_REGISTER_TYPE_UV Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	44e1affaec	intel/compiler: implement is_zero, is_one, is_negative_one for 8-bit/16-bit There are no 8-bit immediates, so assert in that case. 16-bit immediates are replicated in each word of a 32-bit immediate, so we only need to check the lower 16-bits. v2: - Fix is_zero with half-float to consider -0 as well (Jason). - Fix is_negative_one for word type. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
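The checks described for 16-bit immediates can be sketched like this. The helper names are hypothetical and the immediate is modeled as a bare `uint32_t`; the half-float constants are the standard IEEE 754 binary16 encodings of 1.0 and -1.0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 16-bit immediate is stored replicated in both words of a 32-bit value,
 * so only the low 16 bits are inspected. */
static uint16_t low_word(uint32_t imm) { return (uint16_t)(imm & 0xffff); }

/* Half-float zero: ignore the sign bit so that -0.0 also counts. */
static bool hf_imm_is_zero(uint32_t imm) { return (low_word(imm) & 0x7fff) == 0; }

/* Half-float 1.0 is 0x3c00 and -1.0 is 0xbc00. */
static bool hf_imm_is_one(uint32_t imm) { return low_word(imm) == 0x3c00; }
static bool hf_imm_is_negative_one(uint32_t imm) { return low_word(imm) == 0xbc00; }

/* Signed word: -1 is all ones in the low 16 bits. */
static bool w_imm_is_negative_one(uint32_t imm) { return low_word(imm) == 0xffff; }

static void
check_16bit_immediates(void)
{
   assert(hf_imm_is_zero(0x00000000u));
   assert(hf_imm_is_zero(0x80008000u));          /* -0.0 replicated */
   assert(hf_imm_is_one(0x3c003c00u));
   assert(hf_imm_is_negative_one(0xbc00bc00u));
   assert(w_imm_is_negative_one(0xffffffffu));
   assert(!w_imm_is_negative_one(0x00010001u));
}
```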
Iago Toral Quiroga	e64be391dd	intel/compiler: generalize the combine constants pass At the very least we need it to handle HF too, since we are doing constant propagation for MAD and LRP, which relies on this pass to promote the immediates to GRF in the end, but ideally we want it to support even more types so we can take advantage of it to improve register pressure in some scenarios. v2 (Jason): - Support 64-bit types too. - Check if we need to set the half-float flag if the immediate already existed. - Multiply the size of the immediate by the width of the copy Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	fb990bd76e	intel/eu: force stride of 2 on NULL register for Byte instructions The hardware only allows a stride of 1 on a Byte destination for raw byte MOV instructions. This is required even when the destination is the NULL register. Rather than making sure that we emit a proper NULL:B destination every time we need one, just fix it at emission time. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	ce68a061de	intel/compiler: ask for an integer type if requesting an 8-bit type v2: - Assign BRW_REGISTER_TYPE_B directly for 8-bit (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	092b147774	intel/compiler: rework conversion opcodes Now that we have the regioning lowering pass we can just put all of these opcodes together in a single block and we can just assert on the few cases of conversion instructions that are not supported in hardware and that should be lowered in brw_nir_lower_conversions. The only cases that we still handle separately are the conversions from float to half-float since the rounding variants would need to fall through and we are already doing this for boolean opcodes (since they need to negate), plus there is also a large comment about these opcodes that we probably want to keep so it is just easier to keep these separate. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	472244b374	intel/compiler: activate 16-bit bit-size lowerings also for 8-bit Particularly, we need the same lowerings we use for 16-bit integers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	40b3abb4d1	intel/compiler: split is_partial_write() into two variants This function is used in two different scenarios that for 32-bit instructions are the same, but for 16-bit instructions are not. One scenario is that in which we are working at a SIMD8 register level and we need to know if a register is fully defined or written. This is useful, for example, in the context of liveness analysis or register allocation, where we work with units of registers. The other scenario is that in which we want to know if an instruction is writing a full scalar component or just some subset of it. This is useful, for example, in the context of some optimization passes like copy propagation. For 32-bit instructions (or larger), a SIMD8 dispatch will always write at least a full SIMD8 register (32B) if the write is not partial. The function is_partial_write() checks this to determine if we have a partial write. However, when we deal with 16-bit instructions, that logic disables some optimizations that should be safe. For example, a SIMD8 16-bit MOV will only update half of a SIMD register, but it is still a complete write of the variable for a SIMD8 dispatch, so we should not prevent copy propagation in this scenario because we don't write all 32 bytes in the SIMD register or because the write starts at offset 16B (where we pack components Y or W of 16-bit vectors). This is a problem for SIMD8 executions (VS, TCS, TES, GS) of 16-bit instructions, which lose a number of optimizations because of this, most important of which is copy-propagation. This patch splits is_partial_write() into is_partial_reg_write(), which represents the current is_partial_write(), useful for things like liveness analysis, and is_partial_var_write(), which considers the dispatch size to check if we are writing a full variable (rather than a full register) to decide if the write is partial or not, which is what we really want in many optimization passes. 
Then the patch goes on and rewrites all uses of is_partial_write() to use one or the other version. Specifically, we use is_partial_var_write() in the following places: copy propagation, cmod propagation, common subexpression elimination, saturate propagation and sel peephole. Notice that the semantics of is_partial_var_write() exactly match the current implementation of is_partial_write() for anything that is 32-bit or larger, so no changes are expected for 32-bit instructions. Tested against ~5000 tests involving 16-bit instructions in CTS produced the following changes in instruction counts: Patched \| Master \| % \| ================================================ SIMD8 \| 621,900 \| 706,721 \| -12.00% \| ================================================ SIMD16 \| 93,252 \| 93,252 \| 0.00% \| ================================================ As expected, the change only affects SIMD8 dispatches. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	0986199b31	intel/compiler: workaround for SIMD8 half-float MAD in gen8 Empirical testing shows that gen8 has a bug where MAD instructions with a half-float source starting at a non-zero offset fail to execute properly. This scenario usually happened in SIMD8 executions, where we used to pack vector components Y and W in the second half of SIMD registers (therefore, with a 16B offset). It looks like we are not currently doing this any more but this would handle the situation properly if we ever happen to produce code like this again. v2 (Jason): - Move this workaround to the lower_regioning pass as an additional case to has_invalid_src_region() - Do not apply the workaround if the stride of the source operand is 0, testing suggests the problem doesn't exist in that case. v3 (Jason): - We want offset % REG_SIZE > 0, not just offset > 0 - Use a helper to compute the offset Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1)	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	aaae24179f	intel/compiler: fix ddy for half-float in Broadwell Broadwell has restrictions that apply to Align16 half-float that make the Align16 implementation of this invalid for this platform. Use the gen11 path for this instead, which uses Align1 mode. The restriction is not present in cherryview, gen9 or gen10, where the Align16 implementation seems to work just fine. v2: - Rework the comment in the code, move the PRM citation from the commit message to the comment in the code (Matt) - Cherryview isn't affected, only Broadwell (Matt) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	60c7c6d3ba	intel/compiler: fix ddx and ddy for 16-bit float We were assuming 32-bit elements. Also, in SIMD8 we pack 2 vector components in a single SIMD register, so for example, component Y of a 16-bit vec2 starts at byte offset 16B. This means that when we compute the offset of the elements to be differentiated we should not stomp whatever base offset we have, but instead add to it. v2 - Use byte_offset() helper (Jason) - Merge the fix for SIMD8: using byte_offset() fixes that too. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	8f40d392b9	intel/compiler: set correct precision fields for 3-source float instructions Source0 and Destination extract the floating-point precision automatically from the SrcType and DstType instruction fields respectively when they are set to types :F or :HF. For Source1 and Source2 operands, we use the new 1-bit fields Src1Type and Src2Type, where 0 means normal precision and 1 means half-precision. Since we always use the type of the destination for all operands when we emit 3-source instructions, we only need to set Src1Type and Src2Type to 1 when we are emitting a half-precision instruction. v2: - Set the bit separately for each source based on its type so we can do mixed floating-point mode in the future (Topi). v3: - Use regular citation style for the comment referencing the PRM (Matt). - Decided not to add asserts in the emission code to check that only mixed HF/F types are used since such checks would break negative tests for brw_eu_validate.c (Matt) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	e6b7410187	intel/compiler: allow half-float on 3-source instructions since gen8 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	ee049f6b71	intel/compiler: don't compact 3-src instructions with Src1Type or Src2Type bits We are now using these bits, so don't assert that they are not set. In gen8, if these bits are set compaction is not possible. On gen9 and CHV platforms set_3src_control_index() checks these bits (and others) against a table to validate if the particular bit combination is eligible for compaction or not. v2 - Add more detail in the commit message explaining the situation for SKL+ and CHV (Jason) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	120c970619	intel/compiler: add new half-float register type for 3-src instructions This is available since gen8. v2: restore previously existing assertion. v3: don't use separate tables for gen7 and gen8, just assert that we don't use half-float before gen8 (Matt) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	4ab2b97a8f	intel/compiler: add instruction setters for Src1Type and Src2Type. The original SrcType is a 3-bit field that takes a subset of the types supported by the hardware for 3-source instructions. Since gen8, when the half-float type was added, 3-source floating point operations can use mixed precision mode, where not all the operands have the same floating-point precision. While the precision for the first operand is taken from the type in SrcType, the bits in Src1Type (bit 36) and Src2Type (bit 35) define the precision for the other operands (0: normal precision, 1: half precision). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	a8d8b1a139	intel/compiler: drop unnecessary temporary from 32-bit fsign implementation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	19cd2f5deb	intel/compiler: implement 16-bit fsign v2: - make 16-bit be its own separate case (Jason) v3: - Drop the result_int temporary (Jason) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	4588f4a604	intel/compiler: handle extended math restrictions for half-float Extended math with half-float operands is only supported since gen9, but it is limited to SIMD8. In gen8 we lower it to 32-bit. v2: squashed together the following patches (Jason): - intel/compiler: allow extended math functions with HF operands - intel/compiler: lower 16-bit extended math to 32-bit prior to gen9 - intel/compiler: extended Math is limited to SIMD8 on half-float Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (allow extended math functions with HF operands, extended Math is limited to SIMD8 on half-float)	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	114f4e6c29	intel/compiler: lower some 16-bit float operations to 32-bit The hardware doesn't support half-float for these. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	b6a454791b	intel/compiler: assert restrictions on conversions to half-float There are some hardware restrictions that brw_nir_lower_conversions should have taken care of before we get here. v2: - rebased on top of regioning lowering pass Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	66806405af	intel/compiler: handle b2i/b2f with other integer conversion opcodes Since we handle booleans as integers this makes more sense. v2: - rebased to incorporate new boolean conversion opcodes v3: - rebased on top of regioning lowering pass Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v2)	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	92f4761198	intel/compiler: split float to 64-bit opcodes from int to 64-bit Going forward having these split is a bit more convenient since these two groups have different restrictions. v2: - Rebased on top of new regioning lowering pass. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Iago Toral Quiroga	3e377c68f8	intel/compiler: add a NIR pass to lower conversions Some conversions are not directly supported in hardware and need to be split in two conversion instructions going through an intermediary type. Doing this at the NIR level simplifies a bit the complexity in the backend. v2: - Consider fp16 rounding conversion opcodes - Properly handle swizzles on conversion sources. v3 - Run the pass earlier, right after nir_opt_algebraic_late (Jason) - NIR alu output types already have the bit-size (Jason) - Use 'is_conversion' to identify conversion operations (Jason) v4: - Be careful about the intermediate types we use so we don't lose range and avoid incorrect rounding semantics (Jason) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Dominik Drees	829f278ad0	Add no_aos_sampling GALLIVM_PERF option This forces using general sampling and should improve precision and performance in some cases.	2019-04-17 22:16:19 +00:00
Samuel Pitoiset	ad6dc13fc7	ac: use struct/raw store intrinsics for 8-bit/16-bit int with LLVM 9+ This change requires LLVM r356465. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-17 22:10:30 +02:00
Samuel Pitoiset	26ea506235	ac: use struct/raw load intrinsics for 8-bit/16-bit int with LLVM 9+ This change requires LLVM r356465. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-17 22:10:28 +02:00
Samuel Pitoiset	6fd5e39b60	ac: add support for more types with struct/raw LLVM intrinsics LLVM 9+ now supports 8-bit and 16-bit types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-17 22:10:25 +02:00
Samuel Pitoiset	9cf55b022d	radv: add VK_KHR_shader_atomic_int64 but disable it for now No support for 64-bit compare&swap atomic operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-17 21:59:56 +02:00
Samuel Pitoiset	d118e382dd	ac/nir: add 64-bit SSBO atomic operations support Except compare&swap which is still buggy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-17 21:59:54 +02:00
Samuel Pitoiset	78c551aca1	ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap Use the raw version (i.e. IDXEN=0) because vindex is unused. Use the old intrinsic for compare&swap because the new one hangs the GPU for some reason. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-17 21:59:52 +02:00
Roland Scheidegger	dded2edf8b	gallivm: fix saturated signed add / sub with llvm 9 llvm 8 removed saturated unsigned add / sub x86 sse2 intrinsics, and now llvm 9 removed the signed versions as well - they were proposed for removal earlier, but the pattern to recognize those was very complex, so it wasn't done then. However, instead of these arch-specific intrinsics, there's now arch-independent intrinsics for saturated add / sub, both for signed and unsigned, so use these. They should have only advantages (work with arbitrary vector sizes, optimal code for all archs), although I don't know how well they work in practice for other archs (at least for x86 they do the right thing). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110454 Reviewed-by: Brian Paul <brianp@vmware.com>	2019-04-17 17:42:13 +02:00
Juan A. Suarez Romero	b74e605cf4	meson: Add dependency on genxml to anvil genfiles This fixes a race condition where anv_gen_files are executed before genxml files, which causes a build failure v2: add dependency on idep_genxml (Lionel) Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-17 15:49:55 +02:00
Lionel Landwerlin	baf59e40cd	intel/perf: constify accumulator parameter Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	93dbe52ab0	intel/perf: drop counter size field We can deduce the size from another field, let's just save some space. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	a646485c28	i965: perf: add mdapi pipeline statistics queries on gen10/11 The Gen10+ expected format adds an additional counter which we can't disclose yet. We can still make the size of the expected query result match. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	d855906366	intel/perf: stub gen10/11 missing definitions Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	d47cc4acbf	i965: move mdapi guid into intel/perf One more thing we want to share between the different APIs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	b48d6d7471	i965: move mdapi result data format to intel/perf We want to reuse this in Anv. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	2be07fc751	i965: move brw_timebase_scale to device info Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	41b54b5faf	i965: move OA accumulation code to intel/perf We'll want to reuse this in our Vulkan extension. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	f6bba7760f	i965: move mdapi data structure to intel/perf We'll want to reuse those structures later on. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	134e750e16	i965: extract performance query metrics We would like to reuse performance query metrics in other APIs. Let's make the query code dealing with the processing of raw counters into human readable values API agnostic. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-17 14:10:42 +01:00
Lionel Landwerlin	603ddda622	i965: store device revision in gen_device_info Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-17 14:10:42 +01:00
Topi Pohjolainen	ea42ba36b9	intel/compiler/icl: Use tcs barrier id bits 24:30 instead of 24:27 Similarly to `1cc17fb731` Fixes gpu hangs with dEQP-VK.tessellation.shader_input_output.barrier Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2019-04-17 14:55:49 +03:00
Erik Faye-Lund	ce1761edab	virgl: document potentially failing blit This blit can fail, but this is not new; in the old version we didn't even try to blit in this case. So let's just document the limitation for now, and leave this for another day. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	3fdacf1c39	virgl: do color-conversion when mapping transfer When running on OpenGL ES, we can't just map any format for reading, because of limitations on glReadPixels. So let's fall back to the blit code-path, and translate the pixels to the correct format in the end. This fixes the remaining failures of KHR-GL32.packed_pixels.* apart from the sRGB tests. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	9e9d9b352e	virgl: only blit if resource is read Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	fba03322a2	virgl: get readback-formats from host Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	749bbd39c7	gallium/util: support translating between uint and sint formats Without this, we can't for instance convert between r8_sint and r8g8b8a8_sint. But that's pretty useful, so let's support it as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	f31b65f1c1	virgl: make sure bind is set for non-buffers Otherwise, virglrenderer will reject the resource. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	afbd68378a	virgl: support write-back with staged transfers We currently don't support writing to resources that use a temporary staging-resource to resolve the pixels. If a write-bit was set, we forgot to perform a blit back to the old resource, followed by trying to update the wrong resource, which lacks backing-storage. The end-result would be that nothing useful happened. This approach also fixes a few smaller bugs, like using the wrong box (without x, y and z zeroed out), which means a partial update of a multisampled texture could result in the wrong part of the texture being updated. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	0bc8683ffa	virgl: use pipe_box for blit dst-rect Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	121e366632	virgl: rewrite core of virgl_texture_transfer_map Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	1f27bd3f2b	virgl: return error if allocating resolve_tmp fails Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	fc8b1ca33a	virgl: wait for the right resource In case we're resolving, we need to wait for the resolved resource instead of the original one. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	6263304b2d	virgl: check for readback on correct resource Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	ac932ff822	virgl: make unmap queuing a bit more straight-forward It's hard to read the code that decides if we want to queue up an unmap or destroy the transfer right away. So let's make it a bit simpler, by setting a bool in case we want to queue it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	b08e73308e	virgl: simplify virgl_texture_transfer_unmap logic There's no reason to keep an extra indentation level here, let's merge the two if-conditions. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	7dd601a399	virgl: track full virgl_resource instead of just virgl_hw_res Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	c62434f106	virgl: tmp_resource -> templ This isn't the temporary resource itself, it's the template that we'll create the resource from. So let's name it appropriately. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:08 +00:00
Erik Faye-Lund	18a721fd56	virgl: remove pointless transfer-counter This is only written to, never read. Let's just get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-17 07:27:07 +00:00
Timothy Arceri	3c5a9ab9f0	radeonsi/nir: fix scanning of bindless images Fixes: `d62d434fe9` ("ac/nir_to_llvm: add image bindless support")	2019-04-17 09:56:56 +10:00
Kenneth Graunke	c4478889b7	iris: Add texture cache flushing hacks for blit and resource_copy_region This is a port of Jason's `8379bff6c4` from i965 to iris. We can't find anything relevant in the documentation and no one we've talked to has been able to help us pin down a solution. Unfortunately, we have to put the hack in both iris_blit() and iris_copy_region(). st/mesa's CopyImage() implementation sometimes chooses to use pipe->blit() instead of pipe->resource_copy_region(). For blits, we only do the hack if the blit source format doesn't match the underlying resource (i.e. it's reinterpreting the bits). Hopefully this should not be too common.	2019-04-16 13:04:22 -07:00
Eric Anholt	697e2e1f26	v3d: Always set up the qregs for CSD payload. We were failing to set up payload[1] for use by LocalInvocationIndex/ID and shared variable accesses if gl_WorkGroupID/gl_GlobalInvocationID wasn't used (possibly because you only have one workgroup). You're always going to use payload[1], and payload[0] is common enough and we have DCE in the backend to clean it up if it happens to not be used.	2019-04-16 12:10:39 -07:00
Eric Anholt	1bc71e8b65	v3d: Only look up the 3rd texture gather offset for non-arrays. Fixes assertion failures in the CTS since Karol's cleanup when NIR started noticing that we were reading an invalid component. Fixes: `5450f1c9fb` ("v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value")	2019-04-16 12:07:59 -07:00
Caio Marcelo de Oliveira Filho	a0dae78e72	spirv: Tell which opcode or value is unhandled when failing v2: When available, include the opcode name too. (Karol) v3: Use more to_string helpers. (Karol) Include the wrong bit_size in those failures. Include the capability number in spv_check_supported. Provide vtn_fail_with_* macros to avoid noise in the call sites. v4: Provide macros only for opcode and decoration, which have enough usages to justify them. (Jason) Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-16 11:11:10 -07:00
Caio Marcelo de Oliveira Filho	0ccfe741b1	spirv: Add more to_string helpers Also, use a set to identify repeated values. The previous arrangement worked when the repetitions were one after another, but in some of the new cases they are not. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-16 11:11:10 -07:00
Jason Ekstrand	583a4d9a27	intel/mi_builder: Disable mem_mem tests on IVB Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-04-16 12:59:12 -05:00
Kenneth Graunke	33314cf410	iris: Change vendor and renderer strings This patch changes the GL_VENDOR string from "Mesa Project" to "Intel". This makes GLX_MESA_query_renderer report "Vendor: Intel (0x8086)" instead of "Vendor: Mesa Project (0x8086)" which is arguably wrong. We now also use a consistent vendor string across Windows and Linux. It also prepends "Mesa" to the GL_RENDERER string, both to credit the community and have a distinguishing mark between the two drivers. We drop "DRI" compared to i965, as it's not really that important. Improves performance in Portal by 1.8x. Iris is now 3.86% faster than i965 at the portal-d1.dem timedemo on my Kabylake laptop. One change is that Portal selects the MapBufferRange path based on the vendor string, and iris's BufferSubData path is still missing the storage invalidation optimization.	2019-04-16 10:27:20 -07:00
Jason Ekstrand	56d9532316	intel/mi_builder: Re-order an initializer The order doesn't matter in C99 but some C++ compilers seem to care. Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-04-16 12:07:15 -05:00
Jason Ekstrand	ba0f203ae8	nir/algebraic: Use a cache to avoid re-emitting structs This takes the stupid simplest and most reliable approach to reducing redundancy that I could come up with: Just use the struct declaration as the cache key. This cuts the size of the generated C file to about half and takes about 50 KiB off the .data section. size before (release build): text data bss dec hex filename 5363833 336880 13584 5714297 573179 _install/lib64/libvulkan_intel.so size after (release build): text data bss dec hex filename 5229017 285264 13584 5527865 545939 _install/lib64/libvulkan_intel.so Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-04-16 16:40:15 +00:00
Jason Ekstrand	0c712fd404	nir/algebraic: Move the template closer to the render function Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-04-16 16:40:15 +00:00
Kenneth Graunke	4c3c417b00	iris: Move iris_debug_recompile calls before uploading. Order of operations is important, otherwise we'll find the program we just uploaded as the "old" compile and get confused why nothing is different between the two keys. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-04-16 09:01:20 -07:00
Kenneth Graunke	04f97eefa3	iris: Print the reason for shader recompiles. I was lazy earlier and hadn't bothered typing / refactoring this. Now I'm hitting some extra recompiles and would like to see why. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-04-16 09:01:18 -07:00
Kenneth Graunke	fad7801afd	i965: Move program key debugging to the compiler. The i965 driver has a bunch of code to compare two sets of program keys and print out the differences. This can be useful for debugging why a shader needed to be recompiled on the fly due to non-orthogonal state dependencies. anv doesn't do recompiles, so we didn't need to share this in the past - but I'd like to use it in iris. This moves the bulk of the code to the compiler where it can be reused. To make that possible, we need to decouple it from i965 - we can't get at the brw program cache directly, nor use brw_context to print things. Instead, we use compiler->shader_perf_log(), and simply pass in keys. We put all of this debugging code in brw_debug_recompile.c, and only export a single function, for simplicity. I also tidied the code a bit while moving it, now that it all lives in one file. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-04-16 09:01:15 -07:00
Marek Olšák	4f715868a9	winsys/amdgpu: don't set GTT with GDS & OA placements on APUs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-04-16 10:24:19 -04:00
Marek Olšák	d3ce8a7f6b	nir: optimize gl_SampleMaskIn to gl_HelperInvocation for radeonsi when possible Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-16 10:24:19 -04:00
suresh guttula	d98f6380cb	st/va/enc: Add support for frame_cropping_flag of VAEncSequenceParameterBufferH264 This patch will add support for frame_cropping when the input size is not matched with aligned size. Currently vaapi driver ignores frame cropping values provided by client. This change will update SPS nalu with proper cropping values. Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-04-16 10:15:09 -04:00
suresh guttula	05cc018ae6	radeon/vce: Add support for frame_cropping_flag of VAEncSequenceParameterBufferH264 This patch will add support for frame_cropping when the input size is not matched with aligned size. Currently vaapi driver ignores frame cropping values provided by client. This change will update SPS nalu with proper cropping values. v2: Moving default crop setting to else when enc_frame_cropping_flag is not set. Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-04-16 10:15:09 -04:00
suresh guttula	8becf5b46d	vl: Add cropping flags for H264 This patch adds cropping flags for H264 in pipe_h264_enc_pic_control. Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-04-16 10:15:09 -04:00
Tapani Pälli	624789e370	compiler/glsl: handle case where we have multiple users for types Both Vulkan and OpenGL might be using glsl_types simultaneously or we can also have multiple concurrent Vulkan instances using glsl_types. Patch adds a one time init to track number of users and will release types only when last user calls _glsl_type_singleton_decref(). This change fixes glsl_type memory leaks we have with anv driver. v2: reuse hash_mutex, cleanup, apply fix also to radv driver and rename helper functions (Jason) v3: move init, destroy to happen on GL context init and destroy Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-16 12:58:00 +03:00
Danylo Piliaiev	04508f57d1	intel/compiler: Do not reswizzle dst if instruction writes to flag register If we write to the flag register changing the swizzle would change what channels are written to the flag register. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110201 Fixes: `4cd1a0be` Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: <ian.d.romanick@intel.com>	2019-04-16 09:42:08 +00:00
Michel Dänzer	9b2473c7a4	gitlab-ci: Use LLVM 3.4 from Debian jessie for scons-llvm job This gets us closer to the officially supported minimum version of LLVM, which is 3.3. Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:57:55 +02:00
Michel Dänzer	5789bd935e	gitlab-ci: Do not use subshells for compiling dependencies bash subshells don't inherit the -e option by default, so failures in the subshell commands wouldn't cause the CI job to fail. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:57:55 +02:00
Michel Dänzer	172ccfffda	gitlab-ci: Drop unused clang 5/6 packages Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:57:55 +02:00
Michel Dänzer	3fca2b760c	gitlab-ci: Use clang 8 instead of 7 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:57:55 +02:00
Michel Dänzer	979df83940	gitlab-ci: Remove unused Debian packages from Docker image v2: * Also remove autotools, now that the Mesa autotools build system has been dropped. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1	2019-04-16 10:41:07 +02:00
Michel Dänzer	792d6987a3	gitlab-ci: Remove unneeded (stuff from) APT command lines We either compile these locally, or they are dependencies of other packages we install. v2: * Adapt to leaving self-compiled packages untouched. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:41:07 +02:00
Michel Dänzer	e9de19ffca	gitlab-ci: Install most packages from Debian buster We now use the C frontend of GCC 8 instead of 6 (required tweaking the before_script for the clang job). We cannot use the C++ frontend of GCC 7 or newer yet, because upstream GCC 7 changed some C++ name mangling stuff in backwards incompatible ways, and LLVM < 6.0 packages aren't available in buster. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:41:07 +02:00
Michel Dänzer	ecb3eedc54	gitlab-ci: Use Debian packages instead of pip ones for meson and scons Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:41:07 +02:00
Michel Dänzer	caf83e96e4	gitlab-ci: Use HTTPS for APT repositories Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:41:07 +02:00
Michel Dänzer	d00b1c4511	gitlab-ci: Use Debian stretch instead of Ubuntu bionic The APT archive used by the Ubuntu docker image can be slow, even timing out sometimes, causing spurious failures of the containers-build job. The Debian docker image uses deb.debian.org, which is backed by a content distribution network. One downside is that stretch only has GCC 6, whereas bionic had 7. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-16 10:14:21 +02:00
Gert Wollny	1c5ff3a6d0	doc/features: Add a few extensions to the feature matrix These additions already landed but I forgot to update the feature matrix. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-16 08:01:13 +00:00
Samuel Pitoiset	ecbe6cb805	radv: sort the shader capabilities alphabetically Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-16 09:14:22 +02:00
Kenneth Graunke	024a57d23c	iris: Make shader_perf_log print to stderr if INTEL_DEBUG=perf is set This matches i965's behavior, and makes sure that shader compiler messages are visible when setting INTEL_DEBUG=perf.	2019-04-15 23:33:03 -07:00
Samuel Pitoiset	8704bd5588	radv: enable shaderInt8 on SI and CIK No CTS failures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-16 08:22:54 +02:00
Chia-I Wu	c45c889f95	virgl: fix fence fd version check Fixes: `d1a1c21e76` ("virgl: native fence fd support") Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-15 23:25:47 +00:00
Chia-I Wu	442e75071b	virgl: introduce virgl_drm_fence virgl_drm_fence can wrap either a fence fd or a virgl_hw_res. Because a fence fd is cheaper than a virgl_hw_res, we use it whenever it is available. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-15 23:25:47 +00:00
Chia-I Wu	334103efbf	virgl: hide fence internals from the driver Fence fds are cheaper than resources. We want to let winsys make the decision and use fence fds whenever they are supported. This commit prepares the work. For the moment, we create a resource _and_ a fence fd when supports_fences is true. This will be fixed such that we create a resource _or_ a fence fd. (And because of a version check bug that we will fix later, supports_fences is actually never true). Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-15 23:25:47 +00:00
Chia-I Wu	a23c091988	virgl: handle fence_server_sync in winsys It does not need help from the driver. This also fixes one issue where the fence is ignored when the transfer queue is full. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-15 23:25:47 +00:00
Roland Scheidegger	88e0bbf24a	gallivm: fix bogus assert in get_indirect_index 0 is a valid value as max index, and the code handles it fine. This isn't commonly seen, as it will only happen with array declarations of size 1. Fixes piglit tests/shaders/complex-loop-analysis-bug.shader_test Fixes: `a3c898dc97` "gallivm: fix improper clamping of vertex index when fetching gs inputs" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110441 Reviewed-by: Brian Paul <brianp@vmware.com>	2019-04-16 00:49:38 +02:00
Andres Gomez	42351c21bb	glsl/linker: always validate explicit locations for first and last interfaces Until now, we were only doing this when linking a SSO program. However, nothing avoids linking a non SSO program which doesn't have both a VS and FS. In those cases, we also need to report the usual linking errors, if happening. v2: Use a better name for the renamed function (Timothy). Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-15 22:34:50 +00:00
Rhys Perry	6281517f3e	vc4: fix build Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `5131b7a43f` ('gallium: add support for formatted image loads')	2019-04-15 23:27:21 +01:00
Andres Gomez	dbb309dd71	docs: drop Andres Gomez from the release cycles Juan A. Suarez takes his place and the shorter loop makes Dylan repeat earlier. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-15 22:03:17 +00:00
Kenneth Graunke	0f3dc832bc	iris: Fix FLUSH_EXPLICIT handling with staging buffers. I neglected to blit the staging buffer back to the real one at transfer_flush_region (FlushMappedBufferRange) time.	2019-04-15 14:51:01 -07:00
Kenneth Graunke	62b2ce0592	iris: Preserve all PIPE_TRANSFER flags in xfer->usage We need to preserve PIPE_TRANSFER_FLUSH_EXPLICIT, DISCARD_RANGE, and so on, but don't want to pass them to iris_bo_map(). So, keep them all, but mask them off when calling map. Chris Wilson told me to do this a long time ago and he was right.	2019-04-15 14:51:01 -07:00
Kenneth Graunke	9c52dce6a9	iris: Actually mark blorp_copy_buffer destinations as written.	2019-04-15 14:51:01 -07:00
grmat	8cb50edebf	drirc: add Spectacle, Falkon to a-sync blacklist Spectacle is the plasma screenshot utility Falkon is a KDE web browser that should succeed Konqueror	2019-04-15 17:38:44 -04:00
davidbepo	10d33ddd50	drirc: add Waterfox to adaptive-sync blacklist	2019-04-15 17:27:15 -04:00
El Christianito	4d02f591cb	drirc: add Budgie WM to adaptive-sync blacklist Budgie Window Manager is an increasingly used alternative to GNOME and MATE. Default in Solus OS, also used in other distros. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-15 17:27:15 -04:00
Dylan Baker	a988d95389	ci: Delete autotools build jobs Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Matt Turner <mattst88@gmail.com>	2019-04-15 13:44:41 -07:00
Dylan Baker	b165ac972b	docs: drop most autoconf references There's still a few in here, but those docs are already so out of date that it probably makes more sense to delete them. Such as the GLES docs which still claim we only support 1.1 and 2.0, with no mention of 3.x at all. v2: - Add docs for testing back end (Eric Engestrom) - Drop more autotools references - meson is now required not recommended - Add $PWD Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Matt Turner <mattst88@gmail.com>	2019-04-15 13:44:34 -07:00
Dylan Baker	95aefc94a9	Delete autotools Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Matt Turner <mattst88@gmail.com>	2019-04-15 13:44:29 -07:00
Marek Olšák	de0c97c817	radeonsi: enable GL_EXT_shader_image_load_formatted no changes - the driver doesn't use the format Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-15 16:18:07 -04:00
Rhys Perry	a35f2bbb85	st/mesa: add support for EXT_shader_image_load_formatted v3: rebase Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-15 16:18:07 -04:00
Rhys Perry	082d180a22	mesa, glsl: add support for EXT_shader_image_load_formatted v3: rebase Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-15 16:18:07 -04:00
Rhys Perry	5131b7a43f	gallium: add support for formatted image loads v3: rebase v3: make use of u_pipe_screen_get_param_defaults Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-15 16:18:07 -04:00
Samuel Pitoiset	bf4a0485d9	radv: set ACCESS_NON_READABLE on stores for copy/fill/clear meta shaders The compiler will emit GLC=1. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-15 21:36:53 +02:00
Bas Nieuwenhuizen	f6fdd39eab	radv: Use local buffers for the global bo list. Even if we don't use local buffers in general. Turns out that even though the performance is not the best the kernel still does it better than our own list. We still have to keep the radv bo list for buffers that are shared externally. This improves Talos on lowest quality setting (so as CPU bound as possible) by ~10% if the global bo list is enabled. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-15 20:39:38 +02:00
Bas Nieuwenhuizen	af9534b9f3	ac: Move has_local_buffers disable to radeonsi. In radv we had a separate flag to actually use it + an env option to experimentally use it. The common code setting has_local_buffers to false of course broke that experimental option. Also the "enable on APU" did not make sense for RADV as it is still disabled by default. Fixes: `b21a4efb55` "radv/winsys: allow local BOs on APUs" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-15 20:39:28 +02:00
Bas Nieuwenhuizen	a589d8c0ab	radv: Add bolist RADV_PERFTEST flag. To test global_bo_list performance. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-15 20:39:05 +02:00
Marek Olšák	dbab755ecf	ac: fix incorrect bindless atomic code in visit_image_atomic Coverity: CID 1444664 Fixes: `d62d434fe9` ("ac/nir_to_llvm: add image bindless support") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-15 12:52:02 -04:00
Rhys Perry	8671cfe2a2	nir,ac/nir: fix cube_face_coord Seems it was missing the "/ ma + 0.5" and the order was swapped. Fixes: `a1a2a8dfda` ('nir: add AMD_gcn_shader extended instructions') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-15 17:22:47 +01:00
Jason Ekstrand	90108deb27	anv: Update to use the new features struct names These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times. Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-15 13:25:43 +00:00
Jason Ekstrand	7f113c07b2	vulkan: Update the XML and headers to 1.1.106 Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-15 13:25:43 +00:00
Timothy Arceri	8f74a60c43	nir: fix packing components with arrays When gathering info for unmovable types we need to handle arrays. While we don't support packing/moving arrays we do support packing scalar components with these arrays. Fixes piglit: tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test Fixes: `5eb17506e1` ("nir: do not pack varying with different types") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-15 19:25:12 +10:00
Samuel Pitoiset	14f03978ed	radv: enable VK_KHR_shader_float16_int8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-15 10:43:55 +02:00
Samuel Pitoiset	bbe8febd93	spirv: add SpvCapabilityFloat16 support Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-15 10:43:52 +02:00
Kenneth Graunke	8bf9b7b5b6	intel: Emit 3DSTATE_VF_STATISTICS dynamically Pipeline statistics queries should not count BLORP's rectangles. (23) How do operations like Clear, TexSubImage, etc. affect the results of the newly introduced queries? DISCUSSION: Implementations might require "helper" rendering commands be issued to implement certain operations like Clear, TexSubImage, etc. RESOLVED: They don't. Only application submitted rendering commands should have an effect on the results of the queries. Piglit's arb_pipeline_statistics_query-vert_adj exposes this bug when the driver is hacked to always perform glBufferData via a GPU staging copy (for debugging purposes). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-14 19:58:04 -07:00
Jason Ekstrand	47709ca146	nir/validate: Require unused bits of nir_const_value to be zero Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	c4b28d1730	nir/load_const_to_scalar: Get rid of a bit size switch statement Now that nir_const_value is a scalar, we don't need the switch on bit size in order to pluck off components properly. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	893dd34702	spirv: Drop some unneeded bit size switch statements Now that nir_const_value is a scalar, we don't need the switch on bit size in order to copy components around properly. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	b8197a01a9	nir/constant_folding: Get rid of a bit size switch statement Now that nir_const_value is a scalar, we don't need the switch on bit size in order to swizzle them properly. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Karol Herbst	14531d676b	nir: make nir_const_value scalar v2: remove & operator in a couple of memsets add some memsets v3: fixup lima Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2019-04-14 22:25:56 +02:00
Karol Herbst	73d883037d	spirv: reduce array size in vtn_handle_constant we already assert above that there are no more than 3 sources, so it doesn't make sense to use an array of 4 sources Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Karol Herbst	e72beacb95	nir/loop_analyze: use nir_const_value.b for boolean results, not u32 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	10602db78c	nir/print: Use nir_src_as_int for array indices Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	9b1e4bab6b	nir/builder: Add a nir_imm_zero helper v2: replace nir_zero_vec with nir_imm_zero (Karol Herbst) Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-14 22:25:56 +02:00
Karol Herbst	daaf777376	nir/builder: Move nir_imm_vec2 from blorp into the builder While we're here, fix a typo which caused it to actually return a vec4 with the third and fourth components zero. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Karol Herbst	606b74035e	lima: use nir_src_as_float Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-14 22:25:56 +02:00
Karol Herbst	fe8c57e859	freedreno/ir3: use nir_src_as_uint in a few places v2 (Jason Ekstrand): - Add even more places Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Karol Herbst	bbf2ecaf35	intel/nir: use nir_src_is_const and nir_src_as_uint Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Jason Ekstrand	6b1c398bcb	intel/nir: Take a nir_tex_instr and src index in brw_texture_offset This makes things a bit simpler and it's also more robust because it no longer has a hard dependency on the offset being a 32-bit value.	2019-04-14 22:25:56 +02:00
Karol Herbst	2a36699ed3	radv: use nir constant helpers Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Karol Herbst	adb2263014	amd/nir: some cleanups Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Alyssa Rosenzweig	1e2cb3e964	panfrost/midgard: Use shared nir_lower_viewport_transform v2: Run before lowering I/O. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-14 19:16:29 +00:00
Alyssa Rosenzweig	2ce4adefa5	nir: Add nir_lower_viewport_transform On Mali hardware (supported by Panfrost and Lima), the fixed-function transformation from world-space to screen-space coordinates is done in the vertex shader prior to writing out the gl_Position varying, rather than in dedicated hardware. This commit adds a shared NIR pass for implementing coordinate transformation and lowering gl_Position writes into screen-space gl_Position writes. v2: Run directly on derefs before io/vars are lowered to cleanup the code substantially. Thank you to Qiang for this suggestion! v3: Bikeshed continues. v4: Add to Makefile.sources (per Jason's comment). Bikeshed comment. Ian and Qiang's reviews are from v3, but no real functional changes from v4. Rob's review is from v4. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Suggested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-14 19:15:13 +00:00
Alyssa Rosenzweig	89b02bffcb	panfrost: Cleanup indexed draw handling As part of this cleanup, we use the newly-exposed u_vbuf_get_minmax_index, deduplicating quite a bit of bookkeeping. We also centralize the draw_flags tracking to make this code cleaner / futureproofed; we have already had bugs regarding this field so we might as well get it right now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-14 15:25:46 +00:00
Alyssa Rosenzweig	74b17b9a9f	panfrost/midgard: Drop dependence on mesa/st This was used as a workaround for uniform sizing which was fixed in `771adffe` ("st: Lower uniforms in st in the...") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-14 15:25:46 +00:00
Mauro Rossi	1af7701666	draw: fix building error in draw_gs_init() Fixes the following building error happening with Android build system: external/mesa/src/gallium/auxiliary/draw/draw_gs.c:740:79: error: address of array 'draw->gs.tgsi.machine->PrimitiveOffsets' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] if (!draw->gs.tgsi.machine->Primitives[i] \|\| !draw->gs.tgsi.machine->PrimitiveOffsets) ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~ 1 error generated. Fixes: `7720ce3` ("draw: add support to tgsi paths for geometry streams. (v2)") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-04-14 18:07:02 +10:00
Qiang Yu	b46b661f53	lima/gpir: fix alu check miss last store slot Fixes: `92d7ca4b1c` "gallium: add lima driver" Signed-off-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-04-14 12:10:23 +08:00
Qiang Yu	8d91cd64aa	lima/gpir: fix compile fail when two slot node Come from glmark2-es2 jellyfish test. Fixes: `92d7ca4b1c` "gallium: add lima driver" Signed-off-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-04-14 12:10:23 +08:00
Vasily Khoruzhick	fef2f10cc2	lima: add support for depth/stencil fbo attachments and textures Hardware supports writing back Z/S buffers and sampling from them, so add support for that. Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Icenowy Zheng <icenowy@aosc.io>	2019-04-14 01:16:00 +00:00
Vasily Khoruzhick	a817f0fec6	lima: use individual tile heap for each GP job. Looks like it's somehow used by subsequent PP job, so we have to preserve its contents until PP job is done. Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Icenowy Zheng <icenowy@aosc.io>	2019-04-14 01:16:00 +00:00
Christian Gmeiner	b6bed115a5	nir: add lower_ftrunc Port TGSI TRUNC lowering to nir Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-13 17:54:48 +00:00
Mauro Rossi	e538dd67de	android: fix LLVM version string related building errors Adding \ prior to " in llvm version string fixes the following building errors: external/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1290:14: error: expected ')' ", LLVM " MESA_LLVM_VERSION_STRING ^ <command line>:8:34: note: expanded from here ^ external/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1287:10: note: to match this '(' snprintf(rscreen->renderer_string, sizeof(rscreen->renderer_string), ^ 1 error generated. Fixes: 05b114e ("simplify LLVM version string printing") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-04-13 18:56:14 +02:00
Lionel Landwerlin	9e7b0988d6	anv: leave the top 4Gb of the high heap VMA unused In `628c9ca908` I forgot to apply the same -4Gb of the high address of the high heap VMA. This was previously computed in the HIGH_HEAP_MAX_ADDRESS. Many thanks to James for pointing this out. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Xiong, James <james.xiong@intel.com> Fixes: `628c9ca908` ("anv: store heap address bounds when initializing physical device") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-13 12:08:23 +00:00
Eric Anholt	dc402be73e	v3d: Use the new lower_to_scratch implementation for indirects on temps. We can use the same register spilling infrastructure for our loads/stores of indirect access of temp variables, instead of doing an if ladder. Cuts 50% of instructions and max-temps from 2 KSP shaders in shader-db. Also causes several other KSP shaders with large bodies and large loop counts to not be force-unrolled. The change was originally motivated by NOLTIS slightly modifying register pressure in piglit temp mat4 array read/write tests, triggering register allocation failures.	2019-04-12 16:16:58 -07:00
Jason Ekstrand	18ed82b084	nir: Add a pass for selectively lowering variables to scratch space This commit adds new nir_load/store_scratch opcodes which read and write a virtual scratch space. It's up to the back-end to figure out what to do with it and where to put the actual scratch data. v2: Drop const_index comments (by anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-12 15:59:31 -07:00
Eric Anholt	8a2d91e124	v3d: Detect the correct number of QPUs and use it to fix the spill size. We were missing a * 4 even if the particular hardware matched our assumption.	2019-04-12 15:59:31 -07:00
Eric Anholt	11ba8a46e4	v3d: Add missing dumping for the spill offset/size uniforms.	2019-04-12 15:59:31 -07:00
Eric Anholt	42cf57f186	v3d: Add missing base offset to CS shared memory accesses. This code is so touchy, trying to emit the minimum amount of address math. Some day we'll move it all to NIR, I hope.	2019-04-12 15:59:31 -07:00
Eric Anholt	6b1c659825	v3d: Add Compute Shader compilation support. While waiting for the CSD UABI to get reviewed, I keep having to rebase the CS patch. Just land the compiler side for now to keep it from diverging. For now this covers just GLES 3.1 compute shaders, not CL kernels.	2019-04-12 15:59:31 -07:00
Eric Anholt	1e0a72ce09	v3d: Replace the old shader-db env var output with the ARB_debug_output. We're using ARB_debug_output for the main shader-db, but I had this env var left around from the shader-db-2 support (vc4 apitrace-based). Keep the env var around since it's nice sometimes to get the stats on a shader you're optimizing without having to do a shader-db run, but drop the old formatting that's not useful and keeps tricking me when I go to add another measurement to the shader-db output.	2019-04-12 15:59:31 -07:00
Eric Anholt	b02dbaa8ce	v3d: Include the number of max temps used in the shader-db output. This gives us finer-grained feedback on how we're doing on register pressure than "did we trigger a new shader to spill or drop thread count?"	2019-04-12 15:59:24 -07:00
Eric Anholt	276ec879fd	v3d: Drop a note for the future about PIPE_CAP_PACKED_UNIFORMS.	2019-04-12 15:58:28 -07:00
Eric Anholt	89b7df552b	v3d: Add and use a define for the number of channels in a QPU invocation. A shader invocation always executes 16 channels together, so we often end up multiplying things by this magic 16 number. Give it a name.	2019-04-12 15:58:28 -07:00
Eric Anholt	b88ef3bd76	nir: Add a comment about how intrinsic definitions work. I was thinking about a refactor, and needed to read this first. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-12 15:56:12 -07:00
Eric Anholt	35355b4860	nir: Drop remaining references to const_index in favor of the call to use. Please don't make me read a const_index[] expression ever again. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-12 15:56:04 -07:00
Eric Anholt	6e4d3d0a2f	nir: Drop comments about the constant_index slots for load/stores. The constant_index slots are named right there in the intrinsic definition, and the comment is just a chance to get out of sync. Noticed while reviewing the lower_to_scratch changes that copy-and-pasted wrong comments, and load_ubo and load_per_vertex_output had incorrect comments currently. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-12 15:55:55 -07:00
Sagar Ghuge	066d2aebc0	intel/fs: Remove unused condition from opt_algebraic case We will never hit a condition where we have src1 and src2 as immediate operands. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-12 13:47:57 -07:00
Kenneth Graunke	9e0c744f07	glsl: Set location on structure-split sampler uniform variables gl_nir_lower_samplers_as_deref splits structure uniform variables, creating new variables for individual fields. As part of that, it calculates a new location. It then never set this on the new variables. Thanks to Michael Fiano for finding this bug. Fixes crashes on i965 with Piglit's new tests/spec/glsl-1.10/execution/samplers/uniform-struct test, which was reduced from the failing case in Michael's app. Fixes: `f003859f97` nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-12 10:35:08 -07:00
Mateusz Krzak	f4fc2ece57	panfrost: use os_mmap and os_munmap 32-bit needs mmap64 for 64-bit offsets. We get 64-bit offsets from kernel. Signed-off-by: Mateusz Krzak <kszaquitto@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-12 16:33:00 +00:00
Mateusz Krzak	411da8b80d	panfrost: cast bo_handles pointer to uintptr_t first Required for 64-bit kernel to interpret the pointer from 32-bit userspace. Signed-off-by: Mateusz Krzak <kszaquitto@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-12 16:33:00 +00:00
Jason Ekstrand	7eaaff18cb	anv/pipeline: Fix MEDIA_VFE_STATE::PerThreadScratchSpace on gen7 We were always programming it with the Broadwell convention which is too large by a factor of two on Haswell and just plain wrong on IVB and BYT. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2019-04-12 16:08:35 +00:00
Eric Engestrom	da1a5a19bd	gitlab-ci: add lima to the build Suggested-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-12 15:43:19 +00:00
Marek Olšák	f4ae188d50	ac: use the common helper ac_apply_fmask_to_sample Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-12 11:35:31 -04:00
Marek Olšák	971bc10177	radeonsi: set AC_FUNC_ATTR_READNONE for image opcodes where it was missing Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-12 11:34:39 -04:00
Marek Olšák	467ff6ebfe	mesa: don't overwrite existing shader files with MESA_SHADER_CAPTURE_PATH Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-04-12 11:34:39 -04:00
Marek Olšák	bd2995c8b7	glsl: allow the #extension directive within code blocks for the dri option for Viewperf 13 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-12 11:34:39 -04:00
Samuel Pitoiset	6718bb57ac	ac/nir: remove some useless integer casts for ALU operations Sources are always casted to integers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	8a6442075f	ac/nir: remove useless integer cast in visit_image_load() ac_build_image_opcode() casts if necessary and buffer images are casted too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	ffbb62f808	ac/nir: remove useless integer cast in adjust_sample_index_using_fmask() It's already casted if necessary in ac_build_image_opcode(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	7b5b27a685	ac/nir: remove useless LLVMGetUndef for nir_op_pack_64_2x32_split Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	fd4041987b	ac: add ac_build_load_helper_invocation() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	590a4c8981	ac: add ac_build_ddxy_interp() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	4cb13e9462	ac: add ac_build_umax() and use it where possible This changes the predicate from LessThan to Equal. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	cf88bfa75a	ac/nir: make use of ac_build_umin() where possible This changes the predicate from LessThan to Equal. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:54 +02:00
Samuel Pitoiset	15dd81913f	ac/nir: make use of ac_build_imin() where possible This changes the predicate from LessThan to Equal. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:54 +02:00
Samuel Pitoiset	d7a0c0d53b	ac/nir: make use of ac_build_imax() where possible Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:54 +02:00
Karol Herbst	a55c7352d6	lima: add bool parameter to type_size function Fixes: `035759b61b` ("nir/i965/freedreno/vc4: add a bindless bool to type size functions") Signed-off-by: Karol Herbst <kherbst@redhat.com> Tested-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-12 17:08:53 +02:00
Karol Herbst	98934e6aa1	nvc0/nir: enable bindless texture Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	89a81fbd98	nv50/ir/nir: add support for bindless images Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	b286cdedb7	nv50/ir/nir: handle bindless texture Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-04-12 09:02:59 +02:00
Timothy Arceri	d62d434fe9	ac/nir_to_llvm: add image bindless support With this all piglit bindless image tests pass on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Timothy Arceri	55fb93b586	ac/nir_to_llvm: make get_sampler_desc() more generic and pass it the image intrinsic This will be required by the bindless support in the following patches. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	4a3c04a11f	glsl/nir: add support for lowering bindless images_derefs v2: handle atomics as well make use of nir_rewrite_image_intrinsic v3: remove call to nir_remove_dead_derefs v4: (Timothy Arceri) dont actually call lowering yet Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	0b2e8d9e17	glsl/nir: fetch the type for images from the deref instruction fixes retrieving the sampler type for bindless images stored inside structs. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	d7bbb3caf1	glsl_to_nir: handle bindless textures v2: add support for AMD Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Timothy Arceri	035759b61b	nir/i965/freedreno/vc4: add a bindless bool to type size functions This required to calculate sizes correctly when we have bindless samplers/images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	3b2a9ffd60	nir: move brw_nir_rewrite_image_intrinsic into common code Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Icenowy Zheng	400f0bfba1	lima: lower bool to float when building shaders Both processors of Mali Utgard are float-only, so bool is not an acceptable data type for them. Fortunately the NIR compiler infrastructure has a lower pass to lower bool to float. Call this lower pass to lower bool to float for both GP and PP. With this, Glamor on Xorg server 1.20.3 at least doesn't hang when starting gtk3-demo. The old map of nir op bcsel is changed to fcsel, and the map of b2f32 in PP is dropped because it's not needed now (it was originally only mapped to ppir_op_mov). Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-12 13:40:47 +08:00
Tomeu Vizoso	8f1c686bca	panfrost: Guard against reading past end of buffer Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-12 07:12:17 +02:00
Tomeu Vizoso	c35ae93803	panfrost: split asserts in pandecode Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-12 07:11:52 +02:00
Dave Airlie	604d89c2d1	llvmpipe: fix undefined shift 1 << 31. Pointed out by coverity. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-12 08:54:02 +10:00
Dave Airlie	4690f90728	swrast: fix undefined shift of 1 << 31 Pointed out by coverity Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-12 08:53:59 +10:00
Dave Airlie	e4ed08873b	draw: fix undefined shift of (1 << 31) Pointed out by a coverity scan. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-12 08:53:10 +10:00
Kenneth Graunke	4fcb749044	iris: Actually pin the scratch BO. We were pinning it for compute shaders, and pinning it when restoring saved buffers, but we never actually pinned it in the original batch for VS/TCS/TES/GS/FS. Fixes rendering in GFXBench5's Tessellation demo and a bunch of Piglit geometry shader tests.	2019-04-11 15:03:27 -07:00
Lionel Landwerlin	628c9ca908	anv: store heap address bounds when initializing physical device We can then reuse those bounds to initialize the VMA heaps at logical device creation. This fixes an issue on EHL which has only 36bits of VMA. We were incorrectly using the fixed 48bits upper bound to initialize the logical device heap, resulting in addresses beyond the device's limits. v2: Don't confuse heap size (limited by system memory) and VMA size (limited by number of addressing bits the platform has) v3: Fix low heap vma_size :( (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: James Xiong <james.xiong@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2019-04-11 22:56:43 +01:00
Jason Ekstrand	316a98dec9	intel/common: Support bigger right-shifts with mi_builder Because why not?	2019-04-11 18:04:09 +00:00
Jason Ekstrand	0d6dea0ac8	anv/cmd_buffer: Use gen_mi_sub instead of gen_mi_add with a negative Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	d17dd46b09	anv: Move mi_memcpy and mi_memset to gen_mi_builder Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	bacb21fc6b	anv: Use gen_mi_builder for queries Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	48da45891e	anv: Use gen_mi_builder for conditional rendering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	a3b0894afc	anv: Use gen_mi_builder for indirect dispatch Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	b829dc30c1	anv: Use gen_mi_builder for indirect draw parameters Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	0122a6f037	anv: Use gen_mi_builder for computing resolve predicates Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	83b46ad6d8	anv: Use gen_mi_builder for CmdDrawIndirectByteCount Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	8b8deeca78	intel/common: Add unit tests for gen_mi_builder Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	2f7fcd103e	intel/common: Add a MI command builder Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Eric Anholt	8f065596d2	v3d: Add an optimization pass for redundant flags updates. Our exec masking introduces lots of redundant flags updates, and even without that there will be cases where NIR comparisons on the same sources for different reasons may generate the same comparison instruction before the selection. total instructions in shared programs: 6492930 -> 6460934 (-0.49%) total uniforms in shared programs: 2117460 -> 2115106 (-0.11%) total spills in shared programs: 4983 -> 4987 (0.08%) total fills in shared programs: 6408 -> 6416 (0.12%)	2019-04-11 09:24:02 -07:00
Lubomir Rintel	3dd2001993	kmsro: Extend to include armada-drm This allows using the Marvell Armada display controllers (with the armada drm modesetting driver) along with the render-only drivers, such as Etnaviv on an OLPC XO-1.75 laptop. v2: - Add to Android.mk too Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-11 15:53:29 +00:00
Icenowy Zheng	a155c26a66	lima: implement blit with util_blitter As we have already prepared for using util_blitter, use it to implement lima_blit. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 13:45:51 +00:00
Icenowy Zheng	318ccbe7b2	lima: make lima_context_framebuffer subtype of pipe_framebuffer_state Currently the lima driver saves the framebuffer state in its from-scratch struct lima_context_framebuffer. However, util_blitter requires to save framebuffer with standard struct pipe_framebuffer_state. Make the lima_context_framebuffer a subtype of the standard pipe_framebuffer_state, thus the standard part can be used for util_blitter framebuffer state saving. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 13:45:51 +00:00
Icenowy Zheng	8d27bc351f	lima: add dummy set_sample_mask function The set_sample_mask function is required in util_blitter. Add a dummy one to make util_blitter work. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 13:45:51 +00:00
Eric Engestrom	8c780e54a3	gitlab-ci: build gallium extra hud Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-11 13:15:18 +00:00
Eric Engestrom	c77acc3ceb	meson: remove meson-created megadrivers symlinks Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110356 Fixes: `aa7afe324c` "meson: strip rpath from megadrivers" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-11 12:40:16 +00:00
Timothy Arceri	9e3740c47f	nir: initialise some variables in opt_if_loop_last_continue() Fixes a couple of Coverity warnings CID 1444626. Fixes: `e30804c602` ("nir/radv: remove restrictions on opt_if_loop_last_continue()") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-04-11 20:38:03 +10:00
Juan A. Suarez Romero	83f1b0e95b	nir/xfb: do not use bare interface type In commit `3b3653c4cf` we decided not to use bare types; hence do not use bare type when comparing with interface type to find out if the xfb variable is an array block. This fixes dEQP-VK.transform_feedback.* tests. Fixes: `3b3653c4cf` ("nir/spirv: don't use bare types, remove assert in split vars for testing") CC: Dave Airlie <airlied@redhat.com> CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-11 11:52:45 +02:00
Michel Dänzer	b48e64f903	gitlab-ci: Run CI pipeline for all branches in the main repository In turn, do not run the pipeline for the master branch in forked repositories. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-11 11:22:41 +02:00
Erik Faye-Lund	b60a13d5cb	virgl: use debug_printf instead of fprintf While we're at it, prefix the string with "VIRGL: ", to match similar code elsewhere in virgl. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-11 09:53:25 +02:00
Erik Faye-Lund	7394ef4a72	virgl: do not warn about display-target binding We never want to display a transfer-temp surface, so let's ignore that flag when calculating the new binding flags. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-11 09:53:22 +02:00
Erik Faye-Lund	27d94a83cd	virgl: only warn about unchecked flags The other flags are already vetted, so there's no point in reporting them. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-11 09:53:15 +02:00
Erik Faye-Lund	8f1a147d68	virgl: unsigned int -> unsigned We don't usually spell out the int part of unsigned. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-11 09:53:10 +02:00
Tapani Pälli	ef923088d2	egl: setup fds array correctly when exporting dmabuf For formats with multiple planes, application will pass a num_planes sized fds array which should be initialized properly in case fds amount utilized by the driver is less than the number of planes. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-11 10:16:03 +03:00
Dylan Baker	4122f55574	docs: update calendar, and news item and link release notes for 19.0.2	2019-04-10 20:51:58 -07:00
Dylan Baker	9cb011e7c8	docs: Add sha256 sums for 19.0.2	2019-04-10 20:50:41 -07:00
Dylan Baker	9725c59756	docs: Add release notes for 19.0.2	2019-04-10 20:50:39 -07:00
Jan Vesely	6ec9733b9f	gallium/aux: Report error if loading of a pipe driver fails. Skip over non-existent files. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-10 22:17:09 -04:00
Rob Herring	2b780fe893	kmsro: Add platform support for exynos and sun4i v2: - add Android.mk change Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 09:57:53 +08:00
Rob Herring	b1da1946c7	kmsro: Add lima renderonly support Enable using lima for KMS renderonly. This still needs KMS driver name mapping to kmsro to be used automatically. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 09:57:53 +08:00
Qiang Yu	92d7ca4b1c	gallium: add lima driver v2: - use renamed util_dynarray_grow_cap - use DEBUG_GET_ONCE_FLAGS_OPTION for debug flags - remove DRM_FORMAT_MOD_ARM_AGTB_MODE0 usage - compute min/max index in driver v3: - fix plbu framebuffer state calculation - fix color_16pc assemble - use nir_lower_all_source_mods for lowering neg/abs/sat - use float array for static GPU data - add disassemble comment for static shader code - use drm_find_modifier v4: - use lima_nir_lower_uniform_to_scalar v5: - remove nir_opt_global_to_local when rebase Cc: Rob Clark <robdclark@gmail.com> Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Koen Kooi <koen@dominion.thruhere.net> Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: marmeladema <xademax@gmail.com> Signed-off-by: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Rohan Garg <rohan@garg.io> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 09:57:53 +08:00
Qiang Yu	64eaf60ca7	drm-uapi: add lima_drm.h Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 09:57:53 +08:00
Qiang Yu	d26faef2e9	gallium/u_vbuf: export u_vbuf_get_minmax_index This helper function can be used by driver which always need min/max index. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-04-11 09:57:53 +08:00
Qiang Yu	dc37942c4e	u_dynarray: add util_dynarray_grow_cap This is for the case that user only know a max size it wants to append to the array and enlarge the array capacity before writing into it. v2: - rename newsize to newcap - rename util_dynarray_enlarge to util_dynarray_grow_cap Signed-off-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-11 09:57:53 +08:00
Qiang Yu	509dd6e20b	u_math: add ushort_to_float/float_to_ushort v2: - return 0 for NaN too Signed-off-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-04-11 09:57:53 +08:00
Guido Günther	c73fd79cee	gallium: trace: Add missing fence related wrappers Without that kmscube with GALLIUM_TRACE would segfault like: #0 0x0000000000000000 in () #1 0x0000ffff8f311760 in dri2_create_fence_fd (_ctx=0xaaaae266b8b0, fd=10) at ../src/gallium/state_trackers/dri/dri_helpers.c:122 #2 0x0000ffff90788670 in dri2_create_sync (drv=0xaaaae2667910, disp=0xaaaae26691f0, type=12612, attrib_list=0xaaaae26b9290) at ../src/egl/drivers/dri2/egl_dri2.c:2993 #3 0x0000ffff90776a9c in _eglCreateSync (disp=0xaaaae26691f0, type=12612, attrib_list=0xaaaae26b9290, orig_is_EGLAttrib=0, invalid_type_error=12292) at ../src/egl/main/eglapi.c:1823 #4 0x0000ffff90776be4 in eglCreateSyncKHR (dpy=0xaaaae26691f0, type=12612, int_list=0xfffff662e828) at ../src/egl/main/eglapi.c:1848 Signed-off-by: Guido Günther <agx@sigxcpu.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-10 21:31:16 -04:00
Mark Janes	eda36feb2b	intel/tools: Remove redundant definitions of INTEL_DEBUG INTEL_DEBUG is declared extern and defined in gen_debug.c Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-10 13:15:33 -07:00
Mark Janes	2393cc7f00	intel/common: move gen_debug to intel/dev libintel_common depends on libintel_compiler, but it contains debug functionality that is needed by libintel_compiler. Break the circular dependency by moving gen_debug files to libintel_dev. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-10 13:15:33 -07:00
Mike Blumenkrantz	03d6d01fe2	iris: support INTEL_NO_HW environment variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-10 12:59:17 -07:00
Jian-Hong Pan	7295487c6d	intel: Fix the description of Coffeelake pci-id 0x3E98 According to Intel website [1], the description of chipset 8086:3E98 is Intel(R) UHD Graphics 630. Besides, xserver also mentions it as "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)" in commit d3a26bbf (DRI2: Add another Coffeelake PCI ID) [2]. This patch modifies the description to sync with xserver. [1]: https://ark.intel.com/content/www/us/en/ark/products/134896/intel-core-i5-9600k-processor-9m-cache-up-to-4-60-ghz.html [2]: `d3a26bbf61` Fixes: commit `44f1dcf9b3` "i965: Add a new CFL PCI ID." Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Anuj Phogat anuj.phogat@gmail.com	2019-04-10 12:31:00 -07:00
Jan Vesely	460846981a	Partially revert "gallium: fix autotools build of pipe_msm.la" This partially reverts commit `356ec7a219`. There are symbols needed by libglsl missing, so we might as well skip the entire library. Fixes: `356ec7a219` Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-04-10 14:52:52 -04:00
Eric Anholt	afad1f7d62	vc4: Upload CS/VS UBO uniforms together. Same as I did for V3D, drop all this code trying to GC the non-indirectly-loaded uniforms from the UBO that's used for indirect access of gallium cb[0]. While it does successfully drop some of those, it came at the cost of uploading the VS's indirect uniforms twice, for the bin and render versions of the shader. With the UBO loads simplified, I was also able to easily backport V3D's change to pack a UBO offset into the uniform_data[] field so that we don't need to do the add of the uniform base in the shader. As a bonus, now vc4 doesn't depend on mesa/st type_size functions. total uniforms in shared programs: 25514 -> 25490 (-0.09%) total instructions in shared programs: 77019 -> 76836 (-0.24%)	2019-04-10 11:45:30 -07:00
Eric Anholt	0204fb77e0	vc4: Split UBO0 and UBO1 address uniform handling. I'm going to extend how UBO0 works in a moment.	2019-04-10 11:45:30 -07:00
Eric Anholt	7347d09d6a	vc4: Don't forget to set the range when scalarizing our uniforms. In the next commit, we'll want this for handling UBO access clamping. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-10 11:45:30 -07:00
Eric Anholt	771adffec1	st: Lower uniforms in st in the !PIPE_CAP_PACKED_UNIFORMS case as well. PIPE_CAP_PACKED_UNIFORMS conflates several things: Lowering uniforms i/o at the st level instead of the backend, packing uniforms with no padding at all, and lowering to UBOs. Requiring backends to lower uniforms i/o for !PIPE_CAP_PACKED_UNIFORMS leads to the driver needing to either link against the type size function in mesa/st, or duplicating it in the backend. Given that all backends want this lower-io as far as I can tell, just move it to mesa/st to resolve the link issue and avoid the driver author needing to understand st's uniforms layout. Incidentally, fixes uniform layout failures in nouveau in: dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_fragment dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_vertex and I think in Lima as well. v2: fix indents Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-10 11:44:20 -07:00
Lionel Landwerlin	3053d5a4f2	anv: don't use default pipeline cache for hits for VK_EXT_pipeline_creation_feedback If the user didn't provide a pipeline cache and we're using the default internal pipeline cache, then we shouldn't consider a cache hit for VK_EXT_pipeline_creation_feedback as the application did not provide a cache. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6601e5d6fc` ("anv: implement VK_EXT_pipeline_creation_feedback") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-10 18:45:04 +01:00
Marek Olšák	53f715fafb	Revert "glsl: fix shader_storage_blocks_write_access for SSBO block arrays" This reverts commit `b7ca074cc0`. It broke a lot of tests.	2019-04-10 10:48:56 -04:00
Karol Herbst	0c4706563a	glsl/standalone: add GLES3.1 and GLES3.2 compatibility also set some constants for SSBOs. With that it can compile the shader from: dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.18 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-10 16:16:36 +02:00
Erik Faye-Lund	7c05c95d05	virgl: use debug_printf instead of fprintf While we're at it, prefix the string with "VIRGL: ", to match similar code elsewhere in virgl. Fixes: `d7b3196976` ("virgl: Return an error if we use fp64 on top of GLES") Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2019-04-10 14:27:45 +02:00
Gert Wollny	04e672257c	virgl: Enable passing arrays as input to fragment shaders This is needed to properly handle interpolateAt* when the input to be interpolated is passed as array in the original GLSL. Currently, the GLSL compiler would lower selecting the correct input so that the interpolant parameter to interpolateAt* is a temporary, and this can not be used to create a valid shader on the host side, because here the parameter must be a shader input. By allowing arrays to be passed, the created TGSI can be used to create proper GLSL. This is related to the virglrenderer bug https://gitlab.freedesktop.org/virgl/virglrenderer/issues/74 v2: Squash the two patches handling these flags into another Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-10 11:09:40 +02:00
Gert Wollny	872519c663	Gallium: Add new CAP that indicates whether IO array definitions can be shrunk PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS is added to indicate whether the TGSI pass to shrink IO arrays should be skipped to enforce the originally declared array sizes and locations instead. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-10 11:09:40 +02:00
Samuel Pitoiset	a182adfd83	wsi: allow to override the present mode with MESA_VK_WSI_PRESENT_MODE This is common to all Vulkan drivers and all WSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-10 09:10:01 +02:00
Samuel Pitoiset	09b4049be3	radv: enable VK_AMD_gpu_shader_half_float Should be safe to enable as all instructions seem to support 16-bit. Unfortunately, there is no CTS test. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-10 09:07:17 +02:00
Rhys Perry	fd1fc255d9	ac: add 16-bit support to ac_build_ddxy() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-10 09:05:58 +02:00
Samuel Pitoiset	bc6d486c78	ac/nir: fix nir_op_b2f16 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-10 09:05:55 +02:00
Lepton Wu	1f063c0bfb	virgl: Set bind when creating temp resource. virgl render complains about "Illegal resource" when running dEQP-EGL.functional.color_clears.single_context.gles2.rgb888_window, the reason is that a zero bind value was given for temp resource. Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-04-09 19:25:25 -07:00
Bas Nieuwenhuizen	028ce52739	radv: Add non-uniform indexing lowering. This patch does it as late as possible so the potential extra basic blocks don't inhibit other optimizations. Big thanks to Jason for writing the lowering pass. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-10 02:04:13 +02:00
Bas Nieuwenhuizen	282bacab4a	nir: Add access qualifiers on load_ubo intrinsic. Otherwise nir_lower_non_uniform_access crashes when it tries to get the access of a load_ubo. Fixes: `8ed583fe52` "spirv: Handle the NonUniformEXT decoration" Fixes: `e50ab2c0f2` "nir: Add access flags to deref and SSBO atomics" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-10 02:04:04 +02:00
Marek Olšák	b7ca074cc0	glsl: fix shader_storage_blocks_write_access for SSBO block arrays CTS: GL45-CTS.compute_shader.resources-max Fixes: `4e1e8f684b` "glsl: remember which SSBOs are not read-only and pass it to gallium" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-09 19:25:35 -04:00
Khaled Emara	f0fb73dcf6	freedreno: PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT unreachable statement There seems to be a duplicate return statement, as A2XX doesn't support shader buffers. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-09 17:31:06 -04:00
Lionel Landwerlin	ed009e68c5	genxml: sort xml files using new script Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 18:24:03 +01:00
Lionel Landwerlin	903e142f0d	genxml: add a sorting script Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 18:23:34 +01:00
Eric Engestrom	eb699c1575	bin: drop unused import from install_megadrivers.py Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-09 16:20:37 +00:00
Juan A. Suarez Romero	ec7a33af58	anv: advertise 8 subtexel/mipmap precision bits So far ANV was advertising 4 bits for both subTexelPrecisionBits and mipmapPrecisionBits. But these values were not actually verified. But it seems the right value is actually 8 bits for both cases. Unfortunately Intel PRM does not clarify how many bits the hardware uses. For the mipmap case, there is the following reference in PRM Volume 6 (3D Media GPGPU), specifically in LOD Computation Pseudocode: ``` Bias: S4.8 MinLod: U4.8 MaxLod: U4.8 Base: U4.1 MIPCnt: U4 SurfMinLod: U4.8 ResMinLod: U4.8 ``` We have other clues, though: - On one side, dEQP-VK.texture.explicit_lod.* tests fail when using 4 bits, but work when using 8 bits. These tests try to mimic the expected behaviour as closely as possible, and they use the reported subTexelPrecisionBits and mipmapPrecisionBits to get this. - On the other side, the equivalent driver for Windows is reporting 8 bits for both elements. Not sure if they got to verify it from the PRM or from a different source. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-09 15:28:42 +00:00
Boyuan Zhang	d507bcdcf2	st/va: reverse qt matrix back to its original order The quantiser matrix that VAAPI provides has been applied with inverse z-scan. However, what we expect in MPEG2 picture description is the original order. Therefore, we need to reverse it back to its original order. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110257 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-09 10:51:03 -04:00
Andres Gomez	75a3dd97aa	glsl/linker: location aliasing requires types to have the same width From the OpenGL 4.60.5 spec, section 4.4.1 Input Layout Qualifiers, Page 67, (Location aliasing): " Further, when location aliasing, the aliases sharing the location must have the same underlying numerical type and bit width (floating-point or integer, 32-bit versus 64-bit, etc.) and the same auxiliary storage and interpolation qualification." Additionally, we have improved the linker error descriptions. Specifically, when taking structs into account we were producing a linker error because we assumed that all components in each location were used and that would cause component aliasing. This is not accurate of the actual problem. Now, the failure specifies that the underlying numerical type incompatibility is the cause for the failure. Fixes the following piglit test: tests/spec/arb_enhanced_layouts/linker/component-layout/vs-to-fs-width-mismatch-double-float.shader_test v2: - Do not assert if we see invalid numerical types. These come straight from shader code, so we should produce linker errors if shaders attempt to do location aliasing on variables that are not numerical such as records. - While we are at it, improve error reporting for the case of numerical type mismatch to include the shader stage. v3: - Allow location aliasing of images and samplers. If we get these it means bindless support is active and they should be handled as 64-bit integers (Ilia) - Make sure we produce link errors for any non-numerical type for which we attempt location aliasing, not just structs. v4: - Rebased with minor fixes (Andres). - Added fixing tag to the commit log (Andres). v5: - Remove the helper function and check individually for the underlying numerical type and bit width (Timothy). - Implicitly, assume that any non-treated type which is checked for its underlying numerical type is either integer or float and has a defined bit width (Timothy). - Implicitly, assume that structs are the only non-treated non-numerical type (Timothy). - Improve the linker error descriptions and commit log (Andres). Fixes: `13652e7516` ("glsl/linker: Fix type checks for location aliasing") Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-09 12:56:50 +02:00
Gert Wollny	b999865f55	softpipe: Enable PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT The offset alignment must be set to s16 because the tile cache is implemented to require this. This enables ARB_buffer_texture_range and OES_texture_buffer for softpipe. The according deqp-gles31 tests pass. Also update the feature table. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-04-09 08:17:45 +00:00
Gert Wollny	8cf8dfe408	softpipe: Add an extra code path for the buffer texel lookup With buffers the addressing is done on a per-byte basis so the code path for normal textures doesn't work properly. Also add an assert to make sure that the bit count for storing the X coordinate is large enough. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-04-09 08:17:44 +00:00
Gert Wollny	47dd7c4054	softpipe: raise number of bits used for X coordinate texture lookup With buffers the addressing is done on a per-byte basis, and with a maximal block size of 16 bytes we have to take into account four more bits. For simplicity just remove the TEX_TILE_SIZE_LOG2, which is 5 bit. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-04-09 08:17:44 +00:00
Gert Wollny	11f219a5ee	softpipe: Don't use mag filter for gather op For the gather op no magnification filter is provided, so always use the filter given for minification (which is the linear filter) Fixes: `0dff1533f2` softpipe: Use mag texture filter also for clamped lod == 0 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-04-09 09:50:13 +02:00
Jason Ekstrand	6279074de1	nir: Get rid of global registers We have a pass to lower global registers to locals and many drivers dutifully call it. However, no one ever creates a global register ever so it's all dead code. It's time we bury it. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 00:29:36 -05:00
Jason Ekstrand	b28bad89b9	nir: Get rid of nir_register::is_packed All we ever do is initialize it to zero, clone it, print it, and validate it. No one ever sets or uses it. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 00:29:36 -05:00
Dave Airlie	ff852fdc05	virgl: add support for ARB_indirect_parameters The protocol changes are already in place for it. Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-04-09 14:25:01 +10:00
Dave Airlie	05ff2dbf13	virgl: add support for ARB_multi_draw_indirect This will pass the multi draw through to the host if it has support for it instead of using the st to emulate it Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-04-09 14:15:24 +10:00
Dave Airlie	316b785c59	virgl: add support for missing command buffer binding. When I added indirect support I forgot this, however to use it now we need to check for a new enough capability on the host side. Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-04-09 14:15:12 +10:00
Caio Marcelo de Oliveira Filho	899fd66b44	docs: Add NV_compute_shader_derivatives to 19.1.0 relnotes	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	45a4129392	anv: Implement VK_NV_compute_shader_derivatives Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	bd73531677	spirv: Add support for DerivativeGroup capabilities As defined in SPV_NV_compute_shader_derivatives. These control how the invocations are arranged in a CS when doing derivative and related operations (which are also enabled by the extension). Since we expect valid SPIR-V, we don't need to do more work at SPIR-V level to enable the derivative and related operations to be called. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	956226c8ba	iris: Enable NV_compute_shader_derivatives Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	f9b29c4a58	gallium: Add PIPE_CAP_COMPUTE_SHADER_DERIVATIVES To enable NV_compute_shader_derivatives, which allows derivatives (and texture lookups with implicit derivatives) in compute shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	c9d1569689	i965: Advertise NV_compute_shader_derivatives Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	94abc53030	intel/fs: Use NIR_PASS_V when lowering CS intrinsics This will make that step visible in NIR_PRINT=1. v2: Also use the macro for the cleanup passes. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	0425b34b79	intel/fs: Don't loop when lowering CS intrinsics This was needed when certain intrinsics were lowered to other ones that were defined by the same pass. After `060817b2` "intel,nir: Move gl_LocalInvocationID lowering to nir_lower_system_values" we don't need the loop anymore. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	3ee3024804	intel/fs: Add support for CS to group invocations in quads When using quads, instead of mapping the elements to the next 4 local invocation indices, we map the two next in the "current" row and two next in the "next row". A side effect is that a thread will execute the indices in a different order. We now perform the lowering of both local invocation ID and index together -- and don't rely anymore on lowering done by nir_lower_system_values. That is convenient when doing the math for quads, because we need X and Y to get the right invocation index. When the pass progresses, fold the constants and clean up to reduce the noise from the indexing math. This implements the derivative_group_quadsNV semantics from NV_compute_shader_derivatives. v2: Take subgroup_id into account, otherwise only values in the first subgroup would be used. (Jason) v3: Calculate invocation index and ID together, to avoid duplicating some math in the quads case when both index and ID are used. (Jason) v4: Don't call cleanup passes as part of the lowering, leave that to the call site. (Jason) Change calculation to use fewer instructions. (Jason) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v3) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
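The quad arrangement described in the commit above can be illustrated with a small sketch. It is not the code from the intel/fs lowering pass: the function name `quad_invocation_id` and the assumption that the workgroup width `size_x` is even are hypothetical, and the subgroup handling mentioned in v2 is left out. It only shows how four consecutive invocation indices can be placed as two elements in the current row and two in the next row.

```python
def quad_invocation_id(index, size_x):
    """Map a linear invocation index to (x, y) so that every four
    consecutive indices form a 2x2 quad.

    Hypothetical helper for illustration; assumes size_x is even.
    """
    quad, lane = divmod(index, 4)            # which quad, and position inside it
    quads_per_row = size_x // 2              # each quad is two invocations wide
    quad_y, quad_x = divmod(quad, quads_per_row)
    x = quad_x * 2 + (lane & 1)              # lanes 1 and 3 sit in the right column
    y = quad_y * 2 + (lane >> 1)             # lanes 2 and 3 sit in the "next row"
    return x, y


if __name__ == "__main__":
    # Show the first two quads of a workgroup that is 4 invocations wide.
    for i in range(8):
        print(i, quad_invocation_id(i, 4))
```

With this layout, indices 0 to 3 cover the 2x2 block at the origin, which is why both X and Y are needed to recover the invocation index.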
Caio Marcelo de Oliveira Filho	ef0339d5ea	intel/fs: Use TEX_LOGICAL whenever implicit lod is supported Make sure we include compute shaders that have a derivative group defined. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	fcbc5ccaae	nir: Don't set LOD=0 for compute shader that has derivative group When using NV_compute_shader_derivatives to set a derivative group, a compute shader supports texture with implicit LOD calculation, so don't set an explicit LOD. Note if the extension is used but the derivative group is not specified, it will default to LOD=0 as before. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho	d08a74d2bf	nir/algebraic: Lower CS derivatives to zero when no group defined In compute shaders if no derivative group is defined, the derivatives will always be zero. Specified in NV_compute_shader_derivatives. To make the check more convenient, add a "info" local variable to the generated code so we can refer to it in the Python rules. (Jason) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:32 -07:00
Caio Marcelo de Oliveira Filho	3c5ddaeacd	glsl: Parse and propagate derivative_group to shader_info NV_compute_shader_derivatives allows selecting between two possible arrangements (quads and linear) when calculating derivatives and certain subgroup operations in case of Vulkan. So parse and propagate those up to shader_info.h. v2: Do not fail when ARB_compute_variable_group_size is being used, since we are still clarifying what is the right thing to do here. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:32 -07:00
Caio Marcelo de Oliveira Filho	ca60f0b7ba	glsl: Enable texture builtins for NV_compute_shader_derivatives Renamed a few predicates from "fs_only" to be "derivative_only" (or similar pairs). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:32 -07:00
Caio Marcelo de Oliveira Filho	09a3273fe7	glsl: Enable derivative builtins for NV_compute_shader_derivatives Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:32 -07:00
Caio Marcelo de Oliveira Filho	289478ea89	glsl: Remove redundant conditions when asserting in_qualifier As the code evolved, we ended up with redundant conditions. Clean this up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:32 -07:00
Caio Marcelo de Oliveira Filho	163655b33e	mesa: Extension boilerplate for NV_compute_shader_derivatives Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-08 19:29:32 -07:00
Timothy Arceri	e30804c602	nir/radv: remove restrictions on opt_if_loop_last_continue() When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965. However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered. 28717 shaders in 14931 tests Totals: SGPRS: 1267317 -> 1267549 (0.02 %) VGPRS: 896876 -> 895920 (-0.11 %) Spilled SGPRs: 24701 -> 26367 (6.74 %) Code Size: 48379452 -> 48507880 (0.27 %) bytes Max Waves: 241159 -> 241190 (0.01 %) Totals from affected shaders: SGPRS: 23584 -> 23816 (0.98 %) VGPRS: 25908 -> 24952 (-3.69 %) Spilled SGPRs: 503 -> 2169 (331.21 %) Code Size: 2471392 -> 2599820 (5.20 %) bytes Max Waves: 586 -> 617 (5.29 %) The code size increase is related to Wolfenstein II; it seems to be largely due to an increase in phis rather than the existing jumps. This gives +10% FPS with Doom on my Vega56. Rhys Perry also benchmarked Doom on his VEGA64: Before: 72.53 FPS After: 80.77 FPS v2: disable pass on non-AMD drivers Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-09 11:29:41 +10:00
Dave Airlie	c6cf602121	softpipe: add support for vertex streams (v2) This enables the ARB_gpu_shader5 vertex streams on softpipe. v2: only enable when not using llvm. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-09 11:20:39 +10:00
Dave Airlie	7720ce32aa	draw: add support to tgsi paths for geometry streams. (v2) This hooks up the geometry shader processing to the TGSI support added in the previous commits. It doesn't change the llvm interface other than to keep things building. v2: fix some regressions caused by primitiveoffsets Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-09 11:19:38 +10:00
Dave Airlie	ddb9ad363d	softpipe: add support for indexed queries. We need indexed queries to retrieve the geom shader info. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-09 11:19:38 +10:00
Dave Airlie	00fe67c015	tgsi: add support for geometry shader streams. This adds support to retrieve the primitive counts for each stream, along with the offset for each primitive into the output array. It also adds support for parsing the stream argument to the emit and end instructions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-09 11:19:38 +10:00
Dave Airlie	333746011d	draw: add stream member to stats callback This just adds space for the member to the callback, doesn't change anything else. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-09 11:19:38 +10:00
Chia-I Wu	63b823130d	vulkan/wsi: make wl_drm optional When wl_drm is missing and the driver supports modifiers, use zwp_linux_dmabuf_v1 for the list of supported formats and for buffer creation. Limit the supported formats to those with modifiers, which are WL_DRM_FORMAT_{ARGB8888,XRGB8888} currently. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-04-09 00:42:30 +00:00
Chia-I Wu	5318858f35	vulkan/wsi: add wsi_wl_display_dmabuf Add wsi_wl_display_dmabuf for zwp_linux_dmabuf_v1-related states. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-04-09 00:42:30 +00:00
Chia-I Wu	fd7fecf59a	vulkan/wsi: add wsi_wl_display_drm Add wsi_wl_display_drm for wl_drm-related states. We will move formats into the struct in a later commit. Remove the unnecessary check for wl_registry_bind failures. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-04-09 00:42:30 +00:00
Chia-I Wu	22dcb080d9	vulkan/wsi: refactor drm_handle_format Refactor the switch statement in drm_handle_format out to wsi_wl_display_add_wl_format. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-04-09 00:42:30 +00:00
Chia-I Wu	2d214d9405	vulkan/wsi: create wl_drm wrapper as needed When modifiers are specified, we have to use dmabuf rather than wl_drm. We don't need the wrapper in that case. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-04-09 00:42:30 +00:00
Chia-I Wu	ab74937b2c	vulkan/wsi: move modifier array into wsi_wl_swapchain This avoids repeated checks for each wsi_wl_image. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-04-09 00:42:30 +00:00
Adam Jackson	52426ce4a9	drisw: Try harder to probe whether MIT-SHM works XQueryExtension merely tells you whether the extension exists, it doesn't tell you whether you're local enough for it to work. XShmQueryVersion is not enough to discover this either, you need to provoke the server to do actual work, and if it thinks you're remote it will throw BadRequest at you. So send an invalid ShmDetach and use the error code to distinguish local from remote. [airlied: fixed bug not resetting xshm_error to 0 on success, which made later stuff fail completely.] Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2019-04-09 09:50:24 +10:00
Jason Ekstrand	50f3535d1f	nir/search: Search for all combinations of commutative ops Consider the following search expression and NIR sequence: ('iadd', ('imul', a, b), b) ssa_2 = imul ssa_0, ssa_1 ssa_3 = iadd ssa_2, ssa_0 The current algorithm is greedy and, the moment the imul finds a match, it commits those variable names and returns success. In the above example, it maps a -> ssa_0 and b -> ssa_1. When we then try to match the iadd, it sees that ssa_0 is not b and fails to match. The iadd match will attempt to flip itself and try again (which won't work) but it cannot ask the imul to try a flipped match. This commit instead counts the number of commutative ops in each expression and assigns an index to each. It then does a loop and loops over the full combinatorial matrix of commutative operations. In order to keep things sane, we limit it to at most 4 commutative operations (16 combinations). There is only one optimization in opt_algebraic that goes over this limit and it's the bitfieldReverse detection for some UE4 demo. Shader-db results on Kaby Lake: total instructions in shared programs: 15310125 -> 15302469 (-0.05%) instructions in affected programs: 1797123 -> 1789467 (-0.43%) helped: 6751 HURT: 2264 total cycles in shared programs: 357346617 -> 357202526 (-0.04%) cycles in affected programs: 15931005 -> 15786914 (-0.90%) helped: 6024 HURT: 3436 total loops in shared programs: 4360 -> 4360 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 23675 -> 23666 (-0.04%) spills in affected programs: 235 -> 226 (-3.83%) helped: 5 HURT: 1 total fills in shared programs: 32040 -> 32032 (-0.02%) fills in affected programs: 190 -> 182 (-4.21%) helped: 6 HURT: 2 LOST: 18 GAINED: 5 Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-04-08 21:38:48 +00:00
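The combinatorial search described in the commit above can be sketched in a few lines. This is a simplified model, not the nir_search implementation: the tuple-based pattern format, the `defs` dictionary, and the function names are all assumptions made for illustration, and the cap of 4 commutative operations is not modelled. Each commutative op in the pattern gets an index, and the matcher retries with every combination of "flip this op's operands" bits.

```python
from itertools import product

COMMUTATIVE = {"iadd", "imul"}


def count_commutative(pattern):
    """Count the commutative ops in a pattern like ('iadd', ('imul', 'a', 'b'), 'b')."""
    if isinstance(pattern, str):
        return 0
    op, *srcs = pattern
    return (op in COMMUTATIVE) + sum(count_commutative(s) for s in srcs)


def match(pattern, value, defs, flips, binding, counter):
    """Match one pattern node against an SSA value under a fixed flip assignment."""
    if isinstance(pattern, str):
        # Pattern variable: bind on first use, and require agreement afterwards.
        return binding.setdefault(pattern, value) == value
    instr = defs.get(value)
    op, *srcs = pattern
    if instr is None or instr[0] != op or len(instr) - 1 != len(srcs):
        return False
    operands = list(instr[1:])
    if op in COMMUTATIVE:
        # Each commutative op consumes one flip bit, in pre-order.
        if flips[counter[0]]:
            operands.reverse()
        counter[0] += 1
    return all(match(p, v, defs, flips, binding, counter)
               for p, v in zip(srcs, operands))


def search(pattern, value, defs):
    """Try every combination of operand flips; return the first binding found."""
    n = count_commutative(pattern)
    for flips in product((False, True), repeat=n):
        binding = {}
        if match(pattern, value, defs, flips, binding, [0]):
            return binding
    return None


if __name__ == "__main__":
    # The example from the commit message: a greedy matcher binds b -> ssa_1
    # in the imul and then fails on the iadd.
    defs = {
        "ssa_2": ("imul", "ssa_0", "ssa_1"),
        "ssa_3": ("iadd", "ssa_2", "ssa_0"),
    }
    print(search(("iadd", ("imul", "a", "b"), "b"), "ssa_3", defs))
```

In this sketch the match succeeds on the second combination, where only the imul is flipped, which is exactly the retry the greedy algorithm could not request.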
Lionel Landwerlin	48e48b8560	intel: add dependency on genxml generated files Drivers using genxml will start compilation before generated files are created, so add a dependency to it. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Cc: mesa-stable@lists.freedesktop.org	2019-04-08 20:52:47 +00:00
Marek Olšák	4b63f57cbc	radeonsi: fix a crash when unbinding sampler states Acked-by: James Zhu <James.Zhu@amd.com>	2019-04-08 15:23:32 -04:00
Samuel Pitoiset	775191cd99	radv: fix getting the vertex strides if the bindings aren't contiguous Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110349 Fixes: `a66b186beb` ("radv: use typed buffer loads for vertex input fetches") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-08 21:17:15 +02:00
Lionel Landwerlin	ce790c96a9	anv: implement VK_KHR_swapchain revision 70 This revision allows for images to be : - created by reusing image parameters from swapchain - bound to memory from a swapchain v2: Add color attachment flag Use same implicit WSI parameters (tiling, samples, usage) v3: Fix missing break in vk_foreach_struct_const() switch (Lionel) v4: Fix accessing image aspects before android resolve (Tapani) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-04-08 18:27:02 +01:00
Eric Engestrom	ed91ca0629	vk/util: remove unneeded array index This is an array of 1, so [0] is the only content, and meson already flattens the list so this is unnecessary. Also, all the other uses of vk_api_xml don't do that. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-08 17:03:00 +00:00
Samuel Pitoiset	27b8f3ecc3	ac/nir: fix intrinsic names for atomic operations with LLVM 9+ This fixes the following LLVM error when using RADV_DEBUG=checkir: Intrinsic name not mangled correctly for type arguments! Should be: llvm.amdgcn.buffer.atomic.add.i32 i32 (i32, <4 x i32>, i32, i32, i1)* @llvm.amdgcn.buffer.atomic.add The cmpswap operation still uses the old intrinsic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-08 13:16:50 +02:00
Alyssa Rosenzweig	4209a27c61	panfrost: Remove "mali_unknown6" nonsense This structure was used maaaany moons ago as a placeholder for the varying meta (now unified with mali_attr_meta and essentially fully decoded). I don't know why it's still in the file. Let's wack it. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 16:05:42 +00:00
Alyssa Rosenzweig	b19d1a1e63	panfrost/midgard: Enable lower_find_lsb This is exactly what the blob does. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 16:01:49 +00:00
Alyssa Rosenzweig	65816ad6e8	panfrost/midgard: Add ibitcount8 op The mechanics of this opcode are a little opaque, but essentially, it's used in 8-bit mode to do a bit count in parallel of a uint and then doing a ton of clever iadd/imov ops to recombine. v2: Correct opcode. Thank you to jernej on IRC for noticing this awkward typo! Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 16:01:12 +00:00
Alyssa Rosenzweig	6cba9acb75	panfrost/midgard: Add ilzcnt op Used for implementing findLSB/MSB Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 16:00:39 +00:00
Alyssa Rosenzweig	2e7555b14b	panfrost/midgard: Add umin/umax opcodes Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 15:59:05 +00:00
Alyssa Rosenzweig	d84ee49027	panfrost: Add tilebuffer load? branch Also document branches better. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 15:58:44 +00:00
Alyssa Rosenzweig	7cccc89f80	panfrost/decode: Add flags for tilebuffer readback These flags are set when reading back the tilebuffer from a fragment shader via various mechanisms (including ARM_shader_framebuffer_fetch and EXT_pixel_local_storage). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 15:58:19 +00:00
Karol Herbst	1aabb79bdc	panfrost/midgard: use nir_src_is_const and nir_src_as_uint Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-07 15:56:10 +00:00
Jason Ekstrand	10a2fdacfa	vc4: Prefer nir_src_comp_as_uint over nir_src_as_const_value Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-07 15:13:36 +02:00
Karol Herbst	5450f1c9fb	v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-07 15:13:36 +02:00
Kenneth Graunke	4e802089bc	gallium/util: Add const to u_range_intersect This doesn't modify the range, so it can accept a const pointer. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-07 00:21:12 -07:00
Greg V	c5a6e72e15	gallium/hud: add CPU usage support for FreeBSD Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-04-07 06:47:57 +00:00
Kenneth Graunke	9c46046f79	iris: Silence unused variable warnings in release mode	2019-04-06 15:58:16 -07:00
Jason Ekstrand	ad8c145658	nir/algebraic: Add some logical OR and AND patterns The new OR pattern has been seen in the wild and can end up being generated by GLSLang. Not sure about the other two new patterns but we may as well throw them in for completeness. While we're here, we can drop the '@bool' specifier from the one pattern because specifying True already implies 1-bit which basically implies boolean. Shader-db results on Kaby Lake: total instructions in shared programs: 15321227 -> 15321129 (<.01%) instructions in affected programs: 3594 -> 3496 (-2.73%) helped: 6 HURT: 0 total cycles in shared programs: 357481321 -> 357479725 (<.01%) cycles in affected programs: 44109 -> 42513 (-3.62%) helped: 6 HURT: 0 VkPipeline-DB results on Kaby Lake: total instructions in shared programs: 3770504 -> 3769734 (-0.02%) instructions in affected programs: 19058 -> 18288 (-4.04%) helped: 163 HURT: 0 total cycles in shared programs: 1417583701 -> 1417569727 (<.01%) cycles in affected programs: 750958 -> 736984 (-1.86%) helped: 158 HURT: 1 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-05 18:39:06 -05:00
Jason Ekstrand	03a72d96d8	nir/algebraic: Drop some @bool specifiers Now that we have one-bit booleans, we don't need to rely on looking at parent instructions in order to figure out if a value is a Boolean most of the time. We can drop these specifiers and now the optimizations will apply more generally. Shader-DB results on Kaby Lake: total instructions in shared programs: 15321168 -> 15321227 (<.01%) instructions in affected programs: 8836 -> 8895 (0.67%) helped: 1 HURT: 31 total cycles in shared programs: 357481781 -> 357481321 (<.01%) cycles in affected programs: 146524 -> 146064 (-0.31%) helped: 22 HURT: 10 total spills in shared programs: 23675 -> 23673 (<.01%) spills in affected programs: 11 -> 9 (-18.18%) helped: 1 HURT: 0 total fills in shared programs: 32040 -> 32036 (-0.01%) fills in affected programs: 27 -> 23 (-14.81%) helped: 1 HURT: 0 No change in VkPipeline-DB Looking at the instructions hurt, a bunch of them seem to be a case where doing exactly the right thing in NIR ends up doing the wrong-ish thing in the back-end because flags are dumb. In particular, there's a case where we have a MUL followed by a CMP followed by a SEL and when we turn that SEL into an OR, it uses the GRF result of the CMP rather than the flag result so the CMP can't be merged with the MUL. Those shaders appear to schedule better according to the cycle estimates so I guess it's a win? Also it helps spilling in one Car Chase compute shader. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-05 18:39:00 -05:00
Andrii Simiklit	cade9001b1	util: clean the 24-bit unused field to avoid issues This is a field of FLOAT_32_UNSIGNED_INT_24_8_REV texture pixel. OpenGL spec "8.4.4.2 Special Interpretations" is saying: "the second word contains a packed 24-bit unused field, followed by an 8-bit index" The spec doesn't require us to clear this unused field however it makes sense to do it to avoid some undefined behavior in some apps. Suggested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110305 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-04-05 21:33:53 +00:00
Caio Marcelo de Oliveira Filho	c037dbb0ef	nir: Take if_uses into account when repairing SSA If a def is used as a condition before its definition, we should also consider this a case to repair. When repairing, make sure we rewrite any if conditions too. Found while inspecting a SPIR-V conversion from a 'continue block' that contains a conditional branch. We pull the continue block up to the beginning of the loop, and the condition in the branch ends up defined afterwards. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `364212f1ed` "nir: Add a pass to repair SSA form"	2019-04-05 09:43:46 -07:00
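As a rough illustration of the fix in the commit above, the sketch below packs one FLOAT_32_UNSIGNED_INT_24_8_REV pixel and zeroes the unused 24 bits of the second word. It is not the Mesa util code: the function name `pack_z32f_x24s8` is hypothetical, and it assumes little-endian storage with the 8-bit index in the low byte of the second word.

```python
import struct


def pack_z32f_x24s8(depth, stencil):
    """Pack a depth/stencil pixel as two 32-bit words.

    Hypothetical helper: word 0 is the float depth, word 1 keeps only the
    low 8 bits (the index) so the 24-bit unused field is always zero.
    """
    return struct.pack("<fI", depth, stencil & 0xFF)


if __name__ == "__main__":
    # Garbage in the upper 24 bits of the stencil word is dropped.
    print(pack_z32f_x24s8(1.0, 0xABCDEF12).hex())
```

Masking at pack time means readers that compare or hash the whole second word see the same value regardless of what was left in the unused bits.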
Marek Olšák	26e161b1e9	tegra: fix the build after the set_shader_buffers change	2019-04-05 11:18:39 -04:00
James Zhu	0f416b85fb	gallium/auxiliary/vl: Add barrier/unbind after compute shader launch. Add memory barrier sync for multiple launch cases, and unbind completed resources after launch. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-05 09:50:52 -04:00
James Zhu	4bbc9c493f	gallium/auxiliary/vl: Fixed blank issue with compute shader Multiple init buffer within one open instance will cause blank issue. Updating viewport per frame will fix this issue. Signed-off-by: James Zhu <James.Zhu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-05 09:50:52 -04:00
James Zhu	32b861d46d	gallium/auxiliary/vl: Fixed blur issue with weave compute shader Correct wrong interpolation with top/bottom row which caused blur issue. Signed-off-by: James Zhu <James.Zhu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-05 09:50:52 -04:00
Emil Velikov	a28dc6b57f	docs: update calendar, add news item and link release notes for 18.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-04-05 13:24:29 +01:00
Emil Velikov	d5ba84dc52	docs: add sha256 checksums for 18.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `eb9da68cbf`)	2019-04-05 13:20:26 +01:00
Emil Velikov	9b537f2d21	docs: add release notes for 18.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b03f51c4b4`)	2019-04-05 13:20:25 +01:00
Samuel Pitoiset	5eb17506e1	nir: do not pack varying with different types The current algorithm only supports packing 32-bit types. If a shader uses both 16-bit and 32-bit varyings, we shouldn't compact them together. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-05 13:57:42 +02:00
Gert Wollny	0dff1533f2	softpipe: Use mag texture filter also for clamped lod == 0 Follow the spec when selecting the magnification filter (OpenGL 4.5, section 8.14): If λ(x, y) is less than or equal to the constant c (see section 8.15) the texture is said to be magnified; While we're here also silence a potential warning about implicit float to double conversion. v2: Update commit message to contain a reference to the spec as pointed out by Eric. Fixes a number of dEQP GLES2 and GLES3 tests out of: dEQP-GLES2.functional.texture.filtering.* dEQP-GLES2.functional.texture.vertex.2d.filtering.* dEQP-GLES3.functional.texture.vertex.*.filtering.* dEQP-GLES3.functional.texture.filtering.* dEQP-GLES3.functional.texture.shadow.2d.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-05 09:07:45 +02:00
Tapani Pälli	361f3d19f1	iris: handle aux properly in iris_resource_get_handle Disable aux when a resource is seen for the first time and EXPLICIT_FLUSH is not set. This fixes issues seen when launching Xorg and CCS_E getting utilized. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-04 23:35:24 -07:00
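The spec rule quoted in the commit above reduces to a single comparison. The sketch below is illustrative only and is not softpipe code: the name `select_filter` and the string filter labels are hypothetical, and the constant c is passed in as a parameter rather than derived from the filter modes.

```python
def select_filter(lod, min_filter, mag_filter, c=0.0):
    """Pick the filter for a given level-of-detail value.

    Per the quoted rule, the texture is magnified when lod <= c, so a
    clamped lod of exactly 0 (with c == 0) selects the mag filter.
    """
    return mag_filter if lod <= c else min_filter


if __name__ == "__main__":
    print(select_filter(0.0, "linear", "nearest"))
    print(select_filter(0.25, "linear", "nearest"))
```

The `<=` is the point of the change: a strict `<` would fall through to the minification filter when the lod is clamped to exactly 0.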
Eric Anholt	276d22c52d	v3d: Add some more new packets for V3D 4.x. The T/G shader references and common state will be needed for GLES 3.2.	2019-04-04 17:30:35 -07:00
Eric Anholt	4c70f276bc	v3d: Don't try to use the TFU blit path if a scissor is enabled. We'll need to do a render-based blit for scissors, since the TFU (as seen in this conditional) can only update a whole surface. Fixes: `976ea90bdc` ("v3d: Add support for using the TFU to do some blits.") Fixes piglit fbo-scissor-blit.	2019-04-04 17:30:35 -07:00
Eric Anholt	62360e92ec	v3d: Bump the maximum texture size to 4k for V3D 4.x. 4.1 and 4.2 both have the same 16k limit, but I'm seeing GPU hangs in the CTS at 8k and 16k. 4k at least lets us get one 4k display working. Cc: mesa-stable@lists.freedesktop.org	2019-04-04 17:30:35 -07:00
Eric Anholt	e3063a8b2f	v3d: Add support for handling OOM signals from the simulator. I have v3d allocating enough initial allocation memory that we've been passing tests without it, but to match kernel behavior more it would be good to actually exercise the OOM path.	2019-04-04 17:30:35 -07:00
Illia Iorin	a113a42e73	mesa/main: Fix multisample texture initialize The sampler of multisample textures wasn't initialized correctly. So when a texture object is created as multisample, its sampler is initialized as a separate case. We change the initial state of TEXTURE_MIN_FILTER and TEXTURE_MAG_FILTER to NEAREST. These changes are approved by KhronosGroup. https://github.com/KhronosGroup/OpenGL-API/issues/45 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Signed-off-by: Illia Iorin <illia.iorin@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109057	2019-04-05 11:28:10 +11:00
Sergii Romantsov	a7d40a13ec	glsl: Fix input/output structure matching across shader stages Section 7.4.1 (Shader Interface Matching) of the OpenGL 4.30 spec says: "Variables or block members declared as structures are considered to match in type if and only if structure members match in name, type, qualification, and declaration order." Fixes: * layout-location-struct.shader_test v2: rebased against master and small fixes Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108250	2019-04-05 11:02:23 +11:00
Dave Airlie	738921afd9	ddebug: add compute functions to help hang detection Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-05 10:01:08 +10:00
Dave Airlie	0ea386128b	iris: avoid use after free in shader destruction While playing with compute shaders, I was getting a random crash, and noticed that bind_state was using the old shader info for comparison, but gallium allows the shader to be deleted while bound, so this could lead to a use after free. This can't happen using the cso cache, as it tracks all of this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-05 09:57:44 +10:00
Marek Olšák	42f63e6334	radeonsi: set exact shader buffer read/write usage in CS Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-04 19:28:52 -04:00
Marek Olšák	4e1e8f684b	glsl: remember which SSBOs are not read-only and pass it to gallium Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-04 19:28:52 -04:00
Marek Olšák	66a82ec6f0	gallium: add writable_bitmask parameter into set_shader_buffers to indicate write usage per buffer. This is just a hint (it will be used by radeonsi). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-04 19:28:52 -04:00
Danylo Piliaiev	b19494c54e	iris: Fix assert when using vertex attrib without buffer binding The GL 4.5 spec says: "If any enabled array’s buffer binding is zero when DrawArrays or one of the other drawing commands defined in section 10.4 is called, the result is undefined." The result is undefined but it should not crash. Fixes: gl-3.1-vao-broken-attrib Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-04 22:57:24 +00:00
Tapani Pälli	61cc379371	iris: move iris_flush_resource so we can call it from get_handle Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-04 13:36:51 -07:00
Kenneth Graunke	8d9e169bdd	iris: Save/restore MI_PREDICATE_RESULT, not MI_PREDICATE_DATA. MI_PREDICATE_DATA is an intermediate storage for the MI_PREDICATE command's calculations - it holds the result of the subtraction when the compare operation is SRCS_EQUAL or DELTAS_EQUAL. But the actual result of the predication is MI_PREDICATE_RESULT, which is what we want to copy from the render context to the compute context.	2019-04-04 11:41:10 -07:00
Eric Engestrom	d1dd3cbcc7	util/process: document memory leak We consider it acceptable, but let's still document it in case people notice it and are not sure why it's there. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-04-04 16:09:52 +00:00
Eric Engestrom	05b114e526	simplify LLVM version string printing Figure it out once in the build system, then just use that all over the place. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-04 16:08:11 +00:00
Guido Günther	593614f4d4	gallium/u_dump: util_dump_sampler_view: Dump u.tex.first_level Dump u.tex.first_level instead of dumping u.tex.last_level twice. Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-04 17:30:19 +02:00
Guido Günther	a5e24dc416	gallium: ddebug: Add missing fence related wrappers Without that `GALLIUM_DDEBUG=always kmscube -A` would segfault like #0 0x0000000000000000 in () #1 0x0000ffffa72a3c54 in dri2_get_fence_fd (_screen=0xaaaaed4f2090, _fence=0xaaaaed9ef880) at ../src/gallium/state_trackers/dri/dri_helpers.c:140 #2 0x0000ffffa8744824 in dri2_dup_native_fence_fd (drv=0xaaaaed5010c0, disp=0xaaaaed5029a0, sync=0xaaaaed9ef7c0) at ../src/egl/drivers/dri2/egl_dri2.c:3050 #3 0x0000ffffa87339b8 in eglDupNativeFenceFDANDROID (dpy=0xaaaaed5029a0, sync=0xaaaaed9ef7c0) at ../src/egl/main/eglapi.c:2107 #4 0x0000aaaabd29ca90 in () #5 0x0000aaaabd401000 in () Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-04-04 17:30:15 +02:00
Danylo Piliaiev	3fdfface3e	st/mesa: Fix GL_MAP_COLOR with glDrawPixels GL_COLOR_INDEX Documentation for glDrawPixels with GL_COLOR_INDEX says: "If the GL is in color index mode, and if GL_MAP_COLOR is true, the index is replaced with the value that it references in lookup table GL_PIXEL_MAP_I_TO_I" We are always in RGBA mode and there is nothing in documentation about GL_MAP_COLOR in RGBA mode for GL_COLOR_INDEX. Scale and bias are also only applicable for RGBA format and not mentioned for GL_COLOR_INDEX. Thus the behaviour will be on par with i965. Fixes: gl-1.0-drawpixels-color-index Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-04-04 10:38:32 -04:00
Eric Engestrom	f6ceed205c	gallium/hud: fix rounding error in nic bps computation While at it, fix typo in "rounding error" :P Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-04 13:59:24 +00:00
Eric Engestrom	9d6ea55263	gallium/hud: prevent buffer overflow Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-04 13:59:24 +00:00
Eric Engestrom	4633d13854	gallium/hud: fix memory leaks Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-04 13:59:24 +00:00
Marek Olšák	b563460b49	radeonsi: enable displayable DCC on Ravens	2019-04-04 09:53:24 -04:00
Marek Olšák	1f21396431	radeonsi: add support for displayable DCC for multi-RB chips A compute shader is used to reorder DCC data from aligned to unaligned.	2019-04-04 09:53:24 -04:00
Marek Olšák	2c09eb4122	radeonsi: add support for displayable DCC for 1 RB chips This is the simpler codepath - just disable RB and pipe alignment for DCC.	2019-04-04 09:53:24 -04:00
Marek Olšák	029bfa3d25	radeonsi: add ability to bind images as image buffers so that we can bind DCC (texture) as an image buffer.	2019-04-04 09:53:24 -04:00
Marek Olšák	fe3bfd7971	radeonsi/gfx9: add support for PIPE_ALIGNED=0 Needed by displayable DCC. We need to flush L2 after rendering if PIPE_ALIGNED=0 and DCC is enabled.	2019-04-04 09:53:24 -04:00
Marek Olšák	e457454cb6	amd/addrlib: fix uninitialized values for Addr2ComputeDccAddrFromCoord Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-04 09:30:40 -04:00
Tapani Pälli	41f76dd513	iris: move variable to the scope where it is being used iris_upload_border_color is passed a pointer which points to variable that is introduced in a different scope. CID: 1444296 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-04 04:43:20 +00:00
Tapani Pälli	3cea9f981a	st/nir: run st_nir_opts after 64bit ops lowering CID: 1444309 Fixes: `9ab1b1d022` "st/nir: Move 64-bit lowering later" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-04 07:38:10 +03:00
Alyssa Rosenzweig	b34d8222c7	panfrost: Size tiled temp buffers correctly This should lower transient memory usage and improve performance slightly (due to less memory to malloc/free, better cache locality, etc). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-04 03:51:43 +00:00
Alyssa Rosenzweig	c0183e8eed	panfrost: Respect box->width in tiled stores This fixes a regression uploading partial tiled textures introduced sometime during the cubemap series. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-04 03:51:43 +00:00
Alyssa Rosenzweig	3b38a7e505	panfrost: Cleanup some indirection in pan_resource Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-04 03:51:43 +00:00
Alyssa Rosenzweig	7e8de5a707	panfrost: Implement system values This patch implements system values via specially-crafted uniforms. While we previously had an ad hoc system for passing the viewport into the vertex shader, this commit generalizes the system to allow for arbitrary system values to be added to both shader stages. While we're at it, we clean up uniform handling code (which was considerably muddied to handle the ad hoc viewport uniform). This commit serves as both a cleanup of the existing codebase and the precursor to new functionality, like implementing textureSize(). Concurrent with these changes is respecting the depth transform, which was not possible with the old fixed uniform system and here serves as a proof-of-correctness test (as well as justifying the NIR changes). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-04-04 03:44:15 +00:00
Alyssa Rosenzweig	a83862754e	nir: Add "viewport vector" system values While a partial set of viewport system values exist, these are scalar values, which is a poor fit for viewport transformations on vector ISAs like Midgard (where the vec3 values for scale and offset each need to be coherent in a vec4 uniform slot to take advantage of vectorized transform math). This patch adds vec3 scale/offset fields corresponding to the 3D Gallium viewport / glViewport+depth Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-04 03:44:09 +00:00
Erik Faye-Lund	b85ca86c1e	virgl: also destroy all read-transfers For texture write-transfers, we either free them on the transfer-queue or right away. But for read-transfers, we currently only destroy them in case they used a temp-resource. This leads to occasional resource-leaks. Let's add a call to virgl_resource_destroy_transfer in the missing case. Do the same thing for buffers as well, but the logic is a bit easier to follow there. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `f0e71b1088` ("virgl: use transfer queue") Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-03 18:59:23 +02:00
Dylan Baker	4c332a1f9f	meson: Error if LLVM is turned off but clover is turned on Since clover has a hard requirement on LLVM v2: - make error message more specific Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-03 09:41:24 -07:00
Dylan Baker	29912f2ea4	meson: Error if LLVM doesn't have rtti when building clover We already do this for nouveau, but it's required for clover too.	2019-04-03 09:41:24 -07:00
Alyssa Rosenzweig	138865e676	panfrost: Remove support for legacy kernels Previously, there was minimal support for interoperating with legacy kernels (reusing kernel modules originally designed for proprietary legacy userspaces, rather than for upstream-friendly free software stacks). Now that the Panfrost kernel is stabilising, this commit drops the legacy code path. Panfrost users need to use a modern, mainline kernel supporting the Panfrost kernel driver from this commit forward. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-04-03 15:21:30 +00:00
Lucas Stach	43db0632e7	etnaviv: only try to construct scanout resource when on KMS winsys Trying to construct a scanout capable buffer will only ever work when we are on top of a KMS winsys, as the render node isn't capable of allocating contiguous buffers. Tested-by: Marius Vlad <marius.vlad@collabora.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-04-03 12:54:09 +02:00
Lucas Stach	3d8da347ac	etnaviv: flush all pending contexts when accessing a resource with the CPU When setting up a transfer to a resource, all contexts where the resource is pending must be flushed. Otherwise a write transfer might be started in the current context before all contexts that access the resource in shared (read) mode have been executed. Fixes: `64813541d5` (etnaviv: fix resource usage tracking across different pipe_context's) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-By: Guido Günther <agx@sigxcpu.org>	2019-04-03 12:54:09 +02:00
Lucas Stach	f317ee1aff	etnaviv: don't flush own context when updating resource use The context is self synchronizing at the GPU side, as commands are executed in order. We must not flush our own context when updating the resource use, as that leads to excessive flushing on effectively every draw call, causing huge CPU overhead. Fixes: `64813541d5` (etnaviv: fix resource usage tracking across different pipe_context's) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-04-03 12:54:09 +02:00
Christian Gmeiner	c7cddc2787	etnaviv: shrink struct etna_3d_state Drop struct members which are only written to but never read from. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-04-03 12:54:09 +02:00
Dave Airlie	11e1fa11d6	intel/compiler: use defined size for vector components If we increase vector sizing later it would be nice to avoid tripping over this again. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-03 13:59:06 +10:00
Dave Airlie	eb8fefe090	nir: use proper array sizing define for vectors If we increase the vector size in the future it would be good to not have to fix these up, this should change nothing at present. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-03 13:59:06 +10:00
Timothy Arceri	d8ce915a61	Revert "nir: propagate known constant values into the if-then branch" This reverts commit `4218b6422c`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110311	2019-04-03 13:24:18 +11:00
Timothy Arceri	4218b6422c	nir: propagate known constant values into the if-then branch Helps Max Waves / VGPR use in a bunch of Unigine Heaven shaders. shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 5505440 -> 5505872 (0.01 %) VGPRS: 3077520 -> 3077296 (-0.01 %) Spilled SGPRs: 39032 -> 39030 (-0.01 %) Spilled VGPRs: 16326 -> 16326 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 744 -> 744 (0.00 %) dwords per thread Code Size: 123755028 -> 123753316 (-0.00 %) bytes Compile Time: 2751028 -> 2560786 (-6.92 %) milliseconds LDS: 1415 -> 1415 (0.00 %) blocks Max Waves: 972192 -> 972240 (0.00 %) Wait states: 0 -> 0 (0.00 %) vkpipeline-db results RADV (VEGA): Totals from affected shaders: SGPRS: 160 -> 160 (0.00 %) VGPRS: 88 -> 88 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 18268 -> 18152 (-0.63 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 26 -> 26 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-03 10:04:48 +11:00
Lepton Wu	250fffac15	virgl: close drm fd when destroying virgl screen. This fd was created in virgl_drm_screen_create and should be closed in virgl_drm_screen_destroy. Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-04-02 15:29:47 -07:00
Rafael Antognolli	08c44b47a9	iris: Enable fast clears on gen8. Since we are now properly storing the clear color with SCS bits, we can now enable fast clears on gen8 too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-02 15:26:48 -07:00
Rafael Antognolli	7339660e80	iris: Add aux.sampler_usages. We want to skip some types of aux usages (for instance, ISL_AUX_USAGE_HIZ when the hardware doesn't support it, or when we have multisampling) when sampling from the surface. Instead of checking for those cases while filling the surface state and leaving it blank, let's have a version of aux.possible_usages for sampling. This way we can also avoid allocating surface state for the cases we don't use. Fixes: `a8b5ea8ef0` "iris: Add function to update clear color in surface state." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-02 15:26:45 -07:00
Rafael Antognolli	dfc5620a41	iris: Do not allocate clear_color_bo for gen8. Since we are not using it for the clear color, there's no need to allocate it. Fixes: `a8b5ea8ef0` "iris: Add function to update clear color in surface state." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-02 15:26:41 -07:00
Rafael Antognolli	c26d8a887d	iris: Manually apply fast clear color channel overrides. At the fast clear time, the only swizzle we have available is actually the identity swizzle (which we use for most rendering). So the call to swizzle_color_value() becomes simply a no-op, and doesn't properly zero out the unused channels. We have to manually override those channels. Fixes: `a8b5ea8ef0` "iris: Add function to update clear color in surface state." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-02 15:26:38 -07:00
Rafael Antognolli	2660667284	iris/gen8: Re-emit the SURFACE_STATE if the clear color changed. The swizzle for rendering surfaces is always identity. So when we are doing the fast clear, we don't have enough information to store the clear color OR'ed with the Shader Channel Select bits for the dword in the SURFACE_STATE. Instead of trying to patch up the SURFACE_STATE correctly later, by reading the color from the clear color state buffer and then doing all the operations to store it, let's just re-emit the whole SURFACE_STATE. That should make things way simpler on gen8, and we can still use the clear color state buffer for gen9+. Fixes: `a8b5ea8ef0` "iris: Add function to update clear color in surface state." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-02 15:26:33 -07:00
Rafael Antognolli	6a02873687	iris: Only update clear color for gens 8 and 9. Newer gens can read it directly. Also properly skip updating the ISL_AUX_USAGE_NONE surface. Fixes: `a8b5ea8ef0` "iris: Add function to update clear color in surface state." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-02 15:24:15 -07:00
Alexander von Gluck IV	5f467fe08e	haiku: Fix hgl dispatch build. Tested under meson/scons. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-04-02 16:06:00 -05:00
Guido Günther	10b90570d1	docs: Fix 19.0.x version numbers The list has 19.0.2 twice. Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-02 09:12:47 -07:00
Marek Olšák	40b9eec8bd	docs/relnotes: document parallel_shader_compile changes in 19.1.0, not 19.0.0	2019-04-02 10:47:37 -04:00
Benjamin Tissoires	7f8a9a1fbb	CI: use wayland ci-templates repo to create the base image There shouldn't be a difference for users, but this way we do manage all of our containers from freedesktop.org note: compared to the previous Dockerfile, we need to manually add gcc, g++ and python*-wheel Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-04-02 13:41:05 +00:00
Marek Olšák	7be26976b8	radeonsi: don't use PFP_SYNC_ME with compute-only contexts Compute rings don't have PFP. Fixes: `a1378639ab` "radeonsi: always use compute rings for clover on CI and newer (v2)" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Jan Vesely <jan.vesely@rutgers.edu> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-04-02 08:46:49 -04:00
Gert Wollny	1e5381f934	virgl: define MAX_VERTEX_STREAMS based on availability of TF3 Since with gles hosts we lie about the GLSL feature level it is better to set the number of streams based on actual host capabilities. v2: Make use of feature check level to avoid regressions. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-02 11:28:09 +00:00
Gert Wollny	33d9b9436c	softpipe: Implement ATOMFADD and enable cap TGSI_ATOMFADD This enables the following piglits with PASS: nv_shader_atomic_float/execution/ shared-atomicadd-float shared-atomicexchange-float ssbo-atomicadd-float ssbo-atomicexchange-float v2: Minimize the patch by using type punning (Eric Anholt) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-02 09:58:16 +00:00
Erik Faye-Lund	4f153fcd5c	virgl: stricter usage of compressed 3d textures Using RGTC, ETC1, ETC2 or S3TC for 3D-textures isn't allowed by any of OpenGL 4.6, OpenGL ES 3.2, ARB_texture_compression_rgtc, EXT_texture_compression_rgtc, OES_compressed_ETC1_RGB8_texture, S3_s3tc or EXT_texture_compression_s3tc specifications. So let's not allow any of those compressed 3d-textures at all. It's not going to work once it hits the OpenGL driver in virglrenderer. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-02 07:48:46 +00:00
Erik Faye-Lund	f53001324f	virgl: do not allow compressed formats for buffers Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-04-02 07:48:45 +00:00
Eric Anholt	edc7deec42	dri3: Return the current swap interval from glXGetSwapIntervalMESA(). We were caching only the value set with glXSwapIntervalSGI(), missing out on the default setting of the swap interval by the loader. This fixes glxgears's warning about being vblank synchronized by default. Fixes: `9777c4234b` ("loader: drop the [gs]et_swap_interval callbacks") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-01 16:06:38 -07:00
Anuj Phogat	82f6a746e8	intel: Add support for Comet Lake Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-04-01 14:07:40 -07:00
Chris Wilson	80e1ca9d28	iris: Adapt to variable ppGTT size Not all hardware is made equal and some does not have the full complement of 48b of address space. Ask what the actual size of the virtual address space allocated for contexts is, and bail if that is not enough to satisfy our static partitioning needs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-01 10:01:02 -07:00
Samuel Pitoiset	c25f63872b	radv: partially enable VK_KHR_shader_float16_int8 Only 8-bit integers for now, float16 requires a bit more work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:59 +02:00
Samuel Pitoiset	d099bc5829	ac: add 8-bit and 64-bit support to ac_build_bitfield_reverse() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:57 +02:00
Samuel Pitoiset	2cecf6c5cc	ac: add 8-bit support to ac_build_umsb() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:55 +02:00
Samuel Pitoiset	a45d9e3e8d	ac: add 8-bit support to ac_find_lsb() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:53 +02:00
Samuel Pitoiset	89cf8ca0ae	ac: add 8-bit support to ac_build_bit_count() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:52 +02:00
Samuel Pitoiset	869af0464a	ac/nir: add support for nir_op_b2i8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:49 +02:00
Marek Olšák	b9d627e076	radeonsi: implement ARB/KHR_parallel_shader_compile callbacks	2019-04-01 12:37:52 -04:00
Marek Olšák	050fae3983	util/queue: add util_queue_adjust_num_threads for ARB_parallel_shader_compile Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-01 12:37:52 -04:00
Marek Olšák	b7317b6ce0	util/queue: hold a lock when reading num_threads in util_queue_finish Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-01 12:37:52 -04:00
Marek Olšák	bb111559f2	util/queue: add ability to kill a subset of threads for ARB_parallel_shader_compile	2019-04-01 12:37:52 -04:00
Marek Olšák	d99cdc9d59	util/queue: move thread creation into a separate function Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-01 12:37:52 -04:00
Marek Olšák	e871cbd625	gallium: implement ARB/KHR_parallel_shader_compile	2019-04-01 12:37:52 -04:00
Marek Olšák	c5c38e831e	mesa: implement ARB/KHR_parallel_shader_compile Tested by piglit.	2019-04-01 12:37:52 -04:00
Marek Olšák	3ad2a9b3fa	radeonsi: fix assertion failure by using the correct type src/gallium/drivers/radeonsi/si_state_viewport.c:196: si_emit_guardband: Assertion `vp_as_scissor.maxx <= max_viewport_size[vp_as_scissor.quant_mode] && vp_as_scissor.maxy <= max_viewport_size[vp_as_scissor.quant_mode]' failed. The comparison was unsigned, so negative maxx or maxy would fail. Fixes: `3c540e0a74` "radeonsi: Fix guardband computation for large render targets"	2019-04-01 12:21:20 -04:00
Leo Liu	d4e0fbc92f	radeon/vcn/vp9: search the render target from the whole list The number of render targets could be more than max of references, so we search the full list of the render pictures for the current render target index https://bugs.freedesktop.org/show_bug.cgi?id=109648 Signed-off-by: Leo Liu <leo.liu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Acked-by: James Zhu <James.Zhu@amd.com> Cc: <mesa-stable@lists.freedesktop.org>	2019-04-01 08:59:38 -04:00
Rhys Perry	0af95f0ffc	radv: lower 16-bit flrp Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-01 09:58:48 +02:00
Samuel Pitoiset	4d5fce29c3	ac: fix ac_build_umsb() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:56 +02:00
Samuel Pitoiset	7a088d1ac8	ac: fix ac_find_lsb() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:54 +02:00
Samuel Pitoiset	b16dffff23	ac: fix ac_build_bitfield_reverse() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:52 +02:00
Samuel Pitoiset	9d13b9e53e	ac: fix ac_build_bit_count() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:49 +02:00
Samuel Pitoiset	e39a6b940f	ac/nir: fix nir_op_b2i16 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:47 +02:00
Eric Engestrom	aa7afe324c	meson: strip rpath from megadrivers More specifically, use the library file that has been post-processed by Meson when creating the hardlinks. Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=108766 Fixes: `3218056e0e` "meson: Build i965 and dri stack" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-01 07:04:13 +00:00
Tapani Pälli	06f40f5765	spirv: fix a compiler warning Fixes implicit conversion from enumeration type 'SpvOp' warning. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-01 07:43:10 +03:00
Lionel Landwerlin	f0b472b301	i965: perf: update render basic configs for big core gen9/gen10 This update allows an MI_LRI to trigger an OA report write in the global OA buffer. This isn't really useful for us, we just keep close to the internal public configs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-01 00:59:31 +03:00
Lionel Landwerlin	052ace0c81	i965: perf: add ring busyness metric for cfl gt2 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-01 00:59:26 +03:00
Lionel Landwerlin	7e54857b4a	i965: perf: enable Icelake metrics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:36:37 +01:00
Lionel Landwerlin	897efc2059	i965: perf: add Icelake metrics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:36:37 +01:00
Lionel Landwerlin	b910e40956	i965: perf: sklgt2: drop programming of an unused NOA register Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:35:16 +01:00
Lionel Landwerlin	29ce64a77a	i965: perf: hsw: drop register programming not needed on HSW This register is flagged as IVB only in the documentation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:35:16 +01:00
Lionel Landwerlin	46250d7dac	i965: perf: chv: fixup counters names Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:35:16 +01:00
Lionel Landwerlin	046041b2a0	i965: perf: add PMA stall metrics These are new metrics for Gen8/9 to measure the effect of the PMA stall workaround fix. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:35:16 +01:00
Lionel Landwerlin	dc9e598f3c	i965: perf: sklgt2: update memory write config This reworks the programming between older pre-production steppings & new ones. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:35:16 +01:00
Lionel Landwerlin	0d618bb635	i965: perf: sklgt2: update compute metrics config This unifies some of the programming between pre-production stepping and production ones. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:35:16 +01:00
Lionel Landwerlin	4edaa6f003	i965: perf: sklgt2: update a priority for register programming This makes no difference in terms of programming, it's just a cleanup. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-31 10:35:16 +01:00
Alyssa Rosenzweig	e4e6a3deaf	panfrost: Implement FIXED formats Fixes crash in dEQP-GLES2.functional.draw.random.9 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 04:42:37 +00:00
Alyssa Rosenzweig	ed160a1160	panfrost: Fix index calculation types and asserts Fixes crash in dEQP-GLES2.functional.draw.draw_elements.points.single_attribute. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 04:42:22 +00:00
Alyssa Rosenzweig	0e4c321c15	panfrost: Clean index state between indexed draws Fixes subsequent tests in dEQP-GLES2.functional.draw.draw_elements.indices.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 04:41:54 +00:00
Alyssa Rosenzweig	4fcd3189ae	panfrost/decode: Print negative_start This property slipped through. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 04:41:06 +00:00
Alyssa Rosenzweig	9237204400	panfrost: Implement missing texture formats - Implements RGB565/RGBA5551 formats - Don't advertise support for flipped RGBA5551 and ETC Fixes remaining tests in dEQP-GLES2.functional.texture.format.* which is now at 36/36. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig	01fce794dc	panfrost: Extend tiling for cubemaps transfer_unmap now tiles for any tiled resource, not just TEXTURE_2D, which should cover more than just cubemaps! Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig	c87f3ce97f	panfrost: Implement command stream for linear cubemaps Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig	70b3e5db7d	panfrost/midgard: Emit cubemap coordinates Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig	b5f02bdd99	panfrost: Include all cubemap faces in bitmap list Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig	3197b30c6e	panfrost/decode: Decode all cubemap faces Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:38 +00:00
Alyssa Rosenzweig	e658f7225d	panfrost: Preliminary work for cubemaps Again, not yet functional, but this sets up the memory management for cube maps. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:37 +00:00
Alyssa Rosenzweig	499f31aab8	panfrost/midgard: Add L/S op for writing cubemap coordinates Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:37 +00:00
Alyssa Rosenzweig	f67616ce60	panfrost/midgard: Disassemble `cube` texture op Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:37 +00:00
Alyssa Rosenzweig	28b234a092	panfrost: Fix vertex buffer corruption Fixes crash in dEQP-GLES2.functional.buffer.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-31 02:36:37 +00:00
Rob Clark	b2d651b862	iris: fix set_sampler_view Update to match docs. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-30 13:05:31 -04:00
Rob Clark	e167e8f8a2	gallium/docs: clarify set_sampler_views (v2) Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-30 13:04:00 -04:00
Rob Clark	7ff6705b8d	freedreno/ir3: convert to "new style" frag inputs Add support for load_barycentric_pixel, load_interpolated_input, and friends. For now, this retains support for old-style inputs, which can probably be dropped with some ttn work. Prep work for sample-shading support. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-30 12:56:01 -04:00
Rob Clark	fc865de777	freedreno/ir3: add pass to move varying loads Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-30 12:56:01 -04:00
Rob Clark	831f1a05c0	freedreno/ir3: rework varying packing Originally we kept track of a table of inputs. But with new-style frag inputs this becomes awkward. Re-work it so that initially we assigned un-packed varying locations, and then after the shader is compiled scan to find actual used inputs, and re-pack. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-30 12:56:01 -04:00
Rob Clark	91a1354cd6	freedreno/ir3: re-indent comment Make it more clear that it applies to the following 'case' statements, rather than the previous one. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-30 12:56:01 -04:00
Rob Clark	1ae0c030cb	nir: add lower_all_io_to_elements I need this part of lower_all_io_to_temps but without the actual lowering to temps part. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-30 12:56:01 -04:00
Rob Clark	e5e67228f5	nir: print var name for load_interpolated_input too Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-03-30 12:55:47 -04:00
Sergii Romantsov	72a921e12a	i965,iris/blorp: do not blit 0-sizes Seems there is no sense in blitting 0-sized sources or destinations. Additionally it may cause segfaults for i965. v2: Function call replaced with inline check v3: Added check to avoid division by zero (L. Landwerlin) v4: Added similar check for Iris (L. Landwerlin) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110239 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-30 11:50:40 +00:00
Vinson Lee	e757a2481f	gallium: Fix autotools build with libxatracker.la. CXXLD libxatracker.la /usr/bin/ld: ../../../../src/gallium/auxiliary/.libs/libgallium.a(tgsi_to_nir.o): in function `ttn_finalize_nir': src/gallium/auxiliary/nir/tgsi_to_nir.c:2111: undefined reference to `gl_nir_lower_samplers_as_deref' /usr/bin/ld: src/gallium/auxiliary/nir/tgsi_to_nir.c:2113: undefined reference to `gl_nir_lower_samplers' Fixes: `9a834447d6` ("tgsi_to_nir: Produce optimized NIR for a given pipe_screen.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109929 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2019-03-29 23:24:05 -07:00
Timur Kristóf	356ec7a219	gallium: fix autotools build of pipe_msm.la Signed-off-by: Vinson Lee <vlee@freedesktop.org> Fixes: `9a834447d6` ("tgsi_to_nir: Produce optimized NIR for a given pipe_screen.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109929	2019-03-29 23:12:40 -07:00
Jason Ekstrand	7dbd934e26	nir: Lock around validation fail shader dumping This prevents getting mixed-up results if a multi-threaded app has two validation errors in different threads. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-29 21:57:51 -05:00
Brian Paul	b8e077daee	util: no-op __builtin_types_compatible_p() for non-GCC compilers __builtin_types_compatible_p() is GCC-specific and breaks the MSVC build. This intrinsic has been in u_vector_foreach() for a long time, but that macro has only recently been used in code (nir/nir_opt_comparison_pre.c) that's built with MSVC. Fixes: `2cf59861a` ("nir: Add partial redundancy elimination for compares") Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-29 15:33:43 -06:00
Caio Marcelo de Oliveira Filho	3b20ca34ae	iris: Clean up compiler warnings about unused Removed a few unused variables and iris_getparam_boolean(). Kept 'name' around since there's a commented debug that makes use of it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-29 12:07:26 -07:00
Eric Engestrom	8d9c2044a4	egl: hide entrypoints that shouldn't be exported when using glvnd From GLVND author: > From a functional standpoint, exporting additional symbols doesn't > really matter, since libglvnd will load the vendor libraries with > RTLD_LOCAL. Suggested-by: Kyle Brenneman <kbrenneman@nvidia.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kyle Brenneman <kbrenneman@nvidia.com>	2019-03-29 16:54:08 +00:00
Karol Herbst	fea0caea2b	nir/validate: validate that tex deref sources are actually derefs Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-29 16:03:22 +01:00
Karol Herbst	6ffc72472c	nir/print: fix printing the image_array intrinsic index Fixes: `0de003be03` ("nir: Add handle/index-based image intrinsics") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-29 16:03:22 +01:00
Timothy Arceri	4478c5374b	Revert "ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations" This reverts commit `29132af234`. It seems the new intrinsic causes a hang on radeonsi (VEGA) when running the piglit test: tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test	2019-03-29 21:04:01 +11:00
Samuel Pitoiset	cc752dea61	ac: fix return type for llvm.amdgcn.frexp.exp.i32.64 This fixes the following piglit with RadeonSI tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-29 09:18:24 +01:00
Gert Wollny	a0edceb00d	virgl: Add a caps feature check version When we add new feature checks on the host side that is used to enable a cap conditionally that was enabled unconditionally before we might end up with a feature regression when a new mesa version is used with an old virglrenderer version that doesn't check for that cap. To work around this problem add a version id to the caps that corresponds to the features that are actually checked on the host and check that version too when enabling the cap. Fixes: `2ee197d6e8` virgl: Enable mixed color FBO attachments only when the host supports it Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Pohsien Wang <pwang@chromium.org>	2019-03-29 07:55:31 +00:00
Samuel Pitoiset	62a9d757e6	radv: do not always initialize HTILE in compressed state Especially when performing a transition from UNDEFINED->GENERAL, the driver shouldn't initialize HTILE metadata in compressed state because it doesn't decompress when the src layout is GENERAL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110259 Fixes: `3a2e93147f` ("radv: always initialize HTILE when the src layout is UNDEFINED") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-29 08:28:18 +01:00
Kenneth Graunke	3fee3d1319	iris: Print the memzone name when allocating BOs with INTEL_DEBUG=buf This gives me an idea of what kinds of buffers are being allocated on the fly which could help inform our cache decisions.	2019-03-28 23:37:32 -07:00
Brian Paul	4ee057eaef	nir: use {0} initializer instead of {} to fix MSVC build Trivial change. Fixes: `c6ee46a75` ("nir: Add nir_alu_srcs_negative_equal")	2019-03-28 20:34:23 -06:00
Ian Romanick	7832fb7889	intel/compiler: Use partial redundancy elimination for compares Almost all of the hurt shaders are repeated instances of the same shader in synmark's compilation speed tests. shader-db results: All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15256840 -> 15256389 (<.01%) instructions in affected programs: 54137 -> 53686 (-0.83%) helped: 288 HURT: 0 helped stats (abs) min: 1 max: 15 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.06% max: 26.67% x̄: 1.99% x̃: 0.74% 95% mean confidence interval for instructions value: -1.76 -1.38 95% mean confidence interval for instructions %-change: -2.47% -1.50% Instructions are helped. total cycles in shared programs: 372286583 -> 372283851 (<.01%) cycles in affected programs: 833829 -> 831097 (-0.33%) helped: 265 HURT: 16 helped stats (abs) min: 2 max: 74 x̄: 11.81 x̃: 4 helped stats (rel) min: 0.04% max: 9.07% x̄: 0.99% x̃: 0.35% HURT stats (abs) min: 2 max: 130 x̄: 24.88 x̃: 8 HURT stats (rel) min: <.01% max: 12.31% x̄: 1.44% x̃: 0.27% 95% mean confidence interval for cycles value: -12.30 -7.15 95% mean confidence interval for cycles %-change: -1.06% -0.64% Cycles are helped. Iron Lake and GM45 had similar results. (GM45 shown) total instructions in shared programs: 5038653 -> 5038495 (<.01%) instructions in affected programs: 13939 -> 13781 (-1.13%) helped: 50 HURT: 1 helped stats (abs) min: 1 max: 15 x̄: 3.18 x̃: 4 helped stats (rel) min: 0.33% max: 13.33% x̄: 2.24% x̃: 1.09% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.83% max: 0.83% x̄: 0.83% x̃: 0.83% 95% mean confidence interval for instructions value: -3.73 -2.47 95% mean confidence interval for instructions %-change: -3.16% -1.21% Instructions are helped. 
total cycles in shared programs: 128118922 -> 128118228 (<.01%) cycles in affected programs: 134906 -> 134212 (-0.51%) helped: 50 HURT: 0 helped stats (abs) min: 2 max: 60 x̄: 13.88 x̃: 18 helped stats (rel) min: 0.06% max: 3.19% x̄: 0.74% x̃: 0.70% 95% mean confidence interval for cycles value: -16.54 -11.22 95% mean confidence interval for cycles %-change: -0.95% -0.53% Cycles are helped. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:53 -07:00
Ian Romanick	2cf59861a8	nir: Add partial redundancy elimination for compares This pass attempts to detect code sequences like if (x < y) { z = y - x; ... } and replace them with sequences like t = x - y; if (t < 0) { z = -t; ... } On architectures where the subtract can generate the flags used by the if-statement, this saves an instruction. It's also possible that moving an instruction out of the if-statement will allow nir_opt_peephole_select to convert the whole thing to a bcsel. Currently only floating point compares and adds are supported. Adding support for integer will be a challenge due to integer overflow. There are a couple possible solutions, but they may not apply to all architectures. v2: Fix a typo in the commit message and a couple typos in comments. Fix possible NULL pointer deref from result of push_block(). Add missing (-A + B) case. Suggested by Caio. v3: Fix is_not_const_zero to work correctly with types other than nir_type_float32. Suggested by Ken. v4: Add some comments explaining how this works. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:53 -07:00
Ian Romanick	c6ee46a753	nir: Add nir_alu_srcs_negative_equal v2: Move bug fix in get_neg_instr from the next patch to this patch (where it was intended to be in the first place). Noticed by Caio. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:52 -07:00
Ian Romanick	be1cc3552b	nir: Add nir_const_value_negative_equal v2: Rebase on 1-bit Boolean changes. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:52 -07:00
Ian Romanick	ae21b52e1d	nir/algebraic: Add missing 16-bit extract_[iu]8 patterns No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2] Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]	2019-03-28 15:35:52 -07:00
Ian Romanick	cbad201c2b	nir/algebraic: Add missing 64-bit extract_[iu]8 patterns No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2] Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]	2019-03-28 15:35:52 -07:00
Ian Romanick	bc17f5a2a3	nir/algebraic: Remove redundant extract_[iu]8 patterns No shader-db changes on any Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-28 15:35:52 -07:00
Ian Romanick	c152672e68	nir/algebraic: Fix up extract_[iu]8 after loop unrolling Skylake, Broadwell, and Haswell had similar results. (Skylake shown) total instructions in shared programs: 15256840 -> 15256837 (<.01%) instructions in affected programs: 4713 -> 4710 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.06% max: 0.08% x̄: 0.06% x̃: 0.06% total cycles in shared programs: 372286583 -> 372286583 (0.00%) cycles in affected programs: 198516 -> 198516 (0.00%) helped: 1 HURT: 1 helped stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.01% max: 0.01% x̄: 0.01% x̃: 0.01% No changes on any other Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> [v2] Acked-by: Jason Ekstrand <jason@jlekstrand.net> [v2]	2019-03-28 15:35:52 -07:00
Dave Airlie	b779baa9bf	nir/deref: fix struct wrapper casts. (v3) llvm/spir-v spits out some struct a { struct b {} }, but it doesn't deref, it casts (struct a) to (struct b), reconstruct struct derefs instead of casts for these. v2: use ssa_def_rewrite uses, rework the type restrictions (Jason) v3: squish more stuff into one function, drop unused temp (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-29 08:10:50 +10:00
Rafael Antognolli	8e0469f629	i965/blorp: Remove unused parameter from blorp_surf_for_miptree. It seems pretty useless nowadays. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-28 14:38:23 -07:00
Anuj Phogat	9c421d6b47	iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 19:59:59 +00:00
Anuj Phogat	e0f4359ec1	iris/icl: Set Enabled Texel Offset Precision Fix bit h/w specification requires this bit to be always set. See Mesa commit `5eb173304b`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 19:59:59 +00:00
Rob Clark	78825ca2d0	freedreno/ir3: align const size to vec4 This is no longer true since PIPE_CAP_PACKED_UNIFORMS was enabled. Fixes: `3c8779af32` freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-28 14:36:24 -04:00
Rob Clark	26e2906382	freedreno/ir3: reads/writes to unrelated arrays are not dependent Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-28 14:36:24 -04:00
Rob Clark	d71ce69d9c	freedreno/ir3: sched fix Not sure why new-style frag inputs start triggering this. But we probably shouldn't consider src's from other blocks. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-28 14:36:24 -04:00
Rob Clark	c557fcaf2b	freedreno/a6xx: small cleanup Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-28 14:36:23 -04:00
Kenneth Graunke	ee8370c766	iris: Fix blits with S8_UINT destination For depth and stencil blits, we always want the main mask to be Z, and the secondary pass mask to be S. If asked to blit Z+S to S, we should handle the blit in the second pass which properly gets the stencil resources. Before, we were trying to handle S as the main mask, and accidentally blitting a Z source to a S destination, which doesn't work out well. Fixes Piglit's "framebuffer-blit-levels {draw,read} stencil" tests.	2019-03-28 10:47:26 -07:00
Kenneth Graunke	ce89c19b88	st/mesa: Fix blitting from GL_DEPTH_STENCIL to GL_STENCIL_INDEX Fixes assertion failures in Piglit's "framebuffer-blit-levels {draw,read} stencil" tests on iris. Also fixes assert failures in frameretrace, which tries to ReadPixels the stencil values (only) from a Z24S8 depth/stencil attachment. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-28 10:47:23 -07:00
Kristian H. Kristensen	107a8ec3b3	freedreno/ir3: Add workaround for VS samgq This instruction needs a workaround when used from vertex shaders. Fixes: dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler2dshadow_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler2dshadow_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-28 10:26:32 -07:00
Kristian H. Kristensen	f30d4a1cca	freedreno/ir3: Don't access beyond available regs emit_cat5() needs to check if the last optional reg is there before it accesses it. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-28 10:26:32 -07:00
Eric Engestrom	7fefa4610d	util/disk_cache: close fd in the fallback path There are multiple `goto path_fail` with an open fd, but none that go to `fail:` without going through `path_fail:` first, so let's just move the `close(fd)` there. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 16:41:27 +00:00
Samuel Pitoiset	6596eb2b30	radv: skip updating depth/color metadata for conditional rendering I don't think we should update metadata when conditional rendering is enabled. For some reasons, some CTS breaks only on SI. This fixes the following CTS on SI: dEQP-VK.conditional_rendering.draw_clear.clear.depth.* Cc: 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 17:37:12 +01:00
Kenneth Graunke	1d72de3bcc	st/nir: Free the GLSL IR after linking. i965 does this, and st's tgsi path does this. st/nir did not. Cuts 138MB of memory from a DiRT Rally trace, which is about 44% of the total GLSL IR memory. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-28 09:31:12 -07:00
Samuel Pitoiset	227b191206	radv: enable VK_AMD_gpu_shader_int16 This extension allows 16-bit support to Frexp/FrexpStruct. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 13:02:53 +01:00
Samuel Pitoiset	8a6e61cc52	radv: do not lower frexp_exp and frexp_sig Hardware has two instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 13:02:51 +01:00
Samuel Pitoiset	52c02d921f	ac: add ac_build_frex_exp() helper and 16-bit/32-bit support Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 13:02:48 +01:00
Samuel Pitoiset	1bf9311c59	ac: add ac_build_frexp_mant() helper and 16-bit/32-bit support Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 13:02:46 +01:00
Kenneth Graunke	de783a6897	iris: Actually advertise some modifiers I neglected to fill out this driver function, causing us to advertise 0 modifiers. Now we advertise the various tilings and let the driver pick them. I've verified that X tiling works with Weston (by hacking the list to skip Y tiling). Y+CCS doesn't work yet because it's multiplane and the Gallium dri state tracker isn't really prepared for that. Leave it off for now.	2019-03-27 21:27:54 -07:00
Toni Lönnberg	505854f84b	intel/genxml: Media instructions and structures for gen11 v2: Lionel Landwerlin <lionel.g.landwerlin@intel.com> - fix missing type - fix _FQM_/_QM_ commands - shorten some media structs using groups - factor out memory attributes - switch MI_FLUSH_DW fields to bool Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Toni Lönnberg	4dccf2edef	intel/genxml: Media instructions and structures for gen10 v2: Lionel Landwerlin <lionel.g.landwerlin@intel.com> - fix missing type - fix _FQM_/_QM_ commands - shorten some media structs using groups - factor out memory attributes - switch MI_FLUSH_DW fields to bool Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Toni Lönnberg	8e74cacdad	intel/genxml: Media instructions and structures for gen9 v2: Lionel Landwerlin <lionel.g.landwerlin@intel.com> - fix missing type - fix _FQM_/_QM_ commands - shorten some media structs using groups - factor out memory attributes - switch MI_FLUSH_DW fields to bool Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Toni Lönnberg	2f075c5ccc	intel/genxml: Media instructions and structures for gen8 v2: Lionel Landwerlin <lionel.g.landwerlin@intel.com> - switch MI_FLUSH_DW fields to bool Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Toni Lönnberg	2bf89a05f4	intel/genxml: Media instructions and structures for gen7.5 v2: Fixed MI_WAIT_FOR_EVENT to be for video also Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Toni Lönnberg	416e1567ee	intel/genxml: Media instructions and structures for gen7 v2: Fixed MI_WAIT_FOR_EVENT to be for blitter and video also Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Toni Lönnberg	9e6ffe3741	intel/genxml: Media instructions and structures for gen6 v2: Fixed MI_WAIT_FOR_EVENT to be for blitter and video also Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Toni Lönnberg	b6f7b40d81	intel/genxml: Only handle instructions meant for render engine when generating headers v2: Fixed the check for engine v3: Changed engine into an argument given to the scripts Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-28 04:26:30 +00:00
Dave Airlie	ce6faa57ae	softpipe: add indirect store buffer/image unit The code to handle image unit indirect was missing Fixes piglit tests/spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-mixed-const-non-const-uniform-index.shader_test Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-03-28 14:13:08 +10:00
Dave Airlie	9f9d9c948d	softpipe/draw: fix vertex id in soft paths. This fixes the vertex id fetch in the non-llvm drawing paths. This vertex id in elt mode comes from the elts not just a linear value. Note we don't add basevertex in the elts case as it's already included in the elts by the looks of it (at least tests fail if I add it) Fixes piglit end-primitive tests and some others. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-03-28 14:13:08 +10:00
Kristian H. Kristensen	893425a607	freedreno/ir3: Push UBOs to constant file We have a rather big constant file and it seems that the best way to use it is to upload all UBOs and lower UBO access to load_uniform. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-27 13:26:02 -07:00
Kristian H. Kristensen	3c8779af32	freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS This commit turns on the gallium cap and adds a pass to lower the load_ubo intrinsics for block 0 back to load_uniform intrinsics and adjust the backend where the cap switches units from vec4s to dwords. As we stop using ir3_glsl_type_size() for uniform layout, this also corrects an issue where we would allocate a vec4 slot for samplers in uniforms, fixing: dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_fragment dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_vertex dEQP-GLES3.functional.shaders.struct.uniform.sampler_nested_fragment dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-27 13:26:02 -07:00
Kristian H. Kristensen	56b4bc292f	st/glsl_to_nir: Calculate num_uniforms from NumParameterValues We don't need to determine the number of uniform slots here, it's already available as prog->Parameters->NumParameterValues. The way we previously determined the number of slots was also broken for PackedDriverUniformStorage, where we would add loc (in dwords) and type_size() (in vec4s). Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-27 13:26:02 -07:00
Anuj Phogat	dce13e58b0	intel: Add Elkhart Lake PCI-IDs Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-27 19:34:48 +00:00
Anuj Phogat	a583f86305	intel: Add Elkhart Lake device info V2: Fix L3 bank count (Vivek) Fix simulator_id and num_eu_per_subslice (Lionel) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-27 19:34:48 +00:00
Leo Liu	f8ef8b56a6	radeon/vcn: add H.264 constrained baseline support VCN supports this profile as well as UVD, so add it Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: <mesa-stable@lists.freedesktop.org>	2019-03-27 14:33:55 -04:00
Gurchetan Singh	ac839bbf79	egl/android: choose node type based on swrast and preprocessor flags kms_swrast can work with primary nodes out of the box, but also with rendernodes if the build environment specifies the EGL_FORCE_RENDERNODE flag. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-27 17:26:21 +00:00
Gurchetan Singh	a87096b79e	egl/android: use software rendering when appropriate Now the init logic falls back to or forces software rendering. v2: simplify flow (@eric) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 17:26:21 +00:00
Gurchetan Singh	d4e7982b6e	egl/android: use swrast option in droid_load_driver Load the kms_swrast driver when specified. Doesn't work with drm_gralloc. v2: remove unneeded line (@eric) v3: Remove swrast_loader_extensions (@evelikov) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 17:26:21 +00:00
Gurchetan Singh	f90fc102ed	egl/android: plumb swrast option It's good to have options. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 17:26:21 +00:00
Gurchetan Singh	7d9719db83	egl/android: refactor droid_load_driver a bit This way, we can use primary nodes with kms_swrast too. Also fix up some whitespace issues. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 17:26:21 +00:00
Gurchetan Singh	f1dd1be0c2	egl/android: droid_open_device_drm_gralloc --> droid_open_device Makes things easier to follow. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 17:26:21 +00:00
Gurchetan Singh	95ad1744c1	egl/android: move droid_open_device_drm_gralloc down a bit 1) Removes a forward declaration. 2) Makes next patch easier. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 17:26:21 +00:00
Gurchetan Singh	49d52539fb	egl/android: move droid_image_loader_extension down a bit This removes some #ifdefs. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 17:26:21 +00:00
Dylan Baker	15f131b7b7	docs: update calendar, add news item and link release notes for 19.0.1	2019-03-27 10:14:50 -07:00
Dylan Baker	3f1a79989d	docs: Add SHA256 sums for mesa 19.0.1	2019-03-27 10:14:50 -07:00
Dylan Baker	fcf8be8a8a	docs: Add release notes for 19.0.1	2019-03-27 10:14:47 -07:00
Jason Ekstrand	ce47999cee	Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir" This reverts commit `4e1bbb000c`. It turns out that some DXVK apps due to some implementation detail of DXVK or other create and destroy instances in an interleaved way. Freeing the glsl_type memory without being a bit more careful causes use-after-free issues. Looks like we need to try again.	2019-03-27 11:24:58 -05:00
Tomeu Vizoso	b817d00278	panfrost: Wait for last job to finish in force_flush_fragment Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-27 17:03:34 +01:00
Tomeu Vizoso	53ab812230	panfrost: Pass the context BOs to the kernel so they aren't unmapped while in use Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-27 17:03:34 +01:00
Tomeu Vizoso	b0f67c066f	panfrost: Also tell the kernel about the checksum_slab Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-27 17:03:34 +01:00
Tomeu Vizoso	95748f6483	panfrost: Set the GEM handle for AFBC buffers Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-27 17:03:34 +01:00
Tomeu Vizoso	02081edfaf	panfrost: Fix sscanf format options Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-27 17:03:34 +01:00
Alexandros Frantzis	3bccf70211	virgl: Fake MSAA when max samples is 1 When the host is running on softpipe/llvmpipe the maximum number of samples for multisampling is 1. GL 3.0 requires at least 4 samples, and softpipe/llvmpipe get around this by enabling PIPE_CAP_FAKE_SW_MSAA. This patch mimics softpipe/llvmpipe behavior in virgl by enabling the same PIPE_CAP_FAKE_SW_MSAA workaround when the max sample count reported by the host is 1. This change allows virgl on a softpipe/llvmpipe host to advertise support for GL 3.0 and beyond. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-03-27 15:46:14 +02:00
Samuel Pitoiset	d6a07732c9	ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-27 14:45:52 +01:00
Michel Dänzer	6140ed3d2c	gitlab-ci: Automatically retry jobs after runner system failure Up to twice, for a total of 3 attempts maximum. This will hopefully avoid spurious CI pipeline failures due to intermittent GitLab/docker infrastructure issues. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 10:05:43 +01:00
Michel Dänzer	a3f34f9d85	gitlab-ci: Only pull/push cache contents in build+test stage jobs The containers-build stage job doesn't use the cache, so this might save some wasted time for it. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 10:05:43 +01:00
Michel Dänzer	1aca01dcf1	gitlab-ci: Make sure clang job actually uses ccache Meson didn't automatically pick up ccache in this job for some reason. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-27 10:05:43 +01:00
Samuel Pitoiset	bea540173c	spirv: propagate the access flag for store and load derefs It was only propagated when UBO/SSBO accesses are lowered to offsets. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-27 09:57:30 +01:00
Samuel Pitoiset	4d0b03c83d	nir: add nir_{load,store}_deref_with_access() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-27 09:57:27 +01:00
Timothy Arceri	d163780f81	spirv: make use of the select control support in nir Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Timothy Arceri	e76ae39ae2	nir: add support for user defined select control This will allow us to make use of the selection control support in spirv and the GL support provided by EXT_control_flow_attributes. Note this only supports if-statements as we don't support switches in NIR. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Timothy Arceri	24037ff228	spirv: make use of the loop control support in nir Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Timothy Arceri	b56451f82c	nir: add support for user defined loop control This will allow us to make use of the loop control support in spirv and the GL support provided by EXT_control_flow_attributes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Alyssa Rosenzweig	6170814c42	panfrost: Preliminary work for mipmaps This patch refactors a substantial amount of code in preparation for mipmaps. In particular, we now have a correct slice abstraction based on offsets; cpu/gpu are no longer arbitrary pointers. We additionally shuffle around other code to accompany these changes and cleanup how tiled textures are handled, while drawing some attention to the blit code. Mipmaps are still disabled at this point, as autogeneration is not yet implemented; enabling as-is would cause regressions. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-27 02:11:24 +00:00
Alyssa Rosenzweig	04a72391f3	panfrost/midgard: fpow is a two-part operation In fact, the native "fpow" instruction only does half of it; more work is needed for the actual instruction. For now, just lower. Fixes: `1ea42894c` ("panfrost/midgard: Implement fpow") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig	12d1d99fee	panfrost/midgard: Handle i2b constant Fixes dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.int_to_bool_fragment Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig	7b78af8e00	panfrost/midgard: Expand fge lowering to more types Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig	b8739c24ee	panfrost/midgard: Add ult/ule ops Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:36:09 +00:00
Alyssa Rosenzweig	f277bd3c22	panfrost: Stub out ES3 caps/callbacks Although this is not functional (and the command stream side is not aiming for ES3 right now), this is enough to run dEQP-GLES3 shader tests with the version override directive; this is useful, as some ES3 shader feature can occur in ES2 class shaders due to lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:58 +00:00
Alyssa Rosenzweig	89989e653e	panfrost/midgard: Cleanup midgard_nir_algebraic.py Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:37 +00:00
Alyssa Rosenzweig	effe6fb08d	panfrost/midgard: Lower source modifiers for ints On Midgard, float ops support standard source modifiers (abs/neg) and destination modifiers (sat/pos/round). Integer ops do not support these, however. To cope, we use native NIR source modifiers for floats, but lower them away to iabs/ineg for integers, implementing those ops simultaneously to avoid regressions. Fixes the integer tests in dEQP-GLES2.functional.shaders.operator.unary_operator.minus.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:36 +00:00
Alyssa Rosenzweig	3208c9d9a2	panfrost/midgard: Implement b2i; improve b2f/f2b Fixes dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.bool_to_int_fragment Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:27 +00:00
Alyssa Rosenzweig	5b95fef493	panfrost/midgard: Lower i2b32 Fixes dEQP-GLES2.functional.shader.conversions.scalar_to_scalar.int_to_bool_vertex Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:26 +00:00
Alyssa Rosenzweig	ae43b8faa7	panfrost/midgard: Lower f2b32 to fne Fixes dEQP-GLES2.functional.shaders.swizzles.vector_swizzles.mediump_bvec2_x_vertex Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:24 +00:00
Alyssa Rosenzweig	3fb884259b	panfrost/midgard: Lower bool_to_int32 Fixes dEQP-GLES2.functional.shaders.linkage.varying_type_vec2 (among many others). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:22 +00:00
Alyssa Rosenzweig	53664108c2	panfrost/midgard: Map more bany/ball opcodes Some of these are not yet fully functional due to related bugs, but this is the correct op mapping. The native ball/bany opcodes act on vec4's unconditionally. That said, both ball and bany have the nice property that duplicating an argument does not affect their output, so the default "hanging swizzles" allow us to implement 2/3-component opcodes correctly, implicitly lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:20 +00:00
Alyssa Rosenzweig	88b2a6b451	panfrost/midgard: Add more ball/bany, iabs ops Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:18 +00:00
Alyssa Rosenzweig	72cd677bac	panfrost/midgard: Schedule ball/bany to vectors Though they output scalars, they need a vector unit to make sense. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:17 +00:00
Alyssa Rosenzweig	89fdbb6707	panfrost/midgard: Add fcsel_i opcode Whereas a normal fcsel acts on a boolean input in r31.w, the fcsel_i variant acts on an integer input in r31.w, which can be preloaded with an instruction like imov (with the appropriate negate flag on the source). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:15 +00:00
Alyssa Rosenzweig	121417ef1d	panfrost: Implement scissor test This preliminary implementation should handle some basic cases. Future work should scissor the FRAGMENT job as well for efficiency. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:14 +00:00
Alyssa Rosenzweig	bd9446e719	panfrost: Fix viewports Our viewport code hardcoded a number of wrong assumptions, which sort of sometimes worked but was definitely wrong (and broke most of dEQP). This corrects the logic, accounting for flipped-Y framebuffers, which fixes... most of dEQP. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:10 +00:00
Alyssa Rosenzweig	9da4603fb6	panfrost/midgard: Fix b2f32 swizzle for vectors Fixes issues in most of dEQP-GLES2.functional.shaders.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:08 +00:00
Dave Airlie	e77013fb7f	softpipe: fix clears to only clear specified color buffers. This fixes piglit clearbuffer-mixed-format Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-27 07:53:32 +10:00
Dave Airlie	7f7c9425a8	draw/vs: partly fix basevertex/vertex id This gets the basevertex from the draw depending on whether it's an indexed or non-indexed draw. We still fail a transform feedback test for vertex id, as the vertex id is actually an index id and isn't getting translated properly to a vertex id; suggestions on how/where to fix that are welcome. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-27 07:52:28 +10:00
Nicolai Hähnle	e16ac33f37	amd/surface: provide firstMipIdInTail for metadata surface calculations This field was added in a recent addrlib update, and while there currently seems to be no issue with skipping it, we will have to set it correctly in the future. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-03-26 10:00:55 +01:00
Bas Nieuwenhuizen	82075e3c42	ac/nir: Return frag_coord as integer. To preserve the invariant that nir ssa defs are integers or pointers in LLVM. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-03-26 09:41:15 +01:00
Kristian H. Kristensen	c7c432738a	freedreno/ir3: Fix operand order for DSX/DSY Most cat5 instructions are constructed using ir3_SAM, which uses regs[1] for the (sampler, tex) src. Not DSX/DSY though, so we look up src1 and src2 differently for those two. Fixes: `1dffb089` ("freedreno/ir3: fix sam.s2en encoding") Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-25 18:36:48 -07:00
Kristian H. Kristensen	a752422bd4	freedreno/ir3: Track whether shader needs derivatives In `1088b788` ("freedreno/ir3: find # of samplers from uniform vars") we started counting number of samplers based on the uniform vars instead of number of cat5 instructions. We used the number of samplers to determine whether to enable derivatives, but when we only use derivatives and no samplers, that now breaks. Track whether we need derivatives explicitly and use that to enable the state. Fixes: `1088b788` ("freedreno/ir3: find # of samplers from uniform vars") Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-25 18:36:48 -07:00
Andre Heider	12f11e6fe6	st/nine: enable csmt per default on iris iris is thread safe, enable csmt for a ~5% performance boost. Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2019-03-25 22:21:19 +01:00
Jason Ekstrand	8ed583fe52	spirv: Handle the NonUniformEXT decoration	2019-03-25 16:12:09 -05:00
Jason Ekstrand	e50ab2c0f2	nir: Add access flags to deref and SSBO atomics We will need them for a new ACCESS_NON_UNIFORM flag that's about to be added in the next commit. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-25 16:12:09 -05:00
Jason Ekstrand	40074ebf74	nir: Add texture sources and intrinsics for bindless On Intel, we have both bindless and bindful and we'd like to use them at the same time if we can so we need to be able to distinguish at the NIR level between the two. This also fixes nir_lower_tex to properly handle bindless in its tex_texture_size and get_texture_lod helpers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-25 16:12:09 -05:00
Danylo Piliaiev	e0db0c74b9	intel/fs: Make alpha test work with MRT and sample mask Fix the order of src0_alpha and sample mask in fb payload. From SKL PRM Volume 7, "Data Payload Register Order for Render Target Write Messages": Type S0A oM sZ oS M2 M3 M4 SIMD8 1 1 0 0 s0A oM R SIMD16 1 1 0 0 1/0s0A 3/2s0A oM It also fixes working of alpha to coverage with sample mask on GEN6 since now they are in correct order. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-03-25 13:54:55 -07:00
Danylo Piliaiev	c8abe03f3b	i965,iris,anv: Make alpha to coverage work with sample mask From "Alpha Coverage" section of SKL PRM Volume 7: "If Pixel Shader outputs oMask, AlphaToCoverage is disabled in hardware, regardless of the state setting for this feature." From OpenGL spec 4.6, "15.2 Shader Execution": "The built-in integer array gl_SampleMask can be used to change the sample coverage for a fragment from within the shader." From OpenGL spec 4.6, "17.3.1 Alpha To Coverage": "If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value is generated where each bit is determined by the alpha value at the corresponding sample location. The temporary coverage value is then ANDed with the fragment coverage value to generate a new fragment coverage value." Similar wording could be found in Vulkan spec 1.1.100 "25.6. Multisample Coverage" Thus we need to compute alpha to coverage dithering manually in shader and replace sample mask store with the bitwise-AND of sample mask and alpha to coverage dithering. The following formula is used to compute final sample mask: m = int(16.0 * clamp(src0_alpha, 0.0, 1.0)) dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) \| 0x0808 * (m & 2) \| 0x0100 * (m & 1) sample_mask = sample_mask & dither_mask Credits to Francisco Jerez <currojerez@riseup.net> for creating it. It gives a number of ones proportional to the alpha for 2, 4, 8 or 16 least significant bits of the result. GEN6 hardware does not have issue with simultaneous usage of sample mask and alpha to coverage however due to the wrong sending order of oMask and src0_alpha it is still affected by it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-03-25 13:54:55 -07:00
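The dither formula quoted in commit c8abe03f3b can be checked with a small model. The following is a minimal Python sketch of that formula only, not the driver code: the function names `dither_mask`, `apply_alpha_to_coverage`, and `popcount` are hypothetical, and the 16-bit mask width is an assumption taken from the "16 least significant bits" wording in the commit message.

```python
def dither_mask(src0_alpha: float) -> int:
    """Alpha-to-coverage dither mask, following the formula in the commit message."""
    clamped = min(max(src0_alpha, 0.0), 1.0)
    m = int(16.0 * clamped)  # quantized alpha, in the range 0..16
    return (
        0x1111 * ((0xFEA80 >> (m & ~3)) & 0xF)  # coarse pattern, one step per 4 levels
        | 0x0808 * (m & 2)                      # adds two bits when bit 1 of m is set
        | 0x0100 * (m & 1)                      # adds one bit when bit 0 of m is set
    )


def apply_alpha_to_coverage(sample_mask: int, src0_alpha: float) -> int:
    """Replace the sample mask with its bitwise AND with the dither mask."""
    return sample_mask & dither_mask(src0_alpha)


def popcount(value: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


if __name__ == "__main__":
    # Print the mask and its number of set bits for every quantized alpha level.
    for m in range(17):
        mask = dither_mask(m / 16.0)
        print(f"m={m:2d} mask=0x{mask:04x} bits={popcount(mask)}")
```

In this model the number of set bits in the 16-bit mask equals the quantized alpha level `m`, which matches the commit's statement that the number of ones is proportional to alpha.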
Jason Ekstrand	3bd5457641	nir: Add a lowering pass for non-uniform resource access Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-25 15:00:36 -05:00
Jason Ekstrand	39da1deb49	nir/lower_io: Add a bounds-checked 64-bit global address format Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-25 14:40:54 -05:00
Dave Airlie	551950cacd	draw/gs: fix point size outputs from geometry shader. If the geom shader emits a point size we failed to find it here, use the correct API to look it up. Fixes: tests/spec/glsl-1.50/execution/geometry/point-size-out.shader_test Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-26 05:17:06 +10:00
Dave Airlie	d3836510d2	draw: bail instead of assert on instance count (v2) With indirect rendering it's fine to set the instance count parameter to 0, and expect the rendering to be ignored. Fixes assert in KHR-GLES31.core.compute_shader.pipeline-gen-draw-commands on softpipe v2: return earlier before changing fpstate Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-26 05:16:56 +10:00
Leo Liu	382401aab7	vl/dri3: remove the wait before getting back buffer The wait here is unnecessary since we got a pool of back buffers, and the wait for swap buffer will happen before the present pixmap, at the same time the previous back buffer will be put back to pool for reuse after the check for PresentIdleNotify event Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-03-25 12:20:31 -04:00
Iago Toral Quiroga	763c8aabed	compiler/nir: add lowering for 16-bit ldexp v2 (Topi): - Make bit-size handling order be 16-bit, 32-bit, 64-bit - Clamp lower exponent range at -28 instead of -30. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-25 16:08:25 +01:00
Iago Toral Quiroga	3766334923	compiler/nir: add lowering for 16-bit flrp And enable it on Intel. v2: - Squash the change to enable it on Intel (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-25 16:08:25 +01:00
Iago Toral Quiroga	ca31df6f1f	compiler/nir: add lowering option for 16-bit fmod And enable it on Intel. v2: - Squash the change to enable this lowering on Intel (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-25 16:08:25 +01:00
Brian Paul	08d97aadd1	st/mesa: fix texture deletion context mix-up issues (v2) When we destroy a context, we need to temporarily make that context the current one for the thread. That's because during context tear-down we make many calls to _mesa_reference_texobj(&texObj, NULL). Note there's no context parameter. If the texture's refcount goes to zero and we need to delete it, we use the thread's current context. But if that context isn't the context we're tearing down, we get into trouble when deallocating sampler views. See patch `593e36f956` ("st/mesa: implement "zombie" sampler views (v2)") for background information. Also, we need to release any sampler views attached to the fallback textures. Fixes a crash on exit with a glretrace of the Nobel Clinician application. v2: at end of st_destroy_context(), check if save_ctx == ctx and unbind the context if so. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-03-25 06:57:57 -06:00
Brian Paul	d13167cd21	nir: fix a few signed/unsigned comparison warnings Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-25 06:51:31 -06:00
Kishore Kadiyala	e1d8057160	android: static link with libexpat with Android O+ In Android O, MESA needs to statically link libexpat so that it's in same VNDK namespace. v2: apply change also to anv driver (Tapani) v3: use += in anv change (Eric Engestrom) Change-Id: I82b0be5c817c21e734dfdf5bfb6a9aa1d414ab33 Signed-off-by: Kishore Kadiyala <kishore.kadiyala@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-25 10:11:57 +02:00
Samuel Iglesias Gonsálvez	01cf390035	radv: write availability status vkGetQueryPoolResults() when the data is not available If VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set and VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are both not set, we need to return VK_NOT_READY only and set the availability status field for each query. From Vulkan spec: "If VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are both not set then no result values are written to pData for queries that are in the unavailable state at the time of the call, and vkGetQueryPoolResults returns VK_NOT_READY. However, availability state is still written to pData for those queries if VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set." Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-25 08:21:22 +01:00
Samuel Iglesias Gonsálvez	cb3ea50ec2	radv: don't overwrite results in VkGetQueryPoolResults() when queries are not available If the query is not available and VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are both not set, the spec doesn't allow modifying its result. From Vulkan spec: "If VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are both not set then no result values are written to pData for queries that are in the unavailable state at the time of the call, and vkGetQueryPoolResults returns VK_NOT_READY. However, availability state is still written to pData for those queries if VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set." v2: - Move VK_NOT_READY change to next patch (Samuel Pitoiset) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-25 08:21:22 +01:00
Tapani Pälli	2c240a5216	st/mesa: fix warnings about implicit conversion on enumeration type These enums match but compiler warns about implicit conversion. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-03-25 07:44:27 +02:00
Tapani Pälli	ec12316489	st/mesa: fix compilation warning on storage_flags_to_buffer_flags (warning: 'const' type qualifier on return type has no effect) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-03-25 07:44:05 +02:00
Dave Airlie	9417793fb1	nir/split_vars: fixup some more explicit_stride related issues. With vkpipelinedb Samuel discovered a regression since we stopped stripping types at the spir-v level. This adds a check to the var splitting for the case where it asserts the type hasn't changed, when it has just created a bare type, and it's different than the original type which has an explicit stride. This also removes a pointless assert that also triggers. Fixes: `3b3653c4cf` (nir/spirv: don't use bare types, remove assert in split vars for testing) Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-25 13:57:16 +10:00
Caio Marcelo de Oliveira Filho	9d0ae777dd	spirv: Use interface type for block and buffer block Also handle GLSL_TYPE_INTERFACE the same way we do GLSL_TYPE_STRUCT in various places. Motivated by ARB_gl_spirv work, that will take advantage of the interface types when handling NIR coming from SPIR-V. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-23 10:22:39 -07:00
Caio Marcelo de Oliveira Filho	fb024f5e72	intel/compiler: handle GLSL_TYPE_INTERFACE as GLSL_TYPE_STRUCT Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-23 10:22:39 -07:00
Caio Marcelo de Oliveira Filho	15012077bc	spirv: Add an execution environment to the options Also updates gl_spirv to pick the right one. At the moment nothing uses it, but upcoming functionality part of ARB_gl_spirv will use it, and we also later can be more assertful when handling certain features for each of the execution environments. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-03-23 09:29:21 -07:00
Eric Anholt	dacb11a585	egl: Add a 565 pbuffer-only EGL config under X11. The CTS requires a 565-no-depth-no-stencil (meaning d/s not-required, not not-present) config for ES 3.0, but at depth 24 of X11 we wouldn't do so. We can satisfy that bad requirement using a pbuffer-only visual with whatever other buffers the driver happens to have given us. I've tried to raise this as an absurd requirement with Khronos and made no progress. v2: Make sure it's single sample, no depth, no stencil. Comment typo fix Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-03-22 15:22:40 -07:00
Caio Marcelo de Oliveira Filho	e5830e1132	nir: Handle array-deref-of-vector case in loop analysis SPIR-V can produce those for SSBO and UBO access. Found when testing the ARB_gl_spirv series. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-22 13:50:39 -07:00
Rob Clark	cdd90a7502	docs: update freedreno status Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-22 16:39:14 -04:00
Rob Clark	6fd5a7ff8c	freedreno: add ESSL cap Report 320 for a6xx, which isn't quite true (no geom/tess, in particular), but other caps keep the reported GL and GLSL versions correct (3.1 / 3.10 es). But reporting 320 will switch on EXT_gpu_shader5, which is the goal. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-22 16:39:14 -04:00
Rob Clark	6cd9876047	mesa/st: use ESSL cap to enable gpu_shader5 For GLES2+ contexts, enable EXT_gpu_shader5 if the driver exposes a sufficiently high ESSL feature level, even if the GLSL feature level isn't high enough. This allows drivers to support EXT_gpu_shader5 in GLES contexts before they support all the additional features of ARB_gpu_shader5 in GL contexts. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-03-22 16:39:13 -04:00
Rob Clark	de481947d9	gallium: add PIPE_CAP_ESSL_FEATURE_LEVEL Adds a new cap to allow drivers to expose higher shading language versions in GLES contexts, to avoid having to report an artificially low version for the benefit of GL contexts. The motivation is to expose EXT_gpu_shader5 even though a driver may not support all the features needed for the corresponding GL extension (ARB_gpu_shader5). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-03-22 16:39:13 -04:00
Vinson Lee	93c81ca336	swr: Fix build with llvm-9.0. Fix build error after llvm-9.0svn r352827 ("[opaque pointer types] Add a FunctionCallee wrapper type, and use it."). In file included from ./rasterizer/jitter/builder.h:158:0, from swr_shader.cpp:35: ./rasterizer/jitter/gen_builder_meta.hpp: In member function ‘llvm::Value* SwrJit::Builder::VGATHERPD(llvm::Value, llvm::Value, llvm::Value, llvm::Value, llvm::Value, const llvm: :Twine&)’: ./rasterizer/jitter/gen_builder_meta.hpp:51:117: error: no matching function for call to ‘cast(llvm::FunctionCallee)’ Function pFunc = cast<Function>(JM()->mpCurrentModule->getOrInsertFunction("meta.intrinsic.VGATHERPD", pFuncTy)); ^ Suggested-by: Philip Meulengracht <the_meulengracht@hotmail.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-03-22 13:13:51 -07:00
Dylan Baker	ed96038e55	bin/install_megadrivers.py: Fix regression for set DESTDIR The previous patch tried to address a bug when DESTDIR is '', however, it introduces a bug when DESTDIR is not '', and fakeroot is used. This patch does fix that, and has been tested with the arch pkg-build to ensure it isn't regressed. Fixes: 093a1ade4e24b7dd701a093d30a71efd669fe9c8 ("bin/install_megadrivers.py: Correctly handle DESTDIR=''") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110221 Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-03-22 19:09:00 +00:00
Samuel Pitoiset	23d30f4099	spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-22 19:41:46 +01:00
Samuel Pitoiset	6ae5797243	nir: use generic float types for frexp_exp and frexp_sig Only the exponent needs to be 32-bit signed integer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-22 19:41:44 +01:00
Vinson Lee	77aa11ca32	nir: Fix anonymous union initialization with older GCC. Fix this build error with GCC 4.4.7. CC nir/nir_opt_copy_prop_vars.lo nir/nir_opt_copy_prop_vars.c: In function ‘load_element_from_ssa_entry_value’: nir/nir_opt_copy_prop_vars.c:454: error: unknown field ‘ssa’ specified in initializer nir/nir_opt_copy_prop_vars.c:455: error: unknown field ‘def’ specified in initializer nir/nir_opt_copy_prop_vars.c:456: error: unknown field ‘component’ specified in initializer nir/nir_opt_copy_prop_vars.c:456: error: extra brace group at end of initializer nir/nir_opt_copy_prop_vars.c:456: error: (near initialization for ‘(anonymous).<anonymous>’) nir/nir_opt_copy_prop_vars.c:456: warning: excess elements in union initializer nir/nir_opt_copy_prop_vars.c:456: warning: (near initialization for ‘(anonymous).<anonymous>’) Fixes: `96c32d7776` ("nir/copy_prop_vars: handle load/store of vector elements") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109810 Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-22 10:43:41 -07:00
Chris Wilson	db99d02fce	iris: Push heavy memchecker code to DEBUG Invoking VALGRIND_CHECK_MEM_IS_DEFINED pulls in enough code to convince gcc to not inline __gen_uint and results in a lot of packing code ending up out-of-line with lots of stack copying. To ameliorate this, only insert the check inside the packer if DEBUG is defined and instead perform the validation checking before submitting the batch to the kernel. This should give accurate results if --trace-origins=yes is used, and failing that we can recompile in full debug mode to check on insertion. Improve drawoverhead baseline by 25% with a default build with valgrind-dev installed (with effectively no loss of vg coverage). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-22 10:38:03 -07:00
Kenneth Graunke	87f865aab3	iris: Fix batch chaining map_next increment. Caught by Chris Wilson; split out from his valgrind patch.	2019-03-22 09:31:15 -07:00
Rob Clark	bf5a92811d	freedreno/ir3: disable early-z for SSBO/image writes Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil_fbo Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-22 08:53:28 -04:00
Rob Clark	dbac1a80d1	freedreno/ir3: rename has_kill to no_earlyz There are other cases where we need to disable early-z, like image writes. So rename to something more generic. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-22 08:53:28 -04:00
Rhys Perry	f736250ab4	ac/nir: implement 16-bit pack/unpack opcodes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-22 12:50:16 +01:00
Lionel Landwerlin	87dadbce5b	vulkan/overlay: improve error reporting We can show the actual command & line where the failure happened Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-22 11:26:04 +00:00
Lionel Landwerlin	9f3727351d	vulkan/overlay: check return value of swapchain get images Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-22 11:26:01 +00:00
Lionel Landwerlin	1fbf355597	vulkan/overlay: silence validation layer warnings v2: Drop call to FreeDescriptorSet Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-22 11:25:58 +00:00
Lionel Landwerlin	de14107741	vulkan/overlay: properly register layer object with loader This is required by the validation layers if we want to validate the commands inserted by the overlay layer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-22 11:25:55 +00:00
Józef Kucia	c077d5d7de	radv: Fix driverUUID Fixes: `14cad8786a` ("radv: generate the same driver UUID as radeonsi") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-22 08:57:16 +01:00
Danylo Piliaiev	ea9bde151f	glsl: Cross validate variable's invariance by explicit invariance only 'invariant' qualifier is propagated on variables which are used to calculate other invariant variables, however when we are matching variable's declarations we should take into account only explicitly declared invariance because invariance propagation is an implementation specific detail. Thus new flag is added to ir_variable_data which indicates 'invariant' qualifier being explicitly set in the shader. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100316 Fixes: `89b60492` ('glsl: Add a pass to propagate the "invariant" and "precise" qualifiers') Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-03-21 23:28:08 -07:00
Józef Kucia	1d996ef714	mesa: Fix GL_NUM_DEVICE_UUIDS_EXT Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-22 07:37:14 +02:00
Kenneth Graunke	66c100a8d6	iris: Skip resolves and flushes altogether if unnecessary Improves drawoverhead baseline scores by 1.17x.	2019-03-21 20:28:17 -07:00
Kenneth Graunke	365886ebe1	iris: Skip framebuffer resolve tracking if framebuffer isn't dirty Improves drawoverhead baseline score by 1.86x.	2019-03-21 20:28:17 -07:00
Kenneth Graunke	1d05d24b1d	iris: Skip input resolve handling if bindings haven't changed This brings the drawoverhead 16 Tex w/ no state change score from 22% of baseline to 97% of baseline.	2019-03-21 20:28:17 -07:00
Kenneth Graunke	a342f2deb1	iris: Fix util_vma_heap_init size for IRIS_MEMZONE_SHADER Fixes assertions when disabling bucket allocators.	2019-03-21 19:07:17 -07:00
Dave Airlie	9dd92d08a5	softpipe: fix integer texture swizzling for 1 vs 1.0f The swizzling was putting float one in not integer 1. This fixes a lot of arb_texture_view-rendering-formats cases. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:35 +10:00
Dave Airlie	aae5ba72ab	softpipe: remove shadow_ref assert. I don't think this really buys us anything and TG4 with cubemap arrays falls over because sampler == 2, but otherwise works fine. Fixes: ./bin/textureGather fs shadow r CubeArray repeat on softpipe with ARB_gpu_shader5 enabled. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:29 +10:00
Dave Airlie	8dc8b1361a	softpipe: handle 32-bit bitfield inserts Fixes piglits if ARB_gpu_shader5 is enabled Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:26 +10:00
Dave Airlie	7b7cb1bc35	softpipe: fix 32-bit bitfield extract These didn't deal with the width == 32 case that TGSI is defined with. Fixes piglit tests if ARB_gpu_shader5 is enabled. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:21 +10:00
Timothy Arceri	a1bd9dd5bc	nir: fix opt_if_loop_last_continue() Rather than skipping code that looked like this: loop { ... if (cond) { do_work_1(); continue; } else { break; } do_work_2(); } Previously we would turn this into: loop { ... if (cond) { do_work_1(); continue; } else { do_work_2(); break; } } This was clearly wrong. This change checks for this case and makes sure we now leave it for nir_opt_dead_cf() to clean up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-22 09:58:18 +11:00
Gurchetan Singh	620df57dbb	anv: fix build on Nougat AHardwareBuffer is only available on O and above. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-21 15:36:39 -07:00
Gurchetan Singh	139f908d8f	anv: move anv_GetMemoryAndroidHardwareBufferANDROID up a bit No functional change, just makes the next patch a little easier. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-21 15:36:39 -07:00
Gurchetan Singh	b070861045	configure.ac / meson: depend on libnativewindow when appropriate libnativewindow is only available on O or greater, and it's required for some features. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-21 15:36:39 -07:00
Eric Anholt	bfed0a7099	v3d: Remove some dead members of struct v3d_compile. These are more vc4 leftovers.	2019-03-21 14:20:50 -07:00
Eric Anholt	16f2770eb4	v3d: Upload all of UBO[0] if any indirect load occurs. The idea was that we could skip uploading the constant-indexed uniform data and just upload the uniforms that are variably-indexed. However, since the VS bin and render shaders may have a different set of uniforms used, this meant that we had to upload the UBO for each of them. The first case is generally a fairly small impact (usually the uniform array is the most space, other than a couple of FSes in shader-db), while the second is a larger impact: 3DMMES2 was uploading 38k/frame of uniforms instead of 18k. Given that the optimization is of dubious value, has a big downside, and is quite a bit of code, just drop it. No change in shader-db. No change on 3DMMES2 (n=15).	2019-03-21 14:20:50 -07:00
Eric Anholt	320e96bace	v3d: Move constant offsets to UBO addresses into the main uniform stream. We'd end up with the constant offset in the uniform stream anyway, since they're bigger than small immediates. Avoids the extra uniforms and adds in the shader in favor of just adding once on the CPU. shader-db: total instructions in shared programs: 6496865 -> 6494851 (-0.03%) total uniforms in shared programs: 2119511 -> 2117243 (-0.11%)	2019-03-21 14:20:50 -07:00
Eric Anholt	c36d2793ec	v3d: Rename v3d_tmu_config_data to v3d_unit_data. I want to reuse this for encoding small constant UBO/SSBO offsets into the uniform stream to reduce the extra uniform loads and adds for the small constant offsets.	2019-03-21 14:20:50 -07:00
Benjamin Gordon	b30aad552c	configure.ac/meson.build: Add options for library suffixes When building the Chrome OS Android container, we need to build copies of mesa that don't conflict with the Android system-supplied libraries. This adds options to create suffixed versions of EGL and GLES libraries: libEGL.so -> libEGL${egl-lib-suffix}.so libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so libGLESv2.so -> libGLES${gles-lib-suffix}.so This is similar to what happens when --enable-libglvnd is specified, but without the side effects of linking against libglvnd. To avoid unexpected clashes with the suffix appended by libglvnd, make it an error to specify both --enable-libglvnd and --with-egl-lib-suffix. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-21 10:18:31 -07:00
Kenneth Graunke	e426c3a6cb	nir: Record non-vector/scalar varyings as unmovable when compacting In some cases, we can end up with varying structs that aren't split to their member variables. nir_compact_varyings attempted to record these as unmovable, so it would leave them be. Unfortunately, it didn't do it right for non-vector/scalar types. It set the mask to: ((1 << (elements * dmul)) - 1) << var->data.location_frac where elements is the number of vector elements. For structures and other non-vector/scalars, elements is 0...so the whole mask became 0. This caused nir_compact_varyings to assign other varyings on top of the structure varying's location (as it appeared to take up no space). To combat this, we just set elements to 4 for non-vector/scalar types, so that the entire slot gets marked as unmovable. Fixes KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in on iris. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-21 16:03:58 +00:00
Rob Clark	6e781a01b9	freedreno/ir3: dynamic UBO indexing vs 64b pointers Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.uniform_fragment and similar things with multiple UBOs Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	2e01c534f4	freedreno/ir3: fix bit_count Seems like it can only work 16b at a time. Fixes dEQP-GLES31.functional.shaders.builtin_functions.integer.bitcount.* TODO need to check if this limitation applies to a3xx as well. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	3d8349048b	freedreno/ir3: additional lowering For some things that show up when we expose higher glsl TODO check blob traces to see if we have instructions for some of this? I guess we don't but worth a check.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	bcd81d2387	freedreno/ir3: optimize sam.s2en to sam Detect when sampler/texture idx are immediate and switch to non s2en encoding. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	1443694ee5	freedreno/ir3: enable indirect tex/samp (sam.s2en) For now it uses indirect for everything. The next step is for the ir3_cp pass to detect the case that tex and samp idx are immediate and convert the sam instruction back to the non .s2en variant. But doing that in a following patch so we can shake out the bugs with .s2en more easily. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	1088b788d8	freedreno/ir3: find # of samplers from uniform vars When we have indirect samplers, we cannot tell the max sampler referenced. Instead just refer to the number of sampler uniforms. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	d4cbc94685	nir: move glsl_type_get_{sampler,image}_count() I need at least the sampler variant in ir3.. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-21 09:13:05 -04:00
Rob Clark	8eb16ae8bf	freedreno/ir3: fix regmask for merged regs On a6xx+ with half-regs conflicting with full-regs, the legalize pass needs to set appropriate sync bits, such as (sy), on writes to full regs that conflict with half regs, and vice versa. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	1dffb089f9	freedreno/ir3: fix sam.s2en encoding Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	45b7a581b4	freedreno/ir3: fix sam.s2en decoding Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	2d31cf9d3b	freedreno/ir3/ra: fix half-class conflicts On a6xx, half-regs conflict with full-regs. But we were only setting up conflicts for the first class (ie. scalar, but not hvec2/hvec3/hvec4), resulting in higher half-reg classes getting assigned to regs that overwrite full-regs. Noticed while trying to enable indirect-sampler (sam.s2en) which uses an hvec2 argument to pass the sampler/tex index. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Rob Clark	cc5ca9391c	freedreno/ir3: better cat6 encoding detection These two bits seem to be a better way to detect which encoding we are looking at. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-21 09:13:05 -04:00
Samuel Pitoiset	00327f827f	ac: fix incorrect argument type for tbuffer.{load,store} with LLVM 7 GLC/SLC are boolean. This fixes the following LLVM error when checkir is set: Intrinsic has incorrect argument type! void (i32, <4 x i32>, i32, i32, i32, i32, i32, i32, i32, i32)* @llvm.amdgcn.tbuffer.store.i32 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 14:02:00 +01:00
Samuel Pitoiset	20cac1f498	ac: fix 16-bit shifts This fixes the following LLVM error when checkir is set: Type too small for ZExt Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 14:01:58 +01:00
Samuel Pitoiset	2ac5c5c1b5	ac: add 16-bit support to fract Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 12:13:09 +01:00
Samuel Pitoiset	0eb1478ac2	ac: add 16-bit support to fsign Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 12:13:07 +01:00
Samuel Pitoiset	ff11c9dcc7	ac: add f16_0 and f16_1 constants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 12:13:05 +01:00
Timothy Arceri	427a6fee43	nir: only override previous alu during loop analysis if supported Users of this function expect alu to be a supported comparison if the induction variable is not NULL. Since we attempt to override the return values if the first limit is not a const, we must make sure we are dealing with a valid comparison before overriding the alu instruction. Fixes an unreachable in inverse_comparison() with the game Assassin's Creed Odyssey. Fixes: `3235a942c1` ("nir: find induction/limit vars in iand instructions") Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110216	2019-03-21 21:51:21 +11:00
Michel Dänzer	6d0a7f798c	gitlab-ci: Use 8 CPU cores in autotools job This cuts down the job runtime from ~9.5 to ~7 minutes with my personal runner on an 8-core Ryzen 7 1700. While this might result in slightly higher load on shared runners, it should be OK, since libtool doesn't use the CPU cores as effectively as e.g. ninja does; a significant part of the CPU load tends to be in bash processes at any time, which should be relatively light on memory. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-21 09:58:31 +01:00
Michel Dänzer	a2cce701e6	gitlab-ci: List some longer-running jobs before others of the same stage This increases the chance of them running earlier, which can have an impact on the total duration of the pipeline. v2: * Minor style fix-up to moved comment (Eric Anholt) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-03-21 09:55:08 +01:00
Samuel Pitoiset	db07f0554a	radv: add missing initializations since VK_EXT_pipeline_creation_feedback This fixes the world. Fixes: `5f5ac19f13` ("radv: Implement VK_EXT_pipeline_creation_feedback.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:42:31 +01:00
Rhys Perry	037f11d42e	radv: enable VK_KHR_8bit_storage Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:27 +01:00
Rhys Perry	3cc72a88d8	ac/nir: implement 8-bit conversions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:25 +01:00
Rhys Perry	c73f8b6576	ac/nir: add 8-bit types to glsl_base_to_llvm_type Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:22 +01:00
Rhys Perry	9c5067acf1	ac/nir: implement 8-bit ssbo stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:20 +01:00
Samuel Pitoiset	b235d77e18	ac: add ac_build_tbuffer_store_byte() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:18 +01:00
Rhys Perry	b12e074b89	ac/nir: implement 8-bit push constant, ssbo and ubo loads Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:16 +01:00
Samuel Pitoiset	104dbc64a5	ac: add ac_build_tbuffer_load_byte() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:14 +01:00
Samuel Pitoiset	6e632eb24b	ac: add various int8 definitions Original patch by Rhys Perry. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:10 +01:00
Tapani Pälli	4e1bbb000c	anv/radv: release memory allocated by glsl types during spirv_to_nir Fixes leaks for each glsl_type generated: ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18 ==32470== at 0x483880B: malloc (vg_replace_malloc.c:309) ==32470== by 0x4C43F4A: ralloc_size (ralloc.c:119) ==32470== by 0x4C44014: rzalloc_size (ralloc.c:151) ==32470== by 0x4C44258: rzalloc_array_size (ralloc.c:215) ==32470== by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:114) ==32470== by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:1146) ==32470== by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501) ==32470== by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269) ==32470== by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018) ==32470== by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365) ==32470== by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490) ==32470== by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173) v2: move release call to vkDestroyInstance v3: apply fix also to radv driver Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-21 08:30:22 +02:00
Jason Ekstrand	6e19348ad1	spirv: Drop inline tg4 lowering Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-03-21 02:58:41 +00:00
Jason Ekstrand	08f804ec0c	anv,radv,turnip: Lower TG4 offsets with nir_lower_tex v2: turn on for turnip as well (Karol Herbst) Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-03-21 02:58:41 +00:00
Karol Herbst	d8a0658d8b	nir/lower_tex: Add support for tg4 offsets lowering Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-21 02:58:41 +00:00
Karol Herbst	99f202432b	nv50/ir/nir: support gather offsets v2: only emit offsets if those are !0 Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-21 02:58:41 +00:00
Karol Herbst	71c66c254b	nir: add support for gather offsets Values inside the offsets parameter of textureGatherOffsets are required to be constants in the range of [GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET, GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET]. As this range is never outside [-32, 31] for all existing drivers inside mesa, we can simply store the offsets as a int8_t[4][2] array inside nir_tex_instr. Right now only Nvidia hardware supports this in hardware, so we can turn this on inside Nouveau for the NIR path as it is already enabled with the TGSI one. v2: use memcpy instead of for loops add missing bits to nir_instr_set don't show offsets if they are all 0 v3: default offsets aren't all 0 v4: rename offsets -> tg4_offsets rename nir_tex_instr_has_explicit_offsets -> nir_tex_instr_has_explicit_tg4_offsets Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-21 02:58:41 +00:00
Dave Airlie	b95b33a5c7	nir/deref: remove casts of casts which are likely redundant (v3) Not sure how ptr_stride should be taken into account if at all here v2: reorder check to avoid src walking (Jason) v3: remove is_cast_cast checks, keep going afterwards (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-21 10:58:06 +10:00
Dave Airlie	3b3653c4cf	nir/spirv: don't use bare types, remove assert in split vars for testing For OpenCL we never want to strip the info from the types, and it makes type comparisons easier in later stages. We might later need a nir pass to strip this for GLSL, but so far the only regression is the assert and Jason said removing that is fine. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-03-21 10:25:40 +10:00
Rafael Antognolli	e7c8402163	iris: Let blorp update the clear color for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:26 -07:00
Rafael Antognolli	93123417dd	iris: Track fast clear color. v2: Update tracked clear color when we update the surface state. v3: Update all aux surface states when updating the clear color. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:26 -07:00
Rafael Antognolli	5658c661de	iris: Stall on the CPU and resolve predication during fast clears. Only if the clear color/depth is changing. In those cases, it's hard to keep track of the current clear color, and aux state of some layers, when predication is enabled. So simplify everything by stalling on the few cases where we would have a fast clear color change with predication. v2: - fix comment (Ken) - explicitly check for predicate state after resolving it (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:26 -07:00
Rafael Antognolli	ce830a364e	iris: Add iris_resolve_conditional_render(). This function can be used to stall on the CPU and resolve the predicate for the conditional render. It will convert ice->state.predicate from IRIS_PREDICATE_STATE_USE_BIT to either IRIS_PREDICATE_STATE_RENDER or IRIS_PREDICATE_STATE_DONT_RENDER, depending on the result of the query. v2: - return void (Ken) - update the stored condition (Ken) - simplify the code leading to resolve the predicate (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	131b42f0aa	iris: Implement fast clear color. If all the restrictions are satisfied, do a fast clear instead of regular clear. v2: - add perf_debug() when we can't fast clear (Ken) - improve comment: s/miptree/resource/ (Ken) - use swizzle_color_value from blorp (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	bd6f51ec21	intel/blorp: Make swizzle_color_value public. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	d97eddff25	intel/isl: Add isl_format_has_color_component() function. v2: Get luminance bits from luminance component (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	7f6344a726	iris: Bring back check for srgb and fast clear color. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	a8b5ea8ef0	iris: Add function to update clear color in surface state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	32c8fa6411	iris: Add helper to convert fast clear color. It needs to be converted to a value that can be used by ISL (and our hardware SURFACE_STATE structure). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	51638cf18a	iris: Fast clear depth buffers. Check and do a fast clear instead of a regular clear on depth buffers. v3: - remove switch with some cases that we shouldn't worry about (Ken) - more parens into the has_hiz check (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	34d00b4410	iris: Use the clear depth when emitting 3DSTATE_CLEAR_PARAMS. Take the clear depth into account when IRIS_DIRTY_DEPTH_BUFFER is marked as dirty. Also update the blorp surface clear color. v2: Use a single if (zres && zres->aux.bo) (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	37f2692591	iris: Allocate buffer space for the fast clear color. Also store clear color in the iris_resource. Always allocate clear color state buffer. v2: - Make clear_color_offset be 64 bits (Ken). - Simplify the logic to decide when to memset the aux buffer (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Bas Nieuwenhuizen	5f5ac19f13	radv: Implement VK_EXT_pipeline_creation_feedback. Does what it says on the tin. The per stage time is only an approximation due to linking and the Vega merged stages. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-20 21:19:46 +00:00
Samuel Pitoiset	72e366b4c2	ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword() New buffer intrinsics have a separate soffset parameter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:19 +01:00
Samuel Pitoiset	9d960c17a8	ac: use new LLVM 8 intrinsic when storing 16-bit values vindex is always 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:14 +01:00
Samuel Pitoiset	2a9d331898	ac: add ac_build_{struct,raw}_tbuffer_store() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:12 +01:00
Samuel Pitoiset	30c2aca67f	ac: use new LLVM 8 intrinsics in ac_build_buffer_load() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:08 +01:00
Samuel Pitoiset	da46dbb1be	ac/nir: use ac_build_buffer_store_dword() for SSBO store operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:06 +01:00
Samuel Pitoiset	6b573c00c9	ac/nir: use ac_build_buffer_load() for SSBO load operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:02 +01:00
Samuel Pitoiset	29132af234	ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations Use the raw version (ie. IDXEN=0) because vindex is unused. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:56 +01:00
Samuel Pitoiset	b39844457f	ac/nir: remove one useless check in visit_store_ssbo() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:54 +01:00
Samuel Pitoiset	a2073f49f1	ac: add ac_build_buffer_store_format() helper Similar to ac_build_buffer_load_format(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:50 +01:00
Samuel Pitoiset	4debe49d44	ac/nir: set attrib flags for SSBO and image store operations For consistency regarding other store operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:37 +01:00
Samuel Pitoiset	1b553dd47f	ac: make use of ac_get_store_intr_attribs() where possible Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:35 +01:00
Dylan Baker	4188dd7879	bin/install_megadrivers.py: Correctly handle DESTDIR='' Currently if destdir is set to '' then the resulting libdir will have its first character replaced by / instead of / being prepended to the string. This was the result of ensuring that DESTDIR wouldn't be ignored if libdir was absolute, since the only cases that meson allows the libdir to be absolute is if the prefix is /, this won't be a problem. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110211 Fixes: `ae3f45c11e` ("bin/install_megadrivers: fix DESTDIR and -D*-path") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-20 20:26:44 +00:00
Juan A. Suarez Romero	efcf9c9f9f	nir: deref only for OpTypePointer Fixes dEQP-VK.binding_model.buffer_device_address.* and dEQP-VK.ssbo.phys.layout* Vulkan CTS tests. v2: set val->type->stride in the section below (Jason) v3: restore val->type->type to original place (Jason) Fixes: `d0ba326f23` ("nir/spirv: support physical pointers") CC: Karol Herbst <kherbst@redhat.com> CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-20 19:26:32 +00:00
Dave Airlie	04189565a0	softpipe: fix texture view crashes I noticed we crashed piglit arb_texture_view-rendering-formats when run on softpipe. This fixes the clear tiles to use the surface format not the underlying storage format. This fixes a bunch of srgb piglits as well. Fixes: `396ac41fc2` (softpipe: add integer support) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-03-21 05:06:07 +10:00
Kenneth Graunke	3c3f250456	nvc0: Skip new update barrier bits I added new barrier bits in `220c1dce1e` and made most drivers skip them. I thought nvc0 was already skipping those but missed the else case here, which does something. So make it explicitly skip like I did everywhere else. Thanks to Ilia for catching this. Fixes: `220c1dce1e` gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits.	2019-03-20 10:30:32 -07:00
Lionel Landwerlin	6601e5d6fc	anv: implement VK_EXT_pipeline_creation_feedback An extension reporting cache hit in the user supplied pipeline cache as well as timing information for creating the pipelines & stages. v2: Don't consider no cache for cache hits (Jason) Rework duration accumulation (Jason) v3: Fold feedback creation writing into pipeline compile functions (Jason/Lionel) v4: Get cache hit information from anv_device_search_for_kernel() (Jason) Only set cache hit from the whole pipeline if all stages also have that bit (Lionel) v5: Always user_cache_hit in anv_device_search_for_kernel() (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-20 16:18:35 +00:00
Rob Clark	70904eb99a	freedreno/ir3/a6xx: fix ssbo comp_swap One line left out of the conversion to ir3 ssbo intrinsics on a6xx. Fixes: `2e4525883f` ir3/compiler: Enable lower_io_offsets pass and handle new SSBO intrinsics Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-20 11:48:13 -04:00
Jason Ekstrand	0b7e5bdbd4	nir: Constant values are per-column not per-component Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-03-20 09:26:56 -05:00
Jason Ekstrand	9a129510f5	anv: Bump maxComputeWorkgroupInvocations We initially set this lower because we didn't have SIMD32 support yet but we've supported SIMD32 for quite some time now. We should bump it up to the real limit. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-20 09:26:56 -05:00
Samuel Pitoiset	4fa61273a8	radv: fix binding transform feedback buffers The mask should be accumulated if two calls are used for binding two buffers at different indexes. Otherwise, the driver only accounts for the last one. Noticed while glancing at this code. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 09:06:40 +01:00
Samuel Pitoiset	f4f0e3a395	ac: use llvm.amdgcn.fract intrinsic for nir_op_ffract Noticed with a Doom shader. 29077 shaders in 15096 tests Totals: SGPRS: 1282125 -> 1282133 (0.00 %) VGPRS: 908716 -> 908616 (-0.01 %) Spilled SGPRs: 24811 -> 24779 (-0.13 %) Code Size: 49048176 -> 48936488 (-0.23 %) bytes Max Waves: 244232 -> 244226 (-0.00 %) Totals from affected shaders: SGPRS: 229584 -> 229592 (0.00 %) VGPRS: 163268 -> 163168 (-0.06 %) Spilled SGPRs: 8682 -> 8650 (-0.37 %) Code Size: 12819572 -> 12707884 (-0.87 %) bytes Max Waves: 24398 -> 24392 (-0.02 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 09:06:35 +01:00
Kenneth Graunke	220c1dce1e	gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits. The glMemoryBarrier() function makes shader memory stores ordered with respect to things specified by the given bits. Until now, st/mesa has ignored GL_TEXTURE_UPDATE_BARRIER_BIT and GL_BUFFER_UPDATE_BARRIER_BIT, saying that drivers should implicitly perform the needed flushing. This seems like a pretty big assumption to make. Instead, this commit opts to translate them to new PIPE_BARRIER bits, and adjusts existing drivers to continue ignoring them (preserving the current behavior). The i965 driver performs actions on these memory barriers. Shader memory stores go through a "data cache" which is separate from the render cache and other read caches (like the texture cache). All memory barriers need to flush the data cache (to ensure shader memory stores are visible), and possibly invalidate read caches (to ensure stale data is no longer visible). The driver implicitly flushes for most caches, but not for data cache, since ARB_shader_image_load_store introduced MemoryBarrier() precisely to order these explicitly. I would like to follow i965's approach in iris, flushing the data cache on any MemoryBarrier() call, so I need st/mesa to actually call the pipe->memory_barrier() callback. Fixes KHR-GL45.shader_image_load_store.advanced-sync-textureUpdate and Piglit's spec/arb_shader_image_load_store/host-mem-barrier on the iris driver. Roland said this looks reasonable to him. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-19 23:43:33 -07:00
Tapani Pälli	3e534489ec	iris: mark switch case fallthrough CID: 1444103 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 08:21:50 +02:00
Tapani Pälli	03cbfbd913	iris: initialize num_cbufs Currently initialized only if 'ish' is non-NULL. CID: 1444106 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 08:20:09 +02:00
Daniel Stone	d258b787fa	panfrost: Properly align stride Handle buffers whose width is not aligned to 16px by padding the stride and storing it accordingly. This does not reject imports for images whose stride is not sufficiently aligned. v2: make sure bo->stride is set on imported buffers, and add missing variable definition. (Tomeu) Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-20 04:20:42 +00:00
Anuj Phogat	2be60e0c73	anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-19 14:42:19 -07:00
Anuj Phogat	85ecd14ef6	i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-19 14:42:02 -07:00
Eric Engestrom	b3aa37046b	gitlab-ci: drop most autotools builds With autotools this close to being not supported anymore, let's not waste half of the CI cycles on it. The default build will catch most issues, and the rest can be tested by the old Travis. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-19 17:40:05 +00:00
Eric Anholt	17115da6ad	v3d: Expose the dma-buf modifiers query. This allows DRI3 to pick between UIF and raster according to whether we're pageflipping or not and whether the pageflipping display can do UIF, avoiding copies for the windowed/composited case that previously was forced to linear. Improves windowed glmark2 -b build:use-vbo=false performance by 30.7783% +/- 13.1719% (n=3)	2019-03-19 08:59:01 -07:00
Eric Anholt	bf6973199d	v3d: Allow the UIF modifier with renderonly. We ask the other side to make a buffer with the right number of pages, and then just store the UIF in it. This avoids an extra silent copy of the buffer from linear to UIF if it gets used for texturing (X11 copy-based swapbuffers, GL compositors).	2019-03-19 08:54:46 -07:00
Eric Anholt	eb5903a908	v3d: Always lay out shared tiled buffers with UIF_TOP set. The samplers are already ready for this, we just needed to make sure that layout chose UIF for level 0.	2019-03-19 08:54:46 -07:00
Andres Gomez	ab28dca033	Revert "glsl: relax input->output validation for SSO programs" This reverts commit `1aa5738e66`. This patch incorrectly assumed that for SSOs no inner interface matching check was needed. From the ARB_separate_shader_objects spec v.25: " With separable program objects, interfaces between shader stages may involve the outputs from one program object and the inputs from a second program object. For such interfaces, it is not possible to detect mismatches at link time, because the programs are linked separately. When each such program is linked, all inputs or outputs interfacing with another program stage are treated as active. The linker will generate an executable that assumes the presence of a compatible program on the other side of the interface. If a mismatch between programs occurs, no GL error will be generated, but some or all of the inputs on the interface will be undefined." This completes the fix from commit: `3be05dd267` ("glsl/linker: don't fail non static used inputs without matching outputs") Fixes: `1aa5738e66` ("glsl: relax input->output validation for SSO programs") Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-19 17:36:20 +02:00
Andres Gomez	422882e78f	glsl/linker: simplify xfb_offset vs xfb_stride overflow check Current implementation uses a complicated calculation which relies on an implicit conversion to check the integral part of 2 division results. However, the calculation actually checks that the xfb_offset is smaller than or a multiple of the xfb_stride. For example, while this is expected to fail, it actually succeeds: " ... layout(xfb_buffer = 2, xfb_stride = 12) out block3 { layout(xfb_offset = 0) vec3 c; layout(xfb_offset = 12) vec3 d; // ERROR, requires stride of 24 }; ... " Fixes: `2fab85aaea` ("glsl: add xfb_stride link time validation") Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-19 17:23:27 +02:00
Andres Gomez	3be05dd267	glsl/linker: don't fail non static used inputs without matching outputs If there is no Static Use of an input variable, the linker shouldn't fail whenever there is no defined matching output variable in the previous stage. From page 47 (page 51 of the PDF) of the GLSL 4.60 v.5 spec: " Only the input variables that are statically read need to be written by the previous stage; it is allowed to have superfluous declarations of input variables." Now, we complete this exception whenever the input variable has an explicit location. Previously, `18004c338f` ("glsl: fail when a shader's input var has not an equivalent out var in previous") took care of the cases in which the input variable didn't have an explicit location. v2: do the location based interface matching check regardless of whether it is a separable program or not (Ilia). Fixes: `1aa5738e66` ("glsl: relax input->output validation for SSO programs") Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-19 17:23:27 +02:00
Andres Gomez	de1bc2d19a	glsl/linker: always validate explicit location among inputs Outputs are always validated when having explicit locations and we were trusting its outcome to catch similar problems with the inputs since, in case of having undefined outputs for existing inputs, we would be already reporting a linker error. However, consider this case: " Shader stage n: --------------- ... layout(location = 0) out float a; ... Shader stage n+1: ----------------- ... layout(location = 0) in float b; layout(location = 0) in float c; ... " Currently, this won't report a linker error even though location aliasing is happening for the inputs. Therefore, we also need to validate the inputs independently from the outcome of the outputs validation. Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-19 17:23:27 +02:00
Andres Gomez	a96093136b	glsl: correctly validate component layout qualifier for dvec{3,4} From page 62 (page 68 of the PDF) of the GLSL 4.50 v.7 spec: " A dvec3 or dvec4 can only be declared without specifying a component." Therefore, using the "component" qualifier with a dvec3 or dvec4 should result in a compiling error. v2: enhance the error message (Timothy). Fixes: `94438578d2` ("glsl: validate and store component layout qualifier in GLSL IR") Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-19 17:23:27 +02:00
Jason Ekstrand	cbfe31ccbe	Revert "nir: const `nir_call_instr::callee`" This reverts commit `db57db5317`. When building IR, nothing is really immutable and, since C has no concept of constness propagating beyond the first pointer, we have to be very careful with how we use it. To just throw const into a function like this is a lie. Instead, we should just drop the unneeded const in spirv_to_nir which this commit does along with the revert.	2019-03-19 10:19:42 -05:00
Eric Engestrom	43b6dd05f7	gitlab-ci: add clang build `clang` has a different set of warnings and errors than `gcc`, so it's useful to do at least a generic pass over Mesa with it. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-19 12:59:38 +00:00
Eric Engestrom	db57db5317	nir: const `nir_call_instr::callee` Fixes: `c95afe56a8` "nir/spirv: handle kernel function parameters" Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-03-19 12:51:53 +00:00
Rafael Antognolli	76f9ca6cf9	iris: Make intel_hiz_exec public. Need to use it for fast clearing depth buffers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-18 22:27:02 -07:00
Rafael Antognolli	9c63ec26ea	iris: Enable HiZ for multisampled depth surfaces. Fix this check so that we can get a HiZ aux buffer for multisampled surfaces as well. Also make sure we don't try to emit a sampler view surface state for multisampled depth surfaces with HiZ enabled, as the sampler can't HiZ for multisampled buffers and isl would assert. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-18 22:21:30 -07:00
Karol Herbst	d0ba326f23	nir/spirv: support physical pointers v2: add load_kernel_input Signed-off-by: Karol Herbst <kherbst@redhat.com> squash! nir/spirv: support physical pointers	2019-03-19 04:08:07 +00:00
Karol Herbst	c95afe56a8	nir/spirv: handle kernel function parameters the idea here is to generate an entry point stub function wrapping around the actual kernel function and turn all parameters into shader inputs with byte addressing instead of vec4. This gives us several advantages: 1. calling kernel functions doesn't differ from calling any other function 2. CL inputs match uniforms in most ways and we can just take advantage of most of nir_lower_io v2: move code into a separate function v3: verify the entry point got a name fix minor typo v4: make vtn_emit_kernel_entry_point_wrapper take the old entry point as an arg Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-19 04:08:07 +00:00
Karol Herbst	0ccdf23a57	nir/lower_locals_to_regs: cast array index to 32 bit local memory is too small to require 64 bit pointers, so cast the array index to a 32 bit value to save up on 64 bit operations. Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-19 04:08:07 +00:00
Karol Herbst	44d32e62fb	glsl: add cl_size and cl_alignment Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-03-19 04:08:07 +00:00
Karol Herbst	659f333b3a	glsl: add packed for struct types We need this for OpenCL kernels because we have to apply C rules for alignment and padding inside structs and for this we also have to know if a struct is packed or not. v2: fix for kernel params Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-03-19 04:08:07 +00:00
Alyssa Rosenzweig	b98955e128	panfrost: Rewrite varying assembly There are two stages to varying assembly in the command stream: creating the varying buffers in the command stream, and creating the varying meta descriptors (also in the command stream) linked to the aforementioned buffers. The previous code for this was ad hoc and brittle, making some invalid assumptions causing unmaintainable workarounds to pile up across the driver (both compiler and command stream side). This patch completely rewrites the varying assembly code. There's a trivial performance penalty (we now memcpy the varying meta to the command stream on draw, rather than on compile). That said, the improvement in flexibility and clarity is well-worth it. The motivator for these changes was support for gl_PointCoord (and eventually point sprites for legacy GL), which was impossible to implement with the old varying assembly code. With the new refactor, it's super easy; support for gl_PointCoord is included with this patch. All in all, I'm quite happy with how this turned out. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-19 03:55:10 +00:00
Alyssa Rosenzweig	5e6d33a7b6	panfrost: Replay more varying buffers This is required for gl_PointCoord to show up on decodes. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-19 03:53:56 +00:00
Alyssa Rosenzweig	b517e36842	panfrost/decode: Respect primitive size pointers Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-19 03:53:48 +00:00
Alyssa Rosenzweig	4f89e4437c	panfrost: Disable PIPE_CAP_TGSI_TEXCOORD I don't know why this was on to begin with...? Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-19 03:52:43 +00:00
Alyssa Rosenzweig	7c02c4f114	panfrost: Fix primconvert check In addition to fixing actual primconvert bugs, this prevents an infinite loop when trying to draw POINTS. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-19 03:52:20 +00:00
Alyssa Rosenzweig	60d5b85261	panfrost: Workaround buffer overrun with mip level Mipmaps are still broken, but at least this way we don't crash on some apps using mipmaps. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-19 03:50:59 +00:00
Bas Nieuwenhuizen	a777c3d7cb	radv: Use correct image view comparison for fast clears. The if is actually returning true on success, enabling fast clears, so we need to have the test succeed when the iview dimensions are right. Fixes: `d5400a5ec2` "radv: provide a helper for comparing an image extents." Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-03-19 00:39:47 +01:00
Jason Ekstrand	493b3ada9b	anv,radv: Implement VK_KHR_surface_capability_protected Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-03-18 17:02:10 +00:00
Danylo Piliaiev	ecb98c6898	anv: Treat zero size XFB buffer as disabled Vulkan spec doesn't explicitly forbid zero size transform feedback buffers. Having zero size xfb caused SurfaceSize overflow and triggered assert in debug build. The only way to have zero size SO_BUFFER is to disable SO_BUFFER as stated in hardware spec. From SKL PRM, Vol 2a, "3DSTATE_SO_BUFFER": "If set, stream output to SO Buffer is enabled, if 3DSTATE_STREAMOUT::SO Function ENABLE is also enabled. If clear, the SO Buffer is considered "not bound" and effectively treated as a zero- length buffer for the purposes of SO output and overflow detection. If an enabled stream's Stream to Buffer Selects includes this buffer it is by definition an overflow condition. That stream will cause no writes to occur, and only SO_PRIM_STORAGE_NEEDED[<stream>] will increment." Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-18 16:09:42 +00:00
Emil Velikov	f5b71b18ef	docs: update calendar, add news item and link release notes for 18.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-18 16:02:27 +00:00
Emil Velikov	d4e26b36b2	docs: add sha256 checksums for 18.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ec770b43b9`)	2019-03-18 15:58:06 +00:00
Emil Velikov	cb9fe1e89b	docs: add release notes for 18.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `022708cb40`)	2019-03-18 15:58:05 +00:00
Bas Nieuwenhuizen	d1aa37dfff	radv: Implement VK_EXT_host_query_reset. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-03-18 14:48:41 +00:00
Jason Ekstrand	887041c763	anv: Implement VK_EXT_host_query_reset Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-18 14:48:41 +00:00
Bas Nieuwenhuizen	42ea88c673	vulkan: Update the XML and headers to 1.1.104 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-18 14:48:41 +00:00
Bas Nieuwenhuizen	eb5cda1c3e	vulkan/util: Handle enums that are in platform-specific headers. VkFullScreenExclusiveEXT comes from the win32 header. Mostly took the logic from the entrypoint scripts: 1) If there is an ext that has it in the requires and has a platform, take the guard for that platform. 2) Otherwise assume it is from the core headers. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-03-18 14:48:41 +00:00
Lionel Landwerlin	5abe488d18	vulkan: factor out wsi dependencies In commit `530927d3f6` ("vulkan/util: generate instance/device dispatch tables") we started generating instance dispatch tables some of them (like wayland) require external headers. This commit moves the dependencies up one level so that they apply the whole vulkan directory. We use them for both the util & overlay layer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `530927d3f6` ("vulkan/util: generate instance/device dispatch tables") Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-03-18 12:05:13 +00:00
Tapani Pälli	791198a54b	android: Build fixes for OMR1 Some of the header file locations are changed between Android versions (when VNDK is used), patch makes sure we get all the required headers. v2: cleanups, put SDK version checks in all places (Tapani) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Chen Lin Z <lin.z.chen@intel.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-18 11:53:59 +02:00
Bas Nieuwenhuizen	8ebc7dcb59	radv: Allow fast clears with concurrent queue mask for some layouts. For VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL and VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL we do not care about the queue mask because 1) using these is only allowed on the gfx queue 2) transitions for these are only allowed on the gfx queue. This enables some fast clears for Doom that uses VK_SHARING_MODE_CONCURRENT. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-18 09:10:55 +00:00
Kenneth Graunke	d5974aeeae	iris: Slightly better bounds on buffer sizes	2019-03-18 01:39:43 -07:00
Kenneth Graunke	836b47ca4e	iris: Don't flush the batch for unsynchronized mappings I messed this up when adding the GPU copy path.	2019-03-18 01:02:18 -07:00
Tapani Pälli	a1cd0040b6	isl: fix automake build when sse41 is not supported Fixes: `864cc419eb` "intel/isl: move tiled_memcpy static libs from i965 to isl" Cc: mesa-stable@lists.freedesktop.org Reported-by: Milav Soni <milav.soni@teqdiligent.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-18 08:51:57 +02:00
Brian Paul	f7332fbc08	gallium/util: remove pipe_sampler_view_release() It's no longer used. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	c473090b09	i915g: remove calls to pipe_sampler_view_release() As with previous patches for svga, llvmpipe, swr drivers. Compile tested only. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	768b770a86	swr: remove call to pipe_sampler_view_release() As with svga, llvmpipe drivers in previous patches. Compile tested only. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	2ff2a58774	llvmpipe: stop using pipe_sampler_view_release() This was used to avoid freeing a sampler view which was created by a context that was already deleted. But the state tracker does not allow that. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	a7afab7952	svga: stop using pipe_sampler_view_release() This function was used in the past to avoid deleting a sampler view for a context that no longer exists. But the Mesa state tracker ensures that cannot happen. Use the standard refcounting function instead. Also, remove the code which checked for context mis-matches in svga_sampler_view_destroy(). It's no longer needed since implementing the zombie sampler view code in the state tracker. Testing Done: google chrome, variety of GL demos/games Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	20de0359b5	st/mesa: stop using pipe_sampler_view_release() In all instances here we can replace pipe_sampler_view_release(pipe, view) with pipe_sampler_view_reference(view, NULL) because the views in question are private to the state tracker context. So there's no danger of freeing a sampler view with the wrong context. Testing done: google chrome, misc GL demos, games Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	41c4c49463	st/mesa: implement "zombie" shaders list As with the preceding patch for sampler views, this patch does basically the same thing but for shaders. However, reference counting isn't needed here (instead of calling cso_delete_XXX_shader() we call st_save_zombie_shader()). The Redway3D Watch is one app/demo that needs this change. Otherwise, the vmwgfx driver generates an error about trying to destroy a shader ID that doesn't exist in the context. Note that if PIPE_CAP_SHAREABLE_SHADERS = TRUE, then we can use/delete any shader with any context and this mechanism is not used. Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos and a few Linux games. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	593e36f956	st/mesa: implement "zombie" sampler views (v2) When st_texture_release_all_sampler_views() is called the texture may have sampler views belonging to several contexts. If we unreference a sampler view and its refcount hits zero, we need to be sure to destroy the sampler view with the same context which created it. This was not the case with the previous code which used pipe_sampler_view_release(). That function could end up freeing a sampler view with a context different than the one which created it. In the case of the VMware svga driver, we detected this but leaked the sampler view. This led to a crash with google-chrome when the kernel module had too many sampler views. VMware bug 2274734. Alternately, if we try to delete a sampler view with the correct context, we may be "reaching into" a context which is active on another thread. That's not safe. To fix these issues this patch adds a per-context list of "zombie" sampler views. These are views which are to be freed at some point when the context is active. Other contexts may safely add sampler views to the zombie list at any time (it's mutex protected). This avoids the context/view ownership mix-ups we had before. Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos and a few Linux games. If anyone can recommend some other multi-threaded, multi-context GL apps to test, please let me know. v2: avoid potential race issue by always adding sampler views to the zombie list if the view's context doesn't match the current context, ignoring the refcount. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-03-17 20:07:22 -06:00
Brian Paul	e547a1ccb5	docs: link to the meson_options.txt file gitlab.freedesktop.org	2019-03-17 20:07:22 -06:00
Brian Paul	16fb82d189	docs: separate information for compiler selection and compiler options Split up the "Environment Variables" section into "Compiler Options" and "Compiler Specification". I think this makes the information easier to find and understand.	2019-03-17 20:07:22 -06:00
Mauro Rossi	bfba0ecc1c	android: nouveau: add support for nir Add the necessary build rules for android, to avoid building errors. Fixes: `f014ae3` ("nouveau: add support for nir") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-03-18 00:29:39 +01:00
Timothy Arceri	010570c8e3	ac/nir_to_llvm: add assert to emit_bcsel() nir to llvm assumes we have already split vectors to scalars via nir_lower_alu_to_scalar(). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-18 09:39:04 +11:00
Timothy Arceri	de8ec6e117	radeonsi/nir: call some more var optimisation passes shader-db results (VEGA64): Totals from affected shaders: SGPRS: 5328912 -> 5329680 (0.01 %) VGPRS: 2969308 -> 2969164 (-0.00 %) Spilled SGPRs: 37921 -> 37917 (-0.01 %) Spilled VGPRs: 32882 -> 29024 (-11.73 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 1400 -> 1200 (-14.29 %) dwords per thread Code Size: 121126000 -> 121282784 (0.13 %) bytes LDS: 1501 -> 1501 (0.00 %) blocks Max Waves: 933188 -> 933229 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-18 09:29:40 +11:00
Tobias Klausmann	29179f58c6	vulkan/util: meson build - add wayland client include Without this the build breaks with: In file included from ../src/vulkan/util/vk_util.h:32, from ../src/vulkan/util/vk_util.c:28: ../include/vulkan/vulkan.h:51:10: fatal error: wayland-client.h: No such file or directory #include <wayland-client.h> ^~~~~~~~~~~~~~~~~~ compilation terminated. The above misses the include directory for wayland: -I/usr/include/wayland Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-17 17:55:29 +00:00
Karol Herbst	58376c6b9b	nv50ir/nir: move immediates before use Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 17:14:54 +01:00
Karol Herbst	4ded1cdef9	nv50/ir/nir: handle user clip planes for each emitted vertex v9: convert to C++ style comments handle for tess eval shaders as well Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 17:14:21 +01:00
Karol Herbst	b866012f7b	nv50/ir/nir: implement intrinsic shader_clock v9: mark as fixed Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	c00d45cb45	nv50/ir/nir: implement load_per_vertex_output v4: use smarter getIndirect helper use new getSlotAddress helper v5: use loadFrom helper v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	9c44f4e043	nv50/ir/nir: add memory barriers v5: add more barrier intrinsics Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	753ae68ca0	nv50/ir/nir: implement images v3: fix compiler warnings v4: use loadFrom helper v5: fix signed min/max v6: set tex mask add support for indirect image access set cache mode v7: make compatible with `884d27bcf6` rework the whole deref thing to prepare for bindless v8: port to deref instructions don't require C++11 features v9: implement MS images rebase on master (image modifiers) fix regressions due to variable src components replace '(*it).' with 'it->' convert to C++ style comments Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	2cdcb364f0	nv50/ir/nir: implement ssbo intrinsics v4: use loadFrom helper v5: support indirect buffer access v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	8dca02955a	nv50/ir/nir: implement nir_intrinsic_load_ubo v4: use loadFrom helper v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	1bef2b7bf5	nv50/ir/nir: implement geometry shader nir_intrinsics v4: use smarter getIndirect helper use new getSlotAddress helper use loadFrom helper v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	d2de40f07e	nv50/ir/nir: implement variable indexing We store those arrays in local memory and reserve some space for each of the arrays. With NIR we could store those arrays packed, but we don't do that yet as it causes MemoryOpt to generate unaligned memory accesses. v3: use fixed size vec4 arrays until we fix MemoryOpt v4: fix for 64 bit types v5: use loadFrom helper v8: don't require C++11 features v9: convert to C++ style comments Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	fa361a3c1e	nv50/ir/nir: implement vote and ballot v2: add vote_eq support use the new subop intrinsic helper add ballot v3: add read_(first_)invocation v8: handle vectorized intrinsics don't require C++11 features v9: lower_subgroups to 32 bit (produces less instructions) use getSSA and getScratch instead of new_LValue Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	4dec7f81e0	nv50/ir/nir: add skeleton getOperation for intrinsics v7: don't assert in default case for getSubOp Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	bb032d8b62	nv50/ir/nir: implement nir_instr_type_tex a lot of those fields are not valid for a lot of tex ops. Not quite sure if it's worth the effort to check for those or just keep it like that. It seems to kind of work. v2: reworked offset handling add tex support with indirect R/S arguments handle GLSL_SAMPLER_DIM_EXTERNAL drop reference in convert(glsl_sampler_dim&, bool, bool) fix tg4 component selection v5: fill up coords args with scratch values if coords provided is less than TexTarget.getArgCount() v7: prepare for bindless_texture support v8: don't require C++11 features v9: convert to C++ style comments fix txf with a uniform constant 0 lod Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	83cb790bf0	nv50/ir/nir: implement nir_ssa_undef_instr v2: use mkOp v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	ad61f7e20d	nv50/ir/nir: implement loading system values v2: support more sys values fixed a bug where for multi component reads all values ended up in x v3: add load_patch_vertices_in v4: add subgroup stuff v5: add helper invocation v6: fix loading 64 bit system values v8: don't require C++11 features v9: convert to C++ style comments Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	b05494c216	nv50/ir/nir: implement intrinsic_discard(_if) v9: use getSSA instead of new_LValue Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	9e68b7bef2	nv50/ir/nir: implement load_(interpolated_)input/output v3: and load_output v4: use smarter getIndirect helper use new getSlotAddress helper v5: don't use const_offset directly fix for indirects v6: add support for interpolateAt v7: fix compiler warnings add load_barycentric_sample handle load_output for fragment shaders v8: set info->prop.fp.readsSampleLocations for at_sample interpolation don't require C++11 features v9: convert to C++ style comments Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	6bc32bf653	nv50/ir/nir: implement nir_intrinsic_store_(per_vertex_)output v3: add workaround for RA issues indirects have to be multiplied by 0x10 fix indirect access v4: use smarter getIndirect helper use storeTo helper v5: don't use const_offset directly v8: don't require C++11 features v9: convert to C++ style comments handle clip planes correctly Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	8c257a0201	nv50/ir/nir: implement nir_intrinsic_load_uniform v2: use new getIndirect helper fixes symbols for 64 bit types v4: use smarter getIndirect helper simplify address calculation use loadFrom helper v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	6513c675ad	nv50/ir/nir: implement nir_alu_instr handling v2: use bitfield_insert instead of bfi rework switch helper macros remove some lowering code (LoweringHelper is now used for this) v3: add pack_half_2x16_split add unpack_half_2x16_split_x/y v5: replace first argument with nullptr in loadImm calls prefer getSSA over getScratch v8: fix setting precise modifier for first instruction inside a block add guard in case no instruction gets inserted into an empty block don't require C++11 features v9: use CC_NE for integer compares convert to C++ style comments fix b2f for doubles remove macros around nir ops to make it easier to grep them add handling for fpow Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	c69b814728	nv50/ir/nir: add skeleton for nir_intrinsic_instr Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	8379dc300d	nv50/ir/nir: implement nir_load_const_instr v8: fix loading 8/16 bit constants Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	88c909e9a7	nv50/ir/nir: parse NIR shader info v2: parse a few more fields v3: add special handling for GL_ISOLINES v8: set info->prop.fp.readsSampleLocations don't require C++11 features v9: replace '(*it).' with 'it->' convert to C++ style comments Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	e8d9be40cb	nv50/ir/nir: add loadFrom and storeTo helper v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	39929a8164	nv50/ir/nir: run assignSlots v2: add support for geometry shaders set idx add some missing mappings fix for 64bit inputs/outputs fix up some FP color output index mix-up parse centroid flag v3: fix arrays in outputs as well fix input/output size calculation for tessellation shaders v4: add getSlotAddress helper fix for 64 bit typed inputs v5: change getSlotAddress interface for easier use fix sample inputs fix slot counting for mat v7: fix driver_location of images v8: don't require C++11 features v9: convert to C++ style comments support VERT_ATTRIB_POINT_SIZE add more error checking to slots Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	ccc4de0bdd	nv50/ir/nir: add nir type helper functions v4: treat imul as unsigned v5: remove pointless !! v7: inot is unsigned as well v8: don't require C++11 features v9: convert to C++ style comments improve formatting print error in all cases where codegen doesn't support a given type Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	7481abcd0c	nv50/ir/nir: track defs and provide easy access functions v2: add helper function for indirects v4: add new getIndirect overload for easier use v5: use getSSA for ssa values we can just create the values for unassigned registers in getSrc v6: always create at least 32 bit values v8: don't require C++11 features v9: include unordered_map on supported stdlibs replace '(*it).' with 'it->' Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:28 +01:00
Karol Herbst	9298664a5f	nv50/ir/nir: run some passes to make the conversion easier v2: add constant_folding v6: print non final NIR only for verbose debugging v8: add passes we will need for OpenCL compute shaders v9: move type_size into anonymous namespace convert to C++ style comments lower bools to int32 Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	78c5336ca9	nouveau: fix nir and TGSI shader cache collision v9: rename variable to driver_flags use constants for shader cache flags Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	f014ae3c7c	nouveau: add support for nir not all those nir options are actually required, it just made the work a little easier. v2: fix asserts parse compute shaders don't lower bitfield_insert v3: fix memory leak v4: don't lower fmod32 v5: set lower_all_io_to_temps to false fix memory leak because we take over ownership of the nir shader merge: use the lowering helper v6: include TGSI debug header for proper assert call add nv50 support v7: fix Automake build v8: free shader only for the set shader type v9: check for IR type inside get_compiler_options squash "nouveau: add env var to make nir default" fix memory leak when creating compute shaders use debug_get_bool_option as it is available in non debug builds return failure if unsupported IR is encountered don't lower fpow in nir lower int 64 divmod inside nir to prevent crashes Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	a211c92c4b	nv50/ir: add lowering helper if we start supporting multiple input IRs we might want to move lowering code into a common place and keep the initial translation simpler. This will also allow us to react to ISA changes more easily. v5: also handle SAT v6: rename type variables fixed lowering of NEG add lowering of NOT v8: don't require C++11 features Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	a0393010c4	nv50/ir: move common converter code in base class v2: remove TGSI related bits Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-03-17 10:33:28 +01:00
Karol Herbst	bb50cb66f0	nvc0: print the shader type when dumping headers this makes debugging the shader header a little easier Acked-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-17 10:33:27 +01:00
Bas Nieuwenhuizen	213de3ea99	radeonsi: Remove implicit const cast. Fixes: `b9e02fe138` "gallium: add pipe_grid_info::last_block" Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-03-17 00:07:38 +01:00
Bas Nieuwenhuizen	158d45db0c	gitlab-ci: Build turnip. No autotools build to care about. The half baked turnips param is kind of ugly, but felt like a waste defining more variables for it now. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-16 14:38:51 +00:00
Bas Nieuwenhuizen	42ed6d9789	turnip: Deconflict vk_format_table regeneration Avoids src/freedreno/vulkan/meson.build:42:0: ERROR: Tried to create target "vk_format_table.c", but a target of that name already exists. when building both radv and turnip. Fixes: `26380b3a9f` "turnip: Add driver skeleton (v2)" Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-16 14:38:51 +00:00
Bas Nieuwenhuizen	e1161d2ea7	turnip: Fix GCC compiles. Apparently GCC does not consider static const variables to be integer constants, and hence the array size and the static assert result in compile failures. Fixes: `4b9f967cd1` "turnip: add a more complete format table" Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-16 14:38:51 +00:00
Jason Ekstrand	d3386e73c5	intel/nir: Lower array-deref-of-vector UBO and SSBO loads This fixes a serious performance issue with DXVK: https://github.com/doitsujin/dxvk/issues/937 This was caused by a recent change intended to improve performance on RADV which back-fired on ANV and killed performance for some apps: `e5a06d3f4a` Throwing in this bit of lowering lets us come along and CSE those UBO loads (or copy-prop for SSBO load) and get one load where we previously would have gotten several. VkPipeline-db results on Kaby Lake: total instructions in shared programs: 5115361 -> 5073185 (-0.82%) instructions in affected programs: 1754333 -> 1712157 (-2.40%) helped: 5331 HURT: 63 total cycles in shared programs: 2544501169 -> 2481144545 (-2.49%) cycles in affected programs: 2531058653 -> 2467702029 (-2.50%) helped: 9202 HURT: 4323 total loops in shared programs: 3340 -> 3331 (-0.27%) loops in affected programs: 9 -> 0 helped: 9 HURT: 0 total spills in shared programs: 3246 -> 3053 (-5.95%) spills in affected programs: 384 -> 191 (-50.26%) helped: 10 HURT: 5 total fills in shared programs: 4626 -> 4452 (-3.76%) fills in affected programs: 439 -> 265 (-39.64%) helped: 10 HURT: 5 All of the shaders with hurt spilling were in Rise of the Tomb Raider which also had shaders solidly helped in the spilling department. Not shown in those results (because I've not had success dumping the shaders) is Witcher 3 where this reduces spilling and improves over-all perf by around 20-25%. There were no shader-db changes. Apparently, this just isn't a pattern that happens in OpenGL. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: "19.0" mesa-stable@lists.freedesktop.org	2019-03-15 23:10:27 -05:00
Jason Ekstrand	35b8f6f40b	nir: Add a new pass to lower array dereferences on vectors This pass was originally written for lowering TCS output reads and writes but it is also applicable just about anything including UBOs, SSBOs, and shared variables. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 23:10:27 -05:00
Jason Ekstrand	fe9a6c0f14	nir/builder: Add a vector extract helper This one's a tiny bit better than what we had in spirv_to_nir because it emits a binary tree rather than a linear walk. It also doesn't leave around unneeded bcsel instructions for a constant index and returns an undef for constant OOB access. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 23:10:26 -05:00
Gert Wollny	9bb63e9a7c	softpipe: Enable PIPE_CAP_MIXED_COLORBUFFER_FORMATS It seems softpipe actually supports this. This change enables the following piglits as passing without regressions in the gpu test set: gl-3.1-mixed-int-float-fbo gl-3.1-mixed-int-float-fbo int_second fbo-blending-format-quirks Changes for deqp: dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_none_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_none_rbo QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_none_tex QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_rbo_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.rbo_tex_tex_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_none_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_none_rbo QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_none_tex QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_rbo_none QualityWarning -> Pass dEQP-GLES2.functional.fbo.completeness.attachment_combinations.tex_rbo_tex_none QualityWarning -> Pass dEQP-GLES3.functional.fbo.completeness.samples.rbo0_rbo0_tex Fail -> Pass dEQP-GLES3.functional.fbo.completeness.samples.rbo0_tex_none Fail -> Pass dEQP-GLES3.functional.fbo.completeness.samples.rbo1_rbo1_rbo1 Fail -> Pass dEQP-GLES3.functional.fragment_out.random.* NotSupported -> Pass dEQP-GLES31.functional.shaders.builtin_functions.common.frexp._fragment Fail -> Pass dEQP-GLES31.functional.shaders.builtin_functions.common.frexp._vertex Fail -> Pass dEQP-GLES31.functional.shaders.builtin_functions.precision.frexp._fragment. Fail -> Pass dEQP-GLES31.functional.shaders.builtin_functions.precision.frexp._vertex. 
Fail -> Pass Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-15 19:04:05 +01:00
Rob Clark	ca11f9263e	freedreno/ir3/cp: fix ldib bug Something that we didn't hit earlier because of the extra shr.b Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-15 10:52:11 -07:00
James Zhu	abfd572bd2	gallium/auxiliary/vl: Change weave compute shader implementation Use 2D_ARRAY instead of RECT to fetch texels for weave compute shader. Problem 2,3: Fixed interpolation issue with weave de-interlace Fixes: `9364d66cb7` (Add video compositor compute shader render) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com>	2019-03-15 11:53:15 -04:00
James Zhu	a8ee07d83e	gallium/auxiliary/vl: Change grid setting Using draw area for grid setting instead of destination buffer size. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com>	2019-03-15 11:53:15 -04:00
James Zhu	998dca4dbb	gallium/auxiliary/vl: Increase shader_params size Increase shader_params size to pass sampler data to compute shader during weave de-interlace. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com>	2019-03-15 11:53:15 -04:00
Marek Olšák	b276e8358a	omx: add a compute path in enc_LoadImage_common Acked-by: Leo Liu <leo.liu@amd.com>	2019-03-15 11:53:08 -04:00
Marek Olšák	323e7be91c	omx: clean up enc_LoadImage_common - add *pipe - add documentation Acked-by: Leo Liu <leo.liu@amd.com>	2019-03-15 11:53:08 -04:00
Marek Olšák	b9e02fe138	gallium: add pipe_grid_info::last_block The OpenMAX state tracker will use this. RadeonSI is adapted to use pipe_grid_info::last_block instead of its internal state. Acked-by: Leo Liu <leo.liu@amd.com>	2019-03-15 11:53:08 -04:00
Alejandro Piñeiro	34b3b92bbe	nir/xfb: move varyings info out of nir_xfb_info When varyings were added we moved to dynamically allocated pointers instead of allocating just one block for everything. That breaks some assumptions made by some Vulkan drivers (like anv) that make serialization and copying easier. At the same time, varyings are not needed for Vulkan, so this commit moves them out. Although it seems a bit of an overkill, fixing the anv side would require similar or more changes, so in the end it is a matter of deciding where we want to put our effort. v2: (from Jason review) * Don't use a temp variable on the _create methods, just return result of rzalloc_size * Wrap some lines too long. Fixes: `cf0b2ad486` ("nir/xfb: adding varyings on nir_xfb_info and gather_info") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-15 11:59:32 +01:00
Samuel Pitoiset	d5befdbe4a	radv: always load 3 channels for formats that need to be shuffled This fixes a rendering issue with Hellblade and DXVK. Fixes: `a66b186beb` ("radv: use typed buffer loads for vertex input fetches") Reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-15 11:35:52 +01:00
Mathias Fröhlich	ebc15ecde5	mesa: Add assert to _mesa_primitive_restart_index. Make sure the index_size parameter is meant to be in bytes. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	d66faa54b2	vbo: Fix GL_PRIMITIVE_RESTART_FIXED_INDEX in display list compiles. The maximum value primitive restart index is different for each index data type. Use the appropriate fixed restart index value. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	a503f0562a	vbo: Fix basevertex handling in display list compiles. The standard requires that the primitive restart comparison happens before the basevertex value is added. Do this now, and add a reference to the standard explaining why this happens at this place. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	94b64eb462	mesa: Use mapping tools in debug prints. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	a8183c1334	mesa: Remove _ae_{,un}map_vbos and dependencies. Since mapping and unmapping the buffer objects in a VAO is handled directly from the VAO, this part of the _NEW_ARRAY state is no longer used. So remove this part of array element state. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	b89ae55a70	mesa: Replace _ae_{,un}map_vbos with _mesa_vao_{,un}map_arrays Due to the use of bitmaps, the _mesa_vao_{,un}map_arrays functions should provide comparable runtime efficiency to the currently used _ae_{,un}map_vbos functions. So use these functions and enable further cleanup. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	b43fae364f	mesa: Use _mesa_array_element in dlist save. Make use of the newly factored out _mesa_array_element function in display list compilation. For now that duplicates out the primitive restart logic. But that turns out to need a fix in display list handling anyhow. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	80e319485a	mesa: Factor out _mesa_array_element. The factored out function handles emitting the vertex attributes at the given index. The now public accessible function gets used in the following patches. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Mathias Fröhlich	85fd380878	mesa: Implement helper functions to map and unmap a VAO. Provide a set of functions that maps or unmaps all VBOs held in a VAO. The functions will be used in the following patches. v2: Update comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-15 06:06:42 +01:00
Jason Ekstrand	efa4fc0ebd	st/mesa: Let NIR lower UBO and SSBO access when we have it Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	be2990d8fb	i965: Stop setting LowerBuferInterfaceBlocks Instead, we do UBO and SSBO deref lowering in NIR after we've given it a chance to optimize SSBO access: Shader-db results on Kaby Lake: total instructions in shared programs: 15235775 -> 15235484 (<.01%) instructions in affected programs: 14992 -> 14701 (-1.94%) helped: 19 HURT: 20 total cycles in shared programs: 339220331 -> 339027307 (-0.06%) cycles in affected programs: 79831981 -> 79638957 (-0.24%) helped: 540 HURT: 602 total loops in shared programs: 4402 -> 4348 (-1.23%) loops in affected programs: 186 -> 132 (-29.03%) helped: 27 HURT: 0 total spills in shared programs: 23261 -> 23234 (-0.12%) spills in affected programs: 38 -> 11 (-71.05%) helped: 1 HURT: 0 total fills in shared programs: 31442 -> 31371 (-0.23%) fills in affected programs: 98 -> 27 (-72.45%) helped: 1 HURT: 0 LOST: 12 GAINED: 12 Most of the help and hurt in instruction counts was just churn caused by re-ordering of optimizations and the fact that the NIR deref lowering code is emitting slightly different instructions. Nothing was hurt by more than three instructions and most things weren't helped by more than four. The primary exception to this is one Car Chase shader: shaders/non-free/gfxbench4/carchase/341.shader_test CS SIMD32: 1144 -> 821 (-28.23%) There is also one compute shader in Manhattan 3.1 and a fragment shader in the UE4 Shooter Game demo that now get a loop partially unrolled. Those showed up in the results as hurt instructions but were manually removed to get the results above. 
The lost/gained was a dozen Car Chase shaders that went from SIMD8 to SIMD16 thanks to improved register pressure: shaders/non-free/gfxbench4/carchase/366.shader_test CS shaders/non-free/gfxbench4/carchase/368.shader_test CS shaders/non-free/gfxbench4/carchase/370.shader_test CS shaders/non-free/gfxbench4/carchase/372.shader_test CS shaders/non-free/gfxbench4/carchase/376.shader_test CS shaders/non-free/gfxbench4/carchase/378.shader_test CS shaders/non-free/gfxbench4/carchase/380.shader_test CS shaders/non-free/gfxbench4/carchase/382.shader_test CS shaders/non-free/gfxbench4/carchase/384.shader_test CS shaders/non-free/gfxbench4/carchase/388.shader_test CS shaders/non-free/gfxbench4/carchase/4.shader_test CS shaders/non-free/gfxbench4/carchase/6.shader_test CS Given how much it appeared to be improved, I ran Car Chase on my laptop. Unfortunately, I wasn't able to see any measurable improvement. It might be helped by 1-2% but it's in the noise. It does render correctly as far as I can tell so the improvement is legitimate. All of the loops that got deleted were in dolphin uber shaders. I've had no opportunity to test them for correctness or performance. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	810dde2a6b	glsl/nir: Add a pass to lower UBO and SSBO access Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	77e5ec394e	glsl/nir: Handle unlowered SSBO atomic and array_length intrinsics We didn't have any of these before because all NIR consumers always called lower_ubo_references. Soon, we want to pass the derefs straight through to NIR so we need to handle these intrinsics directly. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	76ba225184	glsl/nir: Set explicit types on UBO/SSBO variables We want to be able to use variables and derefs for UBO/SSBO access in NIR. In order to do this, the rest of NIR needs to know the type layout information. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	8f3ab8aa78	glsl: Don't lower vector derefs for SSBOs, UBOs, and shared All of these are backed by some sort of memory so if you have multiple threads writing to different components of the same vector at the same time, the load-vec-store pattern that GLSL IR emits won't work. This shouldn't affect any drivers today as they all call GLSL IR lowering which lowers access to these variables to index+offset intrinsics before we get to this point. However, NIR will start handling the derefs itself and won't want the lowering. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	3c11fc7654	nir/lower_io: Add a new buffer_array_length intrinsic and lowering Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	c8d42c8cf6	nir: Rename nir_address_format_vk_index_offset to not be vk It's just a 32-bit index and offset. We're going to want to use it in GL as well so stop talking about Vulkan. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	60af3a93e9	nir/deref: Consider COHERENT decorated var derefs as aliasing If we get to two deref_var paths with different variables, we usually know they don't alias. However, if both of the paths are marked coherent, we do have to worry about it. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	8b073832ff	compiler/types: Add helpers to get explicit types for standard layouts We also need to modify the current size/align helpers to not blow up when they encounter an explicitly laid out type. Previously we considered using the size/align helpers mutually exclusive with standard layouts but now we just assert that they match. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	5b2b144566	compiler/types: Add a C wrapper to get full struct field data Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	ef4ca44780	compiler/types: Add a new is_interface C wrapper Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	b315f6f82b	nir/validate: Allow 32-bit boolean load/store intrinsics With UBOs and SSBOs we have boolean types but they're actually 32-bit values. Make the validator a little less strict so that we can do a 32-bit load/store on boolean types. We're about to add a lowering pass called gl_nir_lower_buffers which will lower boolean load/store operations to 32-bit and insert i2b and b2i instructions to convert to/from 1-bit booleans. We want that to be legal. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	5d26f2d3d5	nir/validate: Only require bare types to match for copy_deref If we want to be able to use copy_deref instructions on explicitly laid out types, we have to be a little more flexible about what types we allow. Instead of requiring the types to exactly match, only require the bare types to match. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	2b76de9b5d	nir/algebraic: Add a couple optimizations for iabs and ishr Shader-db results on Kaby Lake: total instructions in shared programs: 15225213 -> 15222365 (-0.02%) instructions in affected programs: 43524 -> 40676 (-6.54%) helped: 203 HURT: 0 Lots of shaders in Shadow Warrior had this pattern along with Deus Ex, Civ, Shadow of Mordor, and several others. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-15 01:02:19 +00:00
Eric Anholt	0803bef006	mesa/st: Fix leaks of TGSI tokens in VP variants. Starting a glxgears and closing it, I was seeing a lot of leaked TGSI for the fixed function VPs. v2: drop unused delete_ir() arg. Fixes: `3b4929ec6e` ("st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-14 16:18:59 -07:00
Eric Anholt	e0806c1ea0	mesa/st: Make sure that prog_to_nir NIR gets freed. GLSL NIR gets freed on relink by _mesa_delete_program(), but for ARB programs we need to free the old NIR when PSN is used to set up new NIR in the same gl_program. Additionally, set the base .nir field so that it will get freed by _mesa_delete_program(). Fixes: `3d7611e9a6` ("st/nir: use NIR for asm programs") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-14 16:18:38 -07:00
Alyssa Rosenzweig	1ea42894c7	panfrost/midgard: Implement fpow We have a native op for this, which was just found in a disassembly -- so instead of lowering, use it! Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:50:24 +00:00
Alyssa Rosenzweig	2eb65c2173	panfrost: Compute viewport state on the fly Previously, we were caching this incorrectly; there's no real reason to given how variable it is (sensitive to changes in viewport, framebuffer dimensions, and scissors) and how cheap it is to recompute. So, just do it on the fly each draw. Fixes glmark-es2 -bshadow and -brefract. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:12 +00:00
Alyssa Rosenzweig	c6a725888f	panfrost: Disable AFBC for depth buffers For inexplicable reasons, the depth buffer is faster if kept as linear, whereas the colour buffers are faster if AFBC. Given both code paths are available, we'll choose the faster one of each (which also helps with testing coverage). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:12 +00:00
Alyssa Rosenzweig	54e45d1d73	panfrost: Allocate extra data for depth buffer It's not clear why the hardware "spills" a little bit, but if we don't do this, we get MMU faults with linear depth buffers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:12 +00:00
Alyssa Rosenzweig	79e474fa46	panfrost: Comment spelling fix Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:12 +00:00
Alyssa Rosenzweig	8c26890ac2	panfrost/mfbd: Respect per-job depth write flag While a depth buffer may be supplied, it only needs to be written to if the depth writemask is set for any draw AND if the depth buffer is not immediately invalidated (as is the case for scanout). This refactors panfrost_job to provide a depth write requirement, which is now implemented for MFBD depth buffers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:11 +00:00
Alyssa Rosenzweig	9bf6024c6b	panfrost/mfbd: Implement linear depth buffers This removes a clunky hack where the depth buffer was enabled during the clear, instead of during depth buffer linking. That said, this does not yet support writeback like AFBC depth buffers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:11 +00:00
Alyssa Rosenzweig	23e0135723	panfrost: Minor comment cleanup (version detection) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:11 +00:00
Alyssa Rosenzweig	c119c282af	panfrost: Remove staging MFBD Same idea as the previous commit, but for the MFBD this time instead of the SFBD. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:11 +00:00
Alyssa Rosenzweig	d47f090738	panfrost: Remove staging SFBD for pan_context The fragment framebuffer descriptor should not be a context entry; rather, it should be constructed only at fragment time to keep analysis tractable. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:11 +00:00
Alyssa Rosenzweig	9dd84db7a5	panfrost: Break out fragment to SFBD/MFBD files This substantially cleans up the corresponding logic at the expense of a bit of code duplication; nevertheless, it's a net win since otherwise incompatible hardware code is mixed confusingly. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-14 22:47:11 +00:00
Alyssa Rosenzweig	4d1a356a57	freedreno: Use shared drm_find_modifier util Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-14 22:43:08 +00:00
Alyssa Rosenzweig	dd12142e34	vc4: Use shared drm_find_modifier util Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-14 22:43:06 +00:00
Alyssa Rosenzweig	cca270bb03	v3d: Use shared drm_find_modifier util Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-14 22:42:51 +00:00
Alyssa Rosenzweig	8a1ab9a166	util: Add a drm_find_modifier helper This function is replicated across vc4/v3d/freedreno and is needed in Panfrost; let's make this shared code. v2: Supply generic util_array_contains_u64 version (Eric Engestrom). Add missing stdbool.h include (Eric Anholt). Mark inline (Christian Gmeiner). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-14 22:41:08 +00:00
Mark Janes	16d108b502	mesa: add logging function for formatted string Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-03-14 12:56:59 -07:00
Mark Janes	b8a1a3214a	mesa: rename logging functions to reflect that they format strings In preparation for the definition of a function to log a formatted string. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-03-14 12:56:45 -07:00
Mark Janes	eb1a869a5d	mesa: properly report the length of truncated log messages _mesa_log_msg must provide the length of the string passed into the KHR_debug api. When the string formatted by _mesa_gl_vdebugf exceeds MAX_DEBUG_MESSAGE_LENGTH, the length is incorrectly set to the number of characters that would have been written if enough space had been available. Fixes: `3025680578` ("mesa: Add support for GL_ARB_debug_output with dynamic ID allocation.") Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-03-14 12:56:19 -07:00
Jason Ekstrand	162286eb75	anv: Only set 3DSTATE_PS::VectorMaskEnable on gen8+ We don't set it on HSW and earlier in i965 and disabling it appears to make derivatives somewhat more reliable. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-14 12:22:20 -05:00
Eric Engestrom	b63fe65bf6	travis: fix osx meson build	2019-03-14 17:06:03 +00:00
Samuel Pitoiset	3a2e93147f	radv: always initialize HTILE when the src layout is UNDEFINED HTILE should always be initialized when transitioning from VK_IMAGE_LAYOUT_UNDEFINED to other image layouts. Otherwise, if an app does a transition from UNDEFINED to GENERAL, the driver doesn't initialize HTILE and it tries to decompress the depth surface. For some reason, this results in VM faults. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-14 17:22:23 +01:00
Tomeu Vizoso	27b0661e30	panfrost: Adapt to uapi changes Two ioctls had wrong DRM_IO* flags. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Rob Herring <robh@kernel.org>	2019-03-14 15:24:27 +01:00
Plamena Manolova	19ab082001	i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9 ARB_fragment_shader_interlock depends on memory fences to ensure fragment ordering and this ordering guarantee is only supported from GEN9 onwards. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109980 Fixes: `939312702e` "i965: Add ARB_fragment_shader_interlock support." Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-14 13:04:12 +00:00
Kenneth Graunke	0c3adaad22	iris: Don't mutate box in transfer map code Not mutating the boxes is arguably cleaner. Split from a patch by Chris Wilson but reworked to use a pointer to the original box rather than making a copy at all.	2019-03-13 23:31:51 -07:00
Tapani Pälli	3b41175c22	i965: remove scaling factors from P010, P012 Patch removes scaling factors introduced in `2a2e69f975` but leaves option to use scaling in place as it could be useful with other upcoming YUV formats. We did this scaling because ffmpeg was shifting channel bits down, however it seems this is not the right place as compositor wants to flip same buffers directly to display as well and therefore bitshifting needs to be done by the client when receiving frame from ffmpeg. Now P0x formats are treated the same, e.g. P010 is same as P016 but with lower 6 bits set to zeros. Fixes: `2a2e69f975` "i965: add P0x formats and propagate required scaling factors" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-14 07:41:44 +02:00
Jason Ekstrand	489bf2de23	anv/pass: Flag the need for a RT flush for resolve attachments Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: mesa-stable@lists.freedesktop.org	2019-03-13 17:58:27 -05:00
Jason Ekstrand	13099d4490	anv: Stop using VK_TRUE/FALSE We've been fairly inconsistent about this so we should really choose whether we're going to use VK_TRUE/FALSE or the C boolean values. The Vulkan #defines are set to 1 and 0 respectively so it's the same value as C gives you when you cast a boolean expression to an integer. Since there are several places where we set a VkBool32 to a C logical expression, let's just embrace C booleans and stop using the VK defines. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-13 17:58:27 -05:00
Gurchetan Singh	d6dc68e7b5	virgl: use uint16_t mask instead of separate booleans This should save some space. Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-13 22:58:22 +00:00
Albert Pal	56717e13a6	Fix link release notes for 19.0.0. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-03-13 22:36:42 +00:00
Rafael Antognolli	2b2b449dd1	iris: Enable auxiliary buffer support again Now that we are properly resolving buffers before giving them to the window system, let's enable aux support again. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-13 14:45:13 -07:00
Rafael Antognolli	1281368d02	iris: Convert RGBX to RGBA always. In i965, we disable the use of RGBX formats, so the higher layers of Mesa choose the equivalent RGBA format, and swizzle the alpha channel to 1.0. However, Gallium won't do that. We need to explicitly convert it to RGBA. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-13 14:45:13 -07:00
Rafael Antognolli	9159a5bbf8	iris: Add resolve on iris_flush_resource. The flush_resource hook is supposedly called when the resource content needs to be made visible to external (okay, that's pretty vague). For instance, it gets called before a surface gets handed to the window system. So we need to resolve it if it's not resolved yet. v2 (Ken): - Check mod_info in iris_flush_resource instead of ISL_AUX_USAGE_NONE - Drop my old broken resolve code from iris_resource_get_handle() now that Rafael's got it hooked up in the right place. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-13 14:45:13 -07:00
Eduardo Lima Mitev	759ceda07e	ir3/lower_io_offsets: Try propagate SSBO's SHR into a previous shift instruction While we lack value range tracking, this patch tries to 'manually' propagate the division by 4 to calculate SSBO element-offset, into a possible previous shift operation (shift left or right); checking that it is safe to do so. This should help in cases such as accessing a field in an array of structs, where the offset is likely defined as base plus a multiplication by a struct or array element size. See dEQP test 'dEQP-GLES31.functional.ssbo.atomic.xor.highp_uint' for an example of a shader that benefits from this. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-13 21:19:44 +01:00
Eduardo Lima Mitev	2e4525883f	ir3/compiler: Enable lower_io_offsets pass and handle new SSBO intrinsics These intrinsics have the offset in dwords already computed in the last source, so the change here is basically using that instead of emitting the ir3_SHR to divide the byte-offset by 4. The improvement in shader stats is significant, of up to ~15% in instruction count in some cases. Tested only on a5xx. shader-db is unfortunately not very useful here because shaders that use SSBO require GLSL versions that are not supported by freedreno yet. For examples, most Khronos CTS tests under 'dEQP-GLES31.functional.ssbo.*' are helped. A random case: dEQP-GLES31.functional.ssbo.layout.2_level_array.packed.row_major_mat3x2 with current master: ; CL prog 14/1: 1252 instructions, 0 half, 48 full ; 8 const, 8 constlen ; 61 (ss), 43 (sy) with the SSBO dword-offset moved to NIR: ; CL prog 14/1: 1053 instructions, 0 half, 45 full ; 7 const, 7 constlen ; 34 (ss), 73 (sy) The SHR previously emitted for every single SSBO instruction disappears in most cases, and the dword-offset ends up embedded in the STGB instruction as immediate in many cases as well. There are also a few of those tests that are currently failing on register allocation, that start to pass as a result of reducing the pressure. At least these, probably more: dEQP-GLES31.functional.ssbo.layout.random.unsized_arrays.24 dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6 dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.17 dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays.14 dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.5 dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.7 No regressions observed with relevant CTS and piglit tests. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-13 21:19:44 +01:00
Eduardo Lima Mitev	9dd0cfafc9	ir3/nir: Add a new pass 'ir3_nir_lower_io_offsets' This NIR->NIR pass implements offset computations that are currently done on the IR3 backend compiler, to give NIR a better chance of optimizing them. For now, it supports lowering the dword-offset computation for SSBO instructions. It will take an SSBO intrinsic and replace it with the new ir3-specific version that adds an extra source. That source will hold the SSA value resulting from inserting a division by 4 (an SHR op) of the original byte-offset source already provided by NIR in one of the intrinsic sources. Note that on a6xx the original byte-offset is not needed, so we could potentially replace that source instead of adding a new one. But to keep things simple and consistent we always add the new source and a6xx will just ignore the original one. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-13 21:19:44 +01:00
Eduardo Lima Mitev	6ff50a488a	nir: Add ir3-specific version of most SSBO intrinsics These are ir3 specific versions of SSBO intrinsics that add an extra source to hold the element offset (dword), which is what the backend instructions need. The original byte-offset source provided by NIR is not replaced because on a4xx and a5xx the backend still needs it. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-13 21:19:44 +01:00
Dylan Baker	03a0801bcb	docs: update calendar, add news item, and link release notes for 19.0.0	2019-03-13 12:36:27 -07:00
Dylan Baker	0cd487f375	docs: Add SHA256 sums for 19.0.0	2019-03-13 12:22:58 -07:00
Dylan Baker	44273b4806	docs: Add release notes for 19.0.0	2019-03-13 12:22:57 -07:00
Kevin Strasser	70b36c0ef9	egl/dri: Avoid out of bounds array access indexConfigAttrib iterates over every index in the dri driver, possibly exceeding __DRI_ATTRIB_MAX. In other words, if the dri driver has newer attributes libEGL will end up reading from uninitialized memory through dri2_to_egl_attribute_map[]. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-13 18:28:53 +00:00
Chris Wilson	97ad0efba0	iris: Use streaming loads to read from tiled surfaces Always use the streaming load (since we know we have Broadwell+, all of our target CPUs support sse41) for reading back from the tiled surface for mapping the resource. This means we hit the fast WC handling paths on Atoms (without LLC), and for big Core (with LLC) using the streaming load is no less efficient as we do not require the tiled buffer to be pulled into the CPU cache. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-13 10:54:16 -07:00
Chris Wilson	797fb6c6ac	iris: Use coherent allocation for PIPE_RESOURCE_STAGING On !llc machines (Atoms), reading from a linear buffer is slow and so copying from one resource into the linear staging buffer is still slow. However, we can tell the GPU to snoop the CPU cache when reading from and writing to the staging buffer eliminating the slow uncached reads. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-13 10:54:16 -07:00
Chris Wilson	01b224047b	iris: Use PIPE_BUFFER_STAGING for the query objects We prefer fast CPU access to read back the query results. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-13 10:54:16 -07:00
Caio Marcelo de Oliveira Filho	65e8761474	intel/nir: Combine store_derefs to improve code from SPIR-V Due to lack of write mask in SPIR-V store, generators may produce multiple stores to the same vector but using different array derefs. Use the combining store pass to clean this up. For example, layout(binding = 3) buffer block { vec4 v; }; void main() { v.x = 11; v.y = 22; } after going to SPIR-V and NIR, ends up with two store_derefs to v[0] and v[1] vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */ vec2 32 ssa_6 = deref_array &(*ssa_4)[0] (ssbo float) /* &((block *)ssa_2)->field0[0] */ intrinsic store_deref (ssa_6, ssa_7) (1, 0) /* wrmask=x */ /* access=0 */ vec1 32 ssa_13 = load_const (0x00000001 /* 0.000000 */) vec2 32 ssa_14 = deref_array &(*ssa_4)[1] (ssbo float) /* &((block *)ssa_2)->field0[1] */ intrinsic store_deref (ssa_14, ssa_15) (1, 0) /* wrmask=x */ /* access=0 */ producing two different sends instructions in skl. The combining pass transforms the snippet above into vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */ vec4 32 ssa_18 = vec4 ssa_7, ssa_15, ssa_16, ssa_17 intrinsic store_deref (ssa_4, ssa_18) (3, 0) /* wrmask=xy */ /* access=0 */ producing a single sends instruction. v2: Move this from spirv_to_nir into the general optimization pass for intel compiler. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-13 08:39:16 -07:00
Caio Marcelo de Oliveira Filho	10dfb0011e	intel/nir: Combine store_derefs after vectorizing IO Shader-db results for skl: total instructions in shared programs: 15232903 -> 15224781 (-0.05%) instructions in affected programs: 61246 -> 53124 (-13.26%) helped: 221 HURT: 0 total cycles in shared programs: 371440470 -> 371398018 (-0.01%) cycles in affected programs: 281363 -> 238911 (-15.09%) helped: 221 HURT: 0 Results for bdw are very similar. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-13 08:39:16 -07:00
Caio Marcelo de Oliveira Filho	822a8865e4	nir: Add a pass to combine store_derefs to same vector v2: (all from Jason) Reuse existing function for the end of the block combinations. Check the SSA values are coming from the right place in tests. Document the case when the store to array_deref is reused. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-13 08:39:16 -07:00
Samuel Pitoiset	cbf022cb31	ac: use the raw tbuffer version for 16-bit SSBO loads vindex is always 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 14:16:14 +01:00
Samuel Pitoiset	045fae0f73	ac: add ac_build_{struct,raw}_tbuffer_load() helpers The struct version sets IDXEN=1, while the raw version sets IDXEN=0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 14:15:05 +01:00
Samuel Pitoiset	a66b186beb	radv: use typed buffer loads for vertex input fetches This drastically reduces the number of SGPRs because the driver now uses descriptors per vertex binding, instead of per vertex attribute format. 29077 shaders in 15096 tests Totals: SGPRS: 1354285 -> 1282109 (-5.33 %) VGPRS: 909896 -> 908800 (-0.12 %) Spilled SGPRs: 24840 -> 24811 (-0.12 %) Code Size: 49221144 -> 48986628 (-0.48 %) bytes Max Waves: 243930 -> 244229 (0.12 %) Totals from affected shaders: SGPRS: 390648 -> 318472 (-18.48 %) VGPRS: 288432 -> 287336 (-0.38 %) Spilled SGPRs: 94 -> 65 (-30.85 %) Code Size: 11548412 -> 11313896 (-2.03 %) bytes Max Waves: 86460 -> 86759 (0.35 %) This gives a really tiny boost. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 13:31:11 +01:00
Samuel Pitoiset	0b9a06a1a0	radv: store more vertex attribute infos as pipeline keys They are required for using typed buffer loads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 13:31:08 +01:00
Samuel Pitoiset	489dac0d21	ac: rework typed buffers loads for LLVM 7 Be more generic, this will be used by an upcoming series. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 13:31:06 +01:00
Tomeu Vizoso	56e04f67f9	panfrost: Set bo->gem_handle when creating a linear BO So we can free it later. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-13 07:35:39 +01:00
Tomeu Vizoso	bfbad30543	panfrost: Set bo->size[0] in the DRM backend So we can unmap it later. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-13 07:35:25 +01:00
Kenneth Graunke	3570d15b6d	intel/fs: Fix opt_peephole_csel to not throw away saturates. We were not copying the saturate bit from the original instruction to the new replacement instruction. This caused major misrendering in DiRT Rally on iris, where comparisons leading to discards failed due to the missing saturate, causing lots of extra garbage pixels to be drawn in text rendering, trees, and so on. This did not show up on i965 because st/nir performs a more aggressive version of nir_opt_peephole_select, yielding more b32csel operations. Fixes: `52c7df1643` i965/fs: Merge CMP and SEL into CSEL on Gen8+ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 20:11:55 -07:00
Jason Ekstrand	bd17bdc56b	glsl/lower_vector_derefs: Don't use a temporary for TCS outputs Tessellation control shader outputs act as if they have memory backing them and you can have multiple writes to different components of the same vector in-flight at the same time. When this happens, the load vec store pattern that gets used by ir_triop_vector_insert doesn't yield the correct results. Instead, just emit a sequence of conditional assignments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org	2019-03-13 02:10:31 +00:00
Jason Ekstrand	20c4578c55	glsl/list: Add a list variant of insert_after Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-13 02:10:31 +00:00
Jason Ekstrand	83fdefc062	nir/loop_unroll: Fix out-of-bounds access handling The previous code was completely broken when it came to constructing the undef values. I'm not sure how it ever worked. For the case of a copy that reads an undefined value, we can just delete the copy because the destination is a valid undefined value. This saves us the effort of trying to construct a value for an arbitrary copy_deref intrinsic. Fixes: `e8a8937a04` "nir: add partial loop unrolling support" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-12 21:06:39 -05:00
Jason Ekstrand	c056609c43	anv: Ignore VkRenderPassInputAttachementAspectCreateInfo We don't care about the information but there's no sense in throwing a debug warning about it. It's harmless but annoying to users. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109984 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-03-12 21:06:39 -05:00
Eric Anholt	486b181fd7	v3d: Fix leak of the renderonly struct on screen destruction. This makes v3d match vc4's destroy path. Fixes: `e113b21cb7` ("v3d: Add renderonly support.")	2019-03-12 16:15:40 -07:00
Eric Anholt	0c874c18cd	v3d: Fix leak of the mem_ctx after the DAG refactor. Noticed while trying to get a CTS run again. Fixes: `33886474d6` ("v3d: Use the DAG datastructure for QPU instruction scheduling.")	2019-03-12 16:15:40 -07:00
Grigori Goronzy	acfd88204e	glx: add support for GLX_ARB_create_context_no_error (v3) v2: Only reject no-error contexts for too-old GL if we're actually trying to create a no-error context (Adam Jackson) v3: Fix share contexts (Adam Jackson) Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-12 19:12:21 -04:00
Samuel Pitoiset	ae77f12368	radv: set the maximum number of IBs per submit to 192 This fixes random SteamVR corruption, see https://github.com/ValveSoftware/SteamVR-for-Linux/issues/181 Fixes: `4d30f2c6f4` ("radv/winsys: remove the max IBs per submit limit for the fallback path") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-12 22:15:45 +01:00
Danylo Piliaiev	9c80be956f	anv: Fix destroying descriptor sets when pool gets reset pool->next and pool->free_list were reset before their usage in anv_descriptor_pool_free_set Fixes: 775aabdd "anv: destroy descriptor sets when pool gets reset" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-12 17:09:37 +00:00
Eric Anholt	ccce940947	v3d: Disable PIPE_CAP_BLIT_BASED_TEXTURE_TRANSFER. This reduces the runtime of dEQP-GLES3.functional.shaders.precision.* from 11.5s to 3.3s. This brings CTS runs down to 4 hours on one of my target devices.	2019-03-12 09:04:25 -07:00
Jason Ekstrand	6d5d89d25a	intel/nir: Vectorize all IO The IO scalarization pass that we run to help with linking ends up turning some shader I/O such as that for tessellation and geometry shaders into many scalar URB operations rather than one vector one. To alleviate this, we now vectorize the I/O once again. This fixes a 10% performance regression in the GfxBench tessellation test that was caused by scalarizing. Shader-db results on Kaby Lake: total instructions in shared programs: 15224023 -> 15220871 (-0.02%) instructions in affected programs: 342009 -> 338857 (-0.92%) helped: 1236 HURT: 443 total spills in shared programs: 23471 -> 23465 (-0.03%) spills in affected programs: 6 -> 0 helped: 1 HURT: 0 total fills in shared programs: 31770 -> 31766 (-0.01%) fills in affected programs: 4 -> 0 helped: 1 HURT: 0 Cycles was just a lot of churn due to moves being in different places. Most of the pure churn in instructions was +/- one or two instructions in fragment shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107510 Fixes: `4434591bf5` "intel/nir: Call nir_lower_io_to_scalar_early" Fixes: `8d8222461f` "intel/nir: Enable nir_opt_find_array_copies" Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-03-12 15:34:06 +00:00
Jason Ekstrand	5ef2b8f1f2	nir: Add a pass for lowering IO back to vector when possible This pass tries to turn scalar and array-of-scalar IO variables into vector IO variables whenever possible. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org>	2019-03-12 15:34:06 +00:00
Rhys Perry	0f025bbccc	ac/nir: fix 16-bit ssbo stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-12 15:51:52 +01:00
pal1000	7f89fd17ed	scons: Compatibility with Scons development version string This ensures Mesa3D build doesn't fail in this case as encountered when bisecting Scons source code while regression testing https://bugs.freedesktop.org/show_bug.cgi?id=109443 and when testing 3.0.5.a.2 Technical details: Scons version string has consistently been in this format: MajorVersion.MinorVersion.Patch[.alpha/beta.yyyymmdd] so these formulas should strip alpha/beta flags and return Scons version: - as string - `'.'.join(SCons.__version__.split('.')[:3])` - as tuple of integers - `tuple(map(int, SCons.__version__.split('.')[:3]))` - v2: Fixed Scons version retrieval formulas as string and tuple of integers. - v3: Fixed Scons version string format description. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-03-12 14:22:34 +00:00
Tapani Pälli	bef354321b	anv: revert "anv: release memory allocated by glsl types during spirv_to_nir" This reverts commit `47fc359822`. Reason is that patch did not take into account the situation where we might have both OpenGL and Vulkan using glsl_types at the same time. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-12 14:12:36 +02:00
Connor Abbott	1bbe58c214	radeonsi/nir: Use nir stripping pass This reduces compilation time for my shader-db collection from around 40 seconds to 30, vs. 19 seconds for TGSI. There are still some shaders that TGSI caches but NIR doesn't, partly because of more aggressive cross-stage optimizations with NIR. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-12 10:49:48 +01:00
Connor Abbott	5b2ec9c81e	nir: Add a stripping pass for improved cacheability Oftentimes various nir shaders after lowering will be the same, or almost the same. For example, this can happen when the same shader is linked with different shaders to form different pipelines and cross-stage optimizations don't kick in to change it. We want to avoid running the backend twice on these shaders. We were already doing this with radeonsi, but we were storing a few extra pieces of information that made this much less effective compared to TGSI. The worst offender by far was the program name, which caused most of the cache misses. This pass strips out these pieces of information, controlled by the NIR_STRIP debug env variable. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-12 10:49:48 +01:00
Samuel Pitoiset	6403171843	radv: fix pointSizeRange limits The values should match the ones that are emitted. This fixes new CTS dEQP-VK.rasterization.primitive_size.points.*. Fixes: `f4e499ec79` ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-12 09:00:32 +01:00
Sagar Ghuge	bbef6c2d5f	iris: Flag fewer dirty bits in BLORP v2: 1) Skip flagging IRIS_DIRTY_DEPTH_BUFFER if BLORP_BATCH_NO_EMIT_DEPTH_STENCIL is set (Kenneth Graunke) 2) Add missing flags (Kenneth Graunke) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-11 22:46:39 -07:00
Timothy Arceri	cb2898f478	st/glsl_to_nir: fix incorrect array access This fixes a segfault when we try to access the array using a -1 when the array wasn't allocated in the first place. Before `7536af670b` we would just access a pre-allocated array that was also load/stored to/from the shader cache. But now the cache will no longer allocate these arrays if they are empty. The change resulted in tests such as the following segfaulting when run with a warm shader cache. tests/spec/arb_arrays_of_arrays/execution/sampler/fs-struct-const-index.shader_test	2019-03-12 14:47:21 +11:00
Brian Paul	02c2863df5	nir: silence a couple new compiler warnings [33/630] Compiling C object 'src/compiler/nir/nir@sta/nir_loop_analyze.c.o'. ../src/compiler/nir/nir_loop_analyze.c: In function ‘try_find_trip_count_vars_in_iand’: ../src/compiler/nir/nir_loop_analyze.c:846:29: warning: suggest parentheses around ‘&&’ within ‘\|\|’ [-Wparentheses] if (ind == NULL \|\| ind && (ind)->type != basic_induction \|\| ^ [85/630] Compiling C object 'src/compiler/nir/nir@sta/nir_opt_loop_unroll.c.o'. ../src/compiler/nir/nir_opt_loop_unroll.c: In function ‘complex_unroll_single_terminator’: ../src/compiler/nir/nir_opt_loop_unroll.c:494:17: warning: unused variable ‘unroll_loc’ [-Wunused-variable] nir_cf_node unroll_loc = ^ Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-12 14:34:51 +11:00
Alyssa Rosenzweig	587ad37e72	panfrost: Identify fragment_extra flags The fragment_extra structure contains additional fields extending the MRT framebuffer descriptor, snuck in between the main framebuffer descriptor and the render targets. Its fields include those related to transaction elimination and depth/stencil buffers. This patch identifies the flags field (previously just "unk" with some magic values) as well as identifying some (but not all) flags set by the driver. The process of identifying flags brought a bug to light where transaction elimination (checksumming) could not be enabled unless AFBC was in-use. This issue is now resolved. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-12 02:37:42 +00:00
Alyssa Rosenzweig	e57ea53acf	panfrost: Document "depth-buffer writeback" bit This bit, if set, causes the depth buffer to be copied from GPU tile memory to the provided depth buffer in main memory. If not set, the GPU will not access the main memory (saving considerable memory bandwidth if depth results are not actually used). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-12 02:37:42 +00:00
Alyssa Rosenzweig	2df4537f91	panfrost: Support linear depth textures This combination has not yet been seen "in the wild" in traces, but to support linear depth FBOs, ~bruteforce reveals this bit pattern is necessary. It's not yet clear why the meanings of 0x1 and 0x2 are essentially flipped (tiled vs linear for colour, linear vs some sort of tiled for depth). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-12 02:37:41 +00:00
Alyssa Rosenzweig	9f25a4e65c	panfrost: Allocate dedicated slab for linear BOs Previously, linear BOs shared memory with each other to minimize kernel round-trips / latency, as well as to work around a bug in the free_slab function. These concerns are invalid now, but continuing to use the slab allocator for BOs resulted in memory allocation errors. This issue was aggravated, though not introduced (so not a real regression) in the previous commit. v2 (unreviewed): Fix bug in v1 preventing munmaps from working Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-12 02:37:41 +00:00
Alyssa Rosenzweig	f9dc1ebc0d	panfrost: Determine framebuffer format bits late Again, these formats are only properly known at the time of fragment job emit. Rather than hardcoding the format, at least for MFBD we begin to construct the format bits on-demand. This cleans up the code, futureproofs for ES3 framebuffer formats, and should fix bugs regarding FBO colour swizzles. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.visozo@collabora.com>	2019-03-12 02:37:41 +00:00
Alyssa Rosenzweig	7ba18cdfa9	panfrost: Delay color buffer setup In an effort to clean up framebuffer management code, we delay colour buffer setup until the FRAGMENT job is actually emitted, allowing the AFBC and linear codepaths to be unified. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.visozo@collabora.com>	2019-03-12 02:37:41 +00:00
Alyssa Rosenzweig	536bcaa68f	panfrost: Combine has_afbc/tiled in layout enum AFBC, tiled, and linear BO layouts are mutually exclusive; they should be coupled via a single enum rather than ad hoc checks of booleans. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.visozo@collabora.com>	2019-03-12 02:37:41 +00:00
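The point of the layout-enum commit above can be illustrated with a minimal sketch. All names here (`bo_layout`, `legacy_flags`, `layout_from_flags`) are hypothetical and are not the driver's actual definitions. Two independent booleans allow the meaningless state "AFBC and tiled", whereas a single enum can only hold one of the three mutually exclusive layouts.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the driver's definitions. */
enum bo_layout {
    BO_LAYOUT_LINEAR,
    BO_LAYOUT_TILED,
    BO_LAYOUT_AFBC,
};

/* Legacy representation: two booleans admit the invalid combination
 * has_afbc && tiled, which every caller would have to check for. */
struct legacy_flags {
    bool has_afbc;
    bool tiled;
};

/* Map the legacy flags onto the enum. Giving AFBC precedence is an
 * arbitrary choice made for this sketch. */
enum bo_layout layout_from_flags(struct legacy_flags f)
{
    if (f.has_afbc)
        return BO_LAYOUT_AFBC;
    return f.tiled ? BO_LAYOUT_TILED : BO_LAYOUT_LINEAR;
}

/* A switch over the enum covers each layout exactly once. */
const char *layout_name(enum bo_layout layout)
{
    switch (layout) {
    case BO_LAYOUT_LINEAR: return "linear";
    case BO_LAYOUT_TILED:  return "tiled";
    case BO_LAYOUT_AFBC:   return "afbc";
    }
    return "unknown";
}
```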
Alyssa Rosenzweig	d93c5c3148	panfrost: Cleanup needless if in create_bo I'm not sure why we were checking for these additional criteria (likely inherited from some other driver); remove the needless checks to clean up the code and perhaps fix some bugs down the line. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.visozo@collabora.com>	2019-03-12 02:37:41 +00:00
Kenneth Graunke	1467deb543	i965: Reimplement all the PIPE_CONTROL rules. This implements virtually all documented PIPE_CONTROL restrictions in a centralized helper. You now simply ask for the operations you want, and the pipe control "brain" will figure out exactly what pipe controls to emit to make that happen without tanking your system. The hope is that this will fix some intermittent flushing issues as well as GPU hangs. However, it also has a high risk of causing GPU hangs and other regressions, as this is a particularly sensitive area and poking the bear isn't always advisable. Mark Janes noted that this patch helps with some GPU hangs on Icelake. This does re-enable the VF Invalidate => Write Immediate workaround on Gen8, which had been disabled (bug 103787) due to GPU hangs. The old code did this workaround after another which would have added CS stall bits, so it missed a workaround. The new code orders them properly and appears to work. v4: Don't pass "bo, offset, imm" to a recursive CS stall (caught by Topi Pohjolainen), drop Gen10 workarounds that are unnecessary for production hardware. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2019-03-11 19:32:40 -07:00
Kenneth Graunke	c6af96d1bc	i965: Use genxml for emitting PIPE_CONTROL. While this does add a bunch of boilerplate, it also protects us against the hardware moving bits, or changing their meaning. For something as finicky as PIPE_CONTROL, the extra safety seems worth it. We turn PIPE_CONTROL_* into a bitfield of arbitrary flags, and then pack them appropriately. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2019-03-11 19:32:40 -07:00
Kenneth Graunke	2c6f712408	i965: Rename ISP_DIS to INDIRECT_STATE_POINTERS_DISABLE. Clearer name. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2019-03-11 19:32:40 -07:00
Kenneth Graunke	aa139f0980	i965: Move some genX infrastructure to genX_boilerplate.h. This will let us make multiple genX_*.c files, without copy and pasting all this boilerplate. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2019-03-11 19:32:40 -07:00
Brian Paul	ecb708fada	gallium/winsys/kms: fix incomplete type compilation failure Fixes: ../src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c: In function ‘kms_sw_displaytarget_from_handle’: ../src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c:402:60: error: dereferencing pointer to incomplete type ‘const struct pipe_resource’ templ->format, ^ Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-11 20:08:16 -06:00
Brian Paul	04544d852c	drisw: fix incomplete type compilation failure Fixes: ../src/gallium/winsys/sw/dri/dri_sw_winsys.c: In function ‘dri_sw_displaytarget_display’: ../src/gallium/winsys/sw/dri/dri_sw_winsys.c:255:39: error: dereferencing pointer to incomplete type ‘struct pipe_box’ offset = dri_sw_dt->stride * box->y; ^ Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-11 20:08:16 -06:00
Brian Paul	45c6da5a48	docs: try to improve the Meson documentation (v2) Add new Introduction and Advanced Usage sections. Spell out a few more details, like "ninja install". Improve the layout around example commands. Fix grammatical errors and tighten up the text. Explain the --prefix option. v2: Remove language about 'ninja clean' and move link to Meson information about separate build directories earlier in the page. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-11 20:08:16 -06:00
Brian Paul	187a527ed7	st/mesa: minor refactoring of texture/sampler delete code Rename st_texture_free_sampler_views() to st_delete_texture_sampler_views() to align with st_DeleteTextureObject(), its only caller. Move the call to st_texture_release_all_sampler_views() from st_DeleteTextureObject() to st_delete_texture_sampler_views() so all the sampler view clean-up code is in one place. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-11 20:08:16 -06:00
Brian Paul	70a2ede112	st/mesa: rename st_texture_release_sampler_view() To st_texture_release_context_sampler_view() to be more clear that it's context-specific. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-11 20:08:16 -06:00
Brian Paul	41adb3d6df	st/mesa: add/improve sampler view comments Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-11 20:08:16 -06:00
Brian Paul	c7d2504625	st/mesa: move around some code in st_context.c st_init_driver_functions() is only called in st_context.c so there's no need for the prototype in st_context.h To avoid a forward declaration of st_init_driver_functions() in st_context.c, we need to move around several other functions. No functional change. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-11 20:08:16 -06:00
Brian Paul	b29d827f09	st/mesa: move utility functions, macros into new st_util.h file To de-clutter st_context.h. Clean up remaining function prototypes in st_context.h. The st_vp_uses_current_values() helper is only used in st_context.c so move it there. The st_get_active_states() function is only used in st_context.c so remove its prototype in st_context.h Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-11 20:08:16 -06:00
Juan A. Suarez Romero	775aabdd01	anv: destroy descriptor sets when pool gets reset As stated in Vulkan spec: "Resetting a descriptor pool recycles all of the resources from all of the descriptor sets allocated from the descriptor pool back to the descriptor pool, and the descriptor sets are implicitly freed." This fixes dEQP-VK.api.descriptor_pool.* Fixes: `14f6275c92` "anv/descriptor_set: add reference counting for..." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-03-11 20:40:31 -05:00
Timothy Arceri	3235a942c1	nir: find induction/limit vars in iand instructions This will be used to help find the trip count of loops that look like the following: while (a < x && i < 8) { ... i++; } Where the NIR will end up looking something like this: vec1 32 ssa_1 = load_const (0x00000004 /* 0.000000 */) loop { ... vec1 1 ssa_12 = ilt ssa_225, ssa_11 vec1 1 ssa_17 = ilt ssa_226, ssa_1 vec1 1 ssa_18 = iand ssa_12, ssa_17 vec1 1 ssa_19 = inot ssa_18 if ssa_19 { ... break } else { ... } } On RADV this unrolls a bunch of loops in F1-2017 shaders. Totals from affected shaders: SGPRS: 4112 -> 4136 (0.58 %) VGPRS: 4132 -> 4052 (-1.94 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 515444 -> 587720 (14.02 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 194 -> 196 (1.03 %) Wait states: 0 -> 0 (0.00 %) It also unrolls a couple of loops in shader-db on radeonsi. Totals from affected shaders: SGPRS: 128 -> 128 (0.00 %) VGPRS: 64 -> 64 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 6880 -> 9504 (38.14 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 16 -> 16 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	67c3478482	nir: pass nir_op to calculate_iterations() Rather than getting this from the alu instruction this allows us some flexibility. In the following pass we instead pass the inverse op. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	11e8f8a166	nir: add get_induction_and_limit_vars() helper to loop analysis This helps make find_trip_count() a little easier to follow but will also be used by a following patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	f219f6114d	nir: add helper to return inversion op of a comparison This will be used to help find the trip count of loops that look like the following: while (a < x && i < 8) { ... i++; } Where the NIR will end up looking something like this: vec1 32 ssa_1 = load_const (0x00000004 /* 0.000000 */) loop { ... vec1 1 ssa_12 = ilt ssa_225, ssa_11 vec1 1 ssa_17 = ilt ssa_226, ssa_1 vec1 1 ssa_18 = iand ssa_12, ssa_17 vec1 1 ssa_19 = inot ssa_18 if ssa_19 { ... break } else { ... } } So in order to find the trip count we need to find the inverse of ilt. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
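Why the inverse comparison is needed can be shown with a small sketch. The loop above breaks when `inot (a < x && i < 8)` holds, which by De Morgan's law is `a >= x || i >= 8`, so the exit condition on `i` is the inverse of `ilt`. The enum and function names below are hypothetical stand-ins for this illustration, not NIR's `nir_op` values or the helper the commit adds.

```c
#include <stdbool.h>

/* Hypothetical opcodes for this sketch; not NIR's nir_op values. */
enum cmp_op { CMP_LT, CMP_GE, CMP_EQ, CMP_NE };

/* Return the integer comparison that is true exactly when `op` is false. */
enum cmp_op invert_cmp(enum cmp_op op)
{
    switch (op) {
    case CMP_LT: return CMP_GE;
    case CMP_GE: return CMP_LT;
    case CMP_EQ: return CMP_NE;
    case CMP_NE: return CMP_EQ;
    }
    return op; /* not reached for valid input */
}

/* Evaluate a comparison on two integers, used to check the inversion. */
bool eval_cmp(enum cmp_op op, int a, int b)
{
    switch (op) {
    case CMP_LT: return a < b;
    case CMP_GE: return a >= b;
    case CMP_EQ: return a == b;
    case CMP_NE: return a != b;
    }
    return false;
}
```

Note that this simple pairing only holds for integer comparisons; floating-point comparisons involving NaN do not invert this way.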
Timothy Arceri	090feaacdc	nir: simplify the loop analysis trip count code a little Here we create a helper is_supported_terminator_condition() and use that rather than embedding all the trip count code inside a switch. The new helper will also be used in a following patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	7571de8eaa	nir: unroll some loops with a variable limit Some loops can have a single terminator but the exact trip count is still unknown. For example: for (int i = 0; i < imin(x, 4); i++) ... Shader-db results radeonsi (all affected are from Tropico 5): Totals from affected shaders: SGPRS: 144 -> 152 (5.56 %) VGPRS: 124 -> 108 (-12.90 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 5180 -> 6640 (28.19 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 17 -> 21 (23.53 %) Wait states: 0 -> 0 (0.00 %) Shader-db results i965 (SKL): total loops in shared programs: 3808 -> 3802 (-0.16%) loops in affected programs: 6 -> 0 helped: 6 HURT: 0 vkpipeline-db results RADV (Unrolls some Skyrim VR shaders): Totals from affected shaders: SGPRS: 304 -> 304 (0.00 %) VGPRS: 296 -> 292 (-1.35 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 15756 -> 25884 (64.28 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 29 -> 29 (0.00 %) Wait states: 0 -> 0 (0.00 %) v2: fix bug where last iteration would get optimised away by mistake. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	68ce0ec222	nir: calculate trip count for more loops This adds support to loop analysis for loops where the induction variable is compared to the result of min(variable, constant). For example: for (int i = 0; i < imin(x, 4); i++) ... We add a new bool to the loop terminator struct in order to differentiate terminators with this exit condition. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
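As a rough illustration of the `min(variable, constant)` case: the exact trip count depends on `x`, but the constant operand still gives an upper bound that an unroller can use. The sketch below assumes a loop of the form `for (i = init; i < imin(x, limit_const); i += step)` with a positive step; the function names are hypothetical and this is not the analysis code from the commit.

```c
#include <assert.h>

/* Upper bound on the trip count of
 *   for (i = init; i < imin(x, limit_const); i += step)
 * derived from the constant operand alone. Assumes step > 0. */
unsigned max_trip_count(int init, int limit_const, int step)
{
    assert(step > 0);
    if (init >= limit_const)
        return 0;
    return (unsigned)((limit_const - init + step - 1) / step);
}

static int imin(int a, int b)
{
    return a < b ? a : b;
}

/* Run the loop for a concrete x and count the iterations, so the bound
 * can be compared with the real behaviour. */
unsigned actual_trip_count(int init, int x, int limit_const, int step)
{
    unsigned trips = 0;
    for (int i = init; i < imin(x, limit_const); i += step)
        trips++;
    return trips;
}
```

For the example in the commit message, `max_trip_count(0, 4, 1)` is 4, while the loop actually runs `imin(x, 4)` times (or not at all when `x <= 0`).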
Timothy Arceri	e8a8937a04	nir: add partial loop unrolling support This adds partial loop unrolling support and makes use of a guessed trip count based on array access. The code is written so that we could use partial unrolling more generally, but for now it's only used when we have guessed the trip count. We use partial unrolling for this guessed trip count because it's possible any out of bounds array access doesn't otherwise affect the shader, e.g. the stores/loads to/from the array are unused. So we insert a copy of the loop in the innermost continue branch of the unrolled loop. Later on it's possible for nir_opt_dead_cf() to then remove the loop in some cases. A Renderdoc capture from the Rise of the Tomb Raider benchmark, reports the following change in an affected compute shader: GPU duration: 350 -> 325 microseconds shader-db results radeonsi VEGA (NIR backend): SGPRS: 1008 -> 816 (-19.05 %) VGPRS: 684 -> 432 (-36.84 %) Spilled SGPRs: 539 -> 0 (-100.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 39708 -> 45812 (15.37 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 105 -> 144 (37.14 %) Wait states: 0 -> 0 (0.00 %) shader-db results i965 SKL: total instructions in shared programs: 13098265 -> 13103359 (0.04%) instructions in affected programs: 5126 -> 10220 (99.38%) helped: 0 HURT: 21 total cycles in shared programs: 332039949 -> 331985622 (-0.02%) cycles in affected programs: 289252 -> 234925 (-18.78%) helped: 12 HURT: 9 vkpipeline-db results VEGA: Totals from affected shaders: SGPRS: 184 -> 184 (0.00 %) VGPRS: 448 -> 448 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 26076 -> 24428 (-6.32 %) bytes LDS: 6 -> 6 (0.00 %) blocks Max Waves: 5 -> 5 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
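The shape of a partially unrolled loop can be sketched at source level. This is a hand-written C analogy, assuming a guessed trip count of 4, and is not the NIR transformation itself. Each unrolled copy keeps the original exit test, and a copy of the original loop sits in the innermost "continue" path, so the result stays correct even when the guess is too low.

```c
/* Reference: the original rolled loop. */
int sum_rolled(const int *data, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* Partially unrolled with a guessed trip count of 4 (an assumption for
 * this sketch, e.g. inferred from the size of an indexed array). */
int sum_partially_unrolled(const int *data, int n)
{
    int sum = 0;
    int i = 0;

    do {
        /* Four unrolled copies, each with the original terminator. */
        if (!(i < n)) break;
        sum += data[i]; i++;
        if (!(i < n)) break;
        sum += data[i]; i++;
        if (!(i < n)) break;
        sum += data[i]; i++;
        if (!(i < n)) break;
        sum += data[i]; i++;

        /* Innermost continue path: a copy of the original loop handles
         * any iterations beyond the guess. If the guess was right this
         * loop never runs, and a dead-code pass may remove it. */
        for (; i < n; i++)
            sum += data[i];
    } while (0);

    return sum;
}
```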
Timothy Arceri	fba5d275db	nir: add new partially_unrolled bool to nir_loop In order to stop continuously partially unrolling the same loop we add the bool partially_unrolled to nir_loop, we add it here rather than in nir_loop_info because nir_loop_info is only set via loop analysis and is intended to be cleared before each analysis. Also nir_loop_info is never cloned. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	03a452b7d0	nir: add guess trip count support to loop analysis This detects an induction variable used as an array index to guess the trip count of the loop. This enables us to do a partial unroll of the loop, which can eventually result in the loop being eliminated. v2: check if the induction var is used to index more than a single array and if so get the size of the smallest array. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Tomeu Vizoso	97f2d04d5e	panfrost: Add support for PAN_MESA_DEBUG Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-12 00:30:27 +00:00
Tomeu Vizoso	f0b1bbebdd	panfrost/midgard: Add support for MIDGARD_MESA_DEBUG Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-12 00:30:27 +00:00
Xavier Bouchoux	c5236fc6e2	nir/spirv: Fix assert when unsampled OpTypeImage has unknown 'Depth' 'dxc' hlsl-to-spirv compiler appears to emit 2 (Unknown) in the depth field, when the image is not sampled and the value is not needed. Previously, shaders failed with: SPIR-V parsing FAILED: In file ../src/compiler/spirv/spirv_to_nir.c:1412 !is_shadow 632 bytes into the SPIR-V binary Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 23:28:39 +01:00
Kenneth Graunke	d75f84cb65	iris: Fix write enable in pinning of depth/stencil resources We may bind new Z/S buffers (which come via the framebuffer CSO, triggering IRIS_DIRTY_DEPTH_BUFFER), but with writes disabled. The next draw may enable Z or S writes (which come via the ZSA CSO, triggering IRIS_DIRTY_WM_DEPTH_STENCIL), which requires us to update our pin to have the write flag. So, update pinning if either dirty flag changes. To clarify, pass cso_zsa to the pinning function rather than pulling the random values out of ice->state, which unfortunately have to exist for the resolve code since iris_depth_stencil_alpha_state only exists in iris_state.c.	2019-03-11 15:04:08 -07:00
Kenneth Graunke	863e810a19	iris: Refactor depth/stencil buffer pinning into a helper. This avoids the code duplication that caused me to put things in the wrong place in the previous commit. One used to have extra flushes, but we moved those out so now these are identical and can be easily shared.	2019-03-11 15:04:08 -07:00
Kenneth Graunke	9302414f8b	iris: Move depth/stencil flushes so they actually do something Commit `d6dd57d43c` (iris: Add missing depth cache flushes) added the depth/stencil flushes to the wrong place. I meant to add them to the iris_upload_dirty_render_state code that emits the packets, but I accidentally added them to the nearly identical looking code in iris_restore_render_saved_bos. This meant we missed the actual flushing at draw time, but instead did pointless flushing on the first draw in a batch where things are already flushed anyway. This commit moves them to iris_resolve.c, next to the depth prepares, similar to what we do for color buffers. i965 does them elsewhere, but I'm not sure why - this seems like the most consistent place.	2019-03-11 15:04:08 -07:00
Christian Gmeiner	076a7095bb	st/dri: allow direct UYVY import Push this format to the pipe driver unchanged. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-11 22:19:11 +01:00
Kenneth Graunke	04ff2e3fbb	iris: Fix TES gl_PatchVerticesIn handling. 1. If we switch the TCS for one with a different number of output vertices, then the TES's gl_PatchVerticesIn value will change. We need to re-upload in this case. For now, re-emit constants whenever the TCS/TES are swapped out. 2. If there is no TCS, then we can't grab gl_PatchVerticesIn from the TCS info. Since it's a passthrough, we can just use the primitive's patch count (like the TCS gl_PatchVerticesIn does). Fixes KHR-GL45.tessellation_shader.single.max_patch_vertices and KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-11 14:07:16 -07:00
Kenneth Graunke	2f51cb5e67	iris: Rework default tessellation level uploads Now that we've added a system value uploading mechanism, we may as well reuse the same system for default tessellation levels. This simplifies the state upload code a bit. Also fixes: KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_tessLevel Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-11 14:07:12 -07:00
Timur Kristóf	fd5075e059	iris: Face should be a system value. This patch adds PIPE_CAP_TGSI_FS_FACE_IS_INTEGER_SYSVAL which despite its name is not a TGSI-specific capability, just lets the state tracker know that it should generate a system value for FACE. This is needed if we want to run tgsi_to_nir on iris. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-11 14:02:40 -07:00
Eric Anholt	3a9e2d6085	vc4: Switch the post-RA scheduler over to the DAG datastructure. Just a small code reduction from shared infrastructure.	2019-03-11 13:14:37 -07:00
Eric Anholt	33886474d6	v3d: Use the DAG datastructure for QPU instruction scheduling. Just a small code reduction from shared infrastructure.	2019-03-11 13:14:32 -07:00
Eric Anholt	d6d83b34ee	vc4: Reuse list_for_each_entry_rev().	2019-03-11 13:14:32 -07:00
Eric Anholt	7c01ddbf7f	v3d: Reuse list_for_each_entry_rev().	2019-03-11 13:14:32 -07:00
Eric Anholt	7a727c1a12	vc4: Switch over to using the DAG datastructure for QIR scheduling. Just a small code reduction from shared infrastructure.	2019-03-11 13:14:18 -07:00
Eric Anholt	0533d2d95c	util: Add a DAG datastructure. I keep writing this for various schedulers. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-11 13:13:52 -07:00
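For readers unfamiliar with how a scheduler uses a DAG, here is a minimal sketch of the general idea: each node tracks how many of its parents are still unscheduled, and a node becomes ready once that count reaches zero. The struct layout and function names are hypothetical and are not the API added to util/ by this commit.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EDGES 8

/* Hypothetical layout for illustration only. */
struct dag_node {
    struct dag_node *children[MAX_EDGES];
    unsigned num_children;
    unsigned parent_count;   /* parents not yet scheduled */
    int scheduled;
};

/* Record that `child` depends on `parent`. */
void dag_add_edge(struct dag_node *parent, struct dag_node *child)
{
    assert(parent->num_children < MAX_EDGES);
    parent->children[parent->num_children++] = child;
    child->parent_count++;
}

/* Repeatedly pick nodes with no unscheduled parents, append them to
 * `order`, and release their children. Returns the number of nodes
 * scheduled, which is less than `count` only if there is a cycle. */
size_t dag_schedule(struct dag_node *nodes, size_t count,
                    struct dag_node **order)
{
    size_t done = 0;
    int progress = 1;

    while (done < count && progress) {
        progress = 0;
        for (size_t i = 0; i < count; i++) {
            struct dag_node *n = &nodes[i];
            if (n->scheduled || n->parent_count != 0)
                continue;
            n->scheduled = 1;
            order[done++] = n;
            for (unsigned c = 0; c < n->num_children; c++)
                n->children[c]->parent_count--;
            progress = 1;
        }
    }
    return done;
}
```

A real instruction scheduler would choose among the ready nodes with a priority heuristic rather than taking them in array order.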
Kristian H. Kristensen	5f0a922c27	freedreno/a6xx: Remove extra parens There's a warning about this now. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-11 11:37:53 -07:00
Kristian H. Kristensen	08c452bef7	freedreno: Use c_vis_args and no_override_init_args Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-11 11:37:53 -07:00
Chia-I Wu	24af64baa5	turnip: preliminary support for Wayland WSI	2019-03-11 10:02:13 -07:00
Chia-I Wu	ae82b5df88	turnip: preliminary support for tu_GetImageSubresourceLayout	2019-03-11 10:02:13 -07:00
Chad Versace	6cb5fd0d71	turnip: Use Vulkan 1.1 names instead of KHR That is, drop KHR from all tokens that were promoted to Vulkan 1.1. The consistency makes ctags more useful (it now jumps directly to the real definitions in vulkan_core.h instead of the typedefs); and it makes the code slightly less verbose.	2019-03-11 10:02:13 -07:00
Chia-I Wu	4f863dc0f7	turnip: guard -Dvulkan-driver=freedreno Require -DI-love-half-baked-turnips=true as well to enable freedreno vulkan driver.	2019-03-11 10:02:13 -07:00
Chia-I Wu	949ce2745d	turnip: preliminary support for tu_CmdDraw	2019-03-11 10:02:13 -07:00
Chia-I Wu	f9b34622cd	turnip: preliminary support for draw state binding This adds support for tu_CmdBindPipeline, tu_CmdBindVertexBuffers, etc.	2019-03-11 10:02:13 -07:00
Chia-I Wu	54b7a57c22	turnip: add draw_cs to tu_cmd_buffer It will hold draw commands.	2019-03-11 10:02:13 -07:00
Chia-I Wu	1cdbab016e	turnip: parse VkPipelineVertexInputStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	d17096b9b1	turnip: parse VkPipelineShaderStageCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	a7d842c97c	turnip: compile VkPipelineShaderStageCreateInfo Compile all shaders and upload the binaries to a BO.	2019-03-11 10:02:13 -07:00
Chia-I Wu	970a8fec96	turnip: preliminary support for shader modules Save SPIR-V in tu_shader_module. Translation to NIR happens in tu_shader_create, and compilation to binary code happens in tu_shader_compile. Both will be called during pipeline creation.	2019-03-11 10:02:13 -07:00
Chia-I Wu	9e0d878787	turnip: parse VkPipeline{Multisample,ColorBlend}StateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	bec0abf294	turnip: parse VkPipelineDepthStencilStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	9496b377ff	turnip: parse VkPipelineRasterizationStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	b4884761e8	turnip: parse VkPipelineViewportStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	1bea6a91cb	turnip: parse VkPipelineInputAssemblyStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	c584c2e86c	turnip: parse VkPipelineDynamicStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	df48cb7b3e	turnip: create a less dummy pipeline Still dummy, but at least it is created from tu_pipeline_builder.	2019-03-11 10:02:13 -07:00
Chia-I Wu	57327626dc	turnip: simplify tu_cs sub-streams usage Let tu_cs_begin_sub_stream imply tu_cs_reserve_space, and tu_cs_end_sub_stream imply tu_cs_sanity_check. Callers are no longer required to call them (but can still do so if they choose).	2019-03-11 10:02:13 -07:00
Chia-I Wu	59419bb691	turnip: fix tu_cs sub-streams Update cs->start in tu_cs_end_sub_stream. Otherwise, the entry would include commands from all prior sub-streams.	2019-03-11 10:02:13 -07:00
Chia-I Wu	c0567e84db	turnip: tu_cs_emit_array Array version of tu_cs_emit. Useful for updating multiple consecutive array-like registers, or loading a shader binary with SS6_DIRECT.	2019-03-11 10:02:13 -07:00
Chia-I Wu	fffaa9b4b3	turnip: add tu_cs_discard_entries We will start a draw IB at the beginning of a subpass and consume it at the end of the subpass. With tu_cs_discard_entries, we can reuse the same tu_cs for all subpasses.	2019-03-11 10:02:13 -07:00
Chia-I Wu	10c5013442	turnip: more/better asserts for tu_cs Asserting (cur < end) in tu_cs_emit catches far fewer programming errors than asserting (cur < reserved_end). We should never write more commands than what we have reserved. Assert IB is non-empty and sane in tu_cs_emit_ib.	2019-03-11 10:02:13 -07:00
Chia-I Wu	aa7dd6cb7f	turnip: use 32-bit offset in tu_cs_entry We don't support nor expect BOs to be that big in tu_cs.	2019-03-11 10:02:13 -07:00
Chia-I Wu	b8a5e10d0d	turnip: mark IBs for dumping Includes IBs in kernel cmdbuf dumps.	2019-03-11 10:02:13 -07:00
Eric Engestrom	4a48dd9fb8	turnip: use the platform defines in vk.xml instead of hard-coding them Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-11 10:02:13 -07:00
Bas Nieuwenhuizen	0d12bcbfa7	turnip: Add todo for copies.	2019-03-11 10:02:13 -07:00
Bas Nieuwenhuizen	51115e7201	turnip: Add buffer->image DMA copies. Passes dEQP-VK.api.copy_and_blit.core.buffer_to_image.*	2019-03-11 10:02:13 -07:00
Bas Nieuwenhuizen	6616563472	turnip: Add image->buffer DMA copies. Passes dEQP-VK.api.copy_and_blit.core.image_to_buffer.*	2019-03-11 10:02:13 -07:00
Bas Nieuwenhuizen	d76a1e2aa1	turnip: Implement buffer->buffer DMA copies. Passes dEQP-VK.api.copy_and_blit.core.buffer_to_buffer.*	2019-03-11 10:02:13 -07:00
Bas Nieuwenhuizen	bafbf3bafe	turnip: Add tu6_rb_fmt_to_ifmt.	2019-03-11 10:02:13 -07:00
Bas Nieuwenhuizen	148876d424	turnip: Make tu6_emit_event_write shared.	2019-03-11 10:02:13 -07:00
Bas Nieuwenhuizen	7238471587	turnip: Add buffer memory binding.	2019-03-11 10:02:13 -07:00
Chia-I Wu	08b1c3fc7f	turnip: respect color attachment formats Make tu6_get_native_format available to tu_cmd_buffer and start using it.	2019-03-11 10:02:13 -07:00
Chia-I Wu	68c27ea92b	turnip: preliminary support for fences This should be quite complete feature-wise. External fences are still missing. We probably also want to add a simpler path to tu_WaitForFences for when fenceCount == 1.	2019-03-11 10:02:13 -07:00
Chia-I Wu	15319963fa	turnip: fix VkClearValue packing Add tu_pack_clear_value to correctly pack VkClearValue according to VkFormat. It ignores the component order defined by VkFormat, and always packs to WZYX order.	2019-03-11 10:02:13 -07:00
Chia-I Wu	6545461041	turnip: add support for VK_KHR_external_memory_{fd,dma_buf}	2019-03-11 10:02:13 -07:00
Chia-I Wu	6d1c4049de	turnip: advertise VK_KHR_external_memory AFAICT, it is supported. We don't need to handle any of the new structs because our BOs can always be exported.	2019-03-11 10:02:13 -07:00
Chia-I Wu	0253845272	turnip: advertise VK_KHR_external_memory_capabilities AFAICT, it is supported.	2019-03-11 10:02:13 -07:00
Chia-I Wu	de89436216	turnip: add functions to import/export prime fd Add tu_bo_init_dmabuf, tu_bo_export_dmabuf, tu_gem_import_dmabuf, and tu_gem_export_dmabuf.	2019-03-11 10:02:13 -07:00
Chad Versace	d5239bc59c	turnip: Fix error behavior for VkPhysicalDeviceExternalImageFormatInfo If the handle type is unsupported, then the spec requires us to return VK_ERROR_FORMAT_NOT_SUPPORTED. Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Closes: https://gitlab.freedesktop.org/bnieuwenhuizen/mesa/merge_requests/17	2019-03-11 10:02:13 -07:00
Chia-I Wu	4b9f967cd1	turnip: add a more complete format table A format table is an array of tu_native_format. Table lookup is done through array indexing. This commit defines a single format table for core VkFormat. It is derived from the table in the gallium driver. There might be errors introduced in the process of the conversion. When an extension that defines new VkFormat is supported, we need to add a new table for the extension.	2019-03-11 10:02:13 -07:00
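The "table lookup through array indexing" described above might look roughly like the following sketch. The enum, struct, and hardware values are placeholders invented for this illustration; they are not Vulkan's `VkFormat` numbering, the driver's `tu_native_format` layout, or real hardware encodings.

```c
#include <stddef.h>

/* Placeholder format enum; stands in for the API-level format. */
enum api_format {
    API_FORMAT_UNDEFINED,
    API_FORMAT_R8_UNORM,
    API_FORMAT_R8G8B8A8_UNORM,
    API_FORMAT_B8G8R8A8_UNORM,
    API_FORMAT_COUNT,
};

/* Placeholder native description; hw_format < 0 marks "unsupported". */
struct native_format {
    int hw_format;
    int swap;
};

/* One table for the core formats, indexed directly by the enum value.
 * The numbers are made up for the sketch. */
static const struct native_format core_format_table[API_FORMAT_COUNT] = {
    [API_FORMAT_UNDEFINED]      = { -1, 0 },
    [API_FORMAT_R8_UNORM]       = {  1, 0 },
    [API_FORMAT_R8G8B8A8_UNORM] = {  2, 0 },
    [API_FORMAT_B8G8R8A8_UNORM] = {  2, 1 },
};

/* Constant-time lookup; returns NULL for out-of-range or unsupported
 * formats. */
const struct native_format *get_native_format(unsigned format)
{
    if (format >= API_FORMAT_COUNT)
        return NULL;
    const struct native_format *f = &core_format_table[format];
    return f->hw_format < 0 ? NULL : f;
}
```

Because the index is the enum value itself, formats defined by an extension with a different numbering range would need their own table, as the commit message notes.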
Chia-I Wu	f3bf779184	turnip: preliminary support for loadOp and storeOp - create tile_load_ib and tile_store_ib at the beginning of each subpass - execute the IBs at the end of each subpass - no DONT_CARE support - no subpass dependency analysis and subpass merging - no zs support - no true VkImageView support - assume VK_FORMAT_B8G8R8A8_UNORM - no tiling - no MSAA This also removes cur_cs from tu_cmd_buffer.	2019-03-11 10:02:13 -07:00
Chia-I Wu	0aeef7c8bd	turnip: add TU_CS_MODE_SUB_STREAM When in TU_CS_MODE_SUB_STREAM, tu_cs_begin_sub_stream (or tu_cs_end_sub_stream) should be called instead of tu_cs_begin (or tu_cs_end). It gives the caller a TU_CS_MODE_EXTERNAL cs to emit commands to.	2019-03-11 10:02:13 -07:00
Chia-I Wu	f59c381423	turnip: add tu_cs_mode Add tu_cs_mode and TU_CS_MODE_EXTERNAL. When in TU_CS_MODE_EXTERNAL, tu_cs wraps an external buffer and can not grow. This also moves tu_cs* up in tu_private.h, such that other structs can embed tu_cs_entry.	2019-03-11 10:02:13 -07:00
Chia-I Wu	5c63fc626f	turnip: provide both emit_ib and emit_call tu_cs_emit_ib emits a CP_INDIRECT_BUFFER for a BO. tu_cs_emit_call emits a CP_INDIRECT_BUFFER for each entry of a target cs.	2019-03-11 10:02:13 -07:00
Chia-I Wu	741a4325df	turnip: add tu_cs_sanity_check It replaces tu_cs_reserve_space_assert and can be called at any time to sanity check tu_cs.	2019-03-11 10:02:13 -07:00
Chia-I Wu	29f1110003	turnip: never fail tu_cs_begin/tu_cs_end Error checking tu_cs_begin/tu_cs_end is too tedious for the callers. Move tu_cs_add_bo and tu_cs_reserve_entry to tu_cs_reserve_space such that tu_cs_begin/tu_cs_end never fails.	2019-03-11 10:02:13 -07:00
Chia-I Wu	0d81be3959	turnip: specify initial size in tu_cs_init We will drop size parameter from tu_cs_begin shortly, such that tu_cs_begin never fails.	2019-03-11 10:02:13 -07:00
Chia-I Wu	2774a1b97d	turnip: add tu_cs_{reserve,add}_entry We will stop calling tu_cs_reserve_entry in tu_cs_end shortly, such that tu_cs_end never fails.	2019-03-11 10:02:13 -07:00
Chia-I Wu	c11580373f	turnip: add internal helpers for tu_cs Add tu_cs_get_offset, tu_cs_get_size, tu_cs_get_space, and tu_cs_is_empty.	2019-03-11 10:02:13 -07:00
Chia-I Wu	429e2d5755	turnip: add tu_tiling_config We need the current color/depth/stencil attachments and the current render area to compute the tiling config. We compute the tiling config at the beginning of each subpass for the moment. We should change that when the driver can reorder/merge subpasses. It is very common that the render area is the entire framebuffer. We might want to optimize for the case and compute the tiling config in tu_framebuffer ctor.	2019-03-11 10:02:13 -07:00
Chia-I Wu	7c4483de0e	turnip: preliminary support for tu_GetRenderAreaGranularity Set it to tile alignments, 32x32 on 6xx.	2019-03-11 10:02:13 -07:00
Chia-I Wu	9c83a7572b	turnip: emit HW init in tu_BeginCommandBuffer Being the first commit that emits meaningful command packets, there are many things included in this commit - tu6_emit_xxx are low-level helpers that emit command packets without boundary checks - tu6_xxx are high-level helpers that emit command packets with boundary checks - cmdbuf->cs is a pointer to the current CS, so that we can use the helpers above to emit to other CS - use cmd as the variable name of tu_cmd_buffer - there is a per-cmdbuf scratch bo for CP_EVENT_WRITE writeback - there is a per-cmdbuf debug marker, using scratch reg 7 or 6 depending on whether the cmdbuf is primary or secondary (olv, after rebase) REG_A6XX_SP_UNKNOWN_AB20 is renamed	2019-03-11 10:01:49 -07:00
Chia-I Wu	3b3af6321b	turnip: add tu_cs_reserve_space(_assert) They are used like tu_cs_reserve_space(...); tu_cs_emit(...); ...; tu_cs_reserve_space_assert(); to make sure we reserved enough space at the beginning.	2019-03-11 10:01:41 -07:00
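The reserve / emit / assert pattern described above can be sketched with a toy command stream. The struct and the `cs_*` functions below are hypothetical stand-ins rather than the real `tu_cs` code, and the fixed-size buffer replaces the BO allocation a real driver would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy command stream over a caller-provided buffer. */
struct cmd_stream {
    uint32_t *cur;           /* next dword to write */
    uint32_t *reserved_end;  /* end of the space reserved so far */
    uint32_t *end;           /* end of the backing buffer */
};

void cs_init(struct cmd_stream *cs, uint32_t *buf, size_t dwords)
{
    cs->cur = buf;
    cs->reserved_end = buf;
    cs->end = buf + dwords;
}

/* Reserve room for `dwords` more entries. Returns 0 on success, -1 if
 * the buffer is too small (a real implementation would grow instead). */
int cs_reserve_space(struct cmd_stream *cs, size_t dwords)
{
    if ((size_t)(cs->end - cs->cur) < dwords)
        return -1;
    cs->reserved_end = cs->cur + dwords;
    return 0;
}

/* Write one dword; only the buffer bound is checked here. */
void cs_emit(struct cmd_stream *cs, uint32_t value)
{
    assert(cs->cur < cs->end);
    *cs->cur++ = value;
}

/* Called after a batch of emits to verify the reservation was enough. */
void cs_reserve_space_assert(const struct cmd_stream *cs)
{
    assert(cs->cur <= cs->reserved_end);
}
```

A typical call sequence is `cs_reserve_space(&cs, 2); cs_emit(&cs, a); cs_emit(&cs, b); cs_reserve_space_assert(&cs);`. Checking against `reserved_end` inside every emit, as a later commit in this list does, catches the mistake at the offending write instead of after the batch.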
Chad Versace	aaa59ef70c	turnip: Annotate vkGetImageSubresourceLayout with tu_stub Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-11 10:01:41 -07:00
Chia-I Wu	ba6bcb387c	turnip: preliminary support for tu_CmdBeginRenderPass	2019-03-11 10:01:41 -07:00
Chia-I Wu	1085df8176	turnip: preliminary support for tu_image_view_init	2019-03-11 10:01:41 -07:00
Chia-I Wu	992ecdd40e	turnip: preliminary support for tu_BindImageMemory2	2019-03-11 10:01:41 -07:00
Chia-I Wu	ef49b07b83	turnip: add cmdbuf->bo_list to bo_list in queue submit	2019-03-11 10:01:41 -07:00
Chia-I Wu	6c4df43db5	turnip: add tu_bo_list_merge tu_bo_list_merge adds an entire list to the current list.	2019-03-11 10:01:41 -07:00
Chia-I Wu	7ad01913bd	turnip: build drm_msm_gem_submit_bo array directly Build drm_msm_gem_submit_bo array directly in tu_bo_list. We might change this again, but this is good enough for now. There are other issues as well, such as not using VkAllocationCallbacks and sloppy error checking. We should revisit this in the near future. The same applies to tu_cs.	2019-03-11 10:01:41 -07:00
Chia-I Wu	c969d8b975	turnip: add more tu_cs helpers	2019-03-11 10:01:41 -07:00
Chia-I Wu	39ba2b20d1	turnip: inline tu_cs_check_space This allows the fast path (size check) to be inlined.	2019-03-11 10:01:41 -07:00
Chia-I Wu	2bcaa78236	turnip: update cs->start in tu_cs_end This allows us to assert that there is no dangling command in tu_cs_begin, rather than discarding them silently.	2019-03-11 10:01:41 -07:00
Chia-I Wu	b01d1618a4	turnip: minor cleanup to tu_cs_end Add comments and error checking.	2019-03-11 10:01:41 -07:00
Chia-I Wu	af4eb20891	turnip: add tu_cs_add_bo Refactor BO allocation code out of tu_cs_begin. Add error checking.	2019-03-11 10:01:41 -07:00
Chia-I Wu	ae9a72b48b	turnip: document tu_cs	2019-03-11 10:01:41 -07:00
Chia-I Wu	45120127ea	turnip: run sed and clang-format on tu_cs	2019-03-11 10:01:41 -07:00
Kristian H. Kristensen	0801019d33	turnip: Only get bo offset when we need to mmap The offset we get from MSM_INFO_GET_OFFSET is an offset into the drm fd for the purpose of mmaping the buffer.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	23d6f0f970	turnip: Move stream functions to tu_cs.c	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	ac2a845abf	turnip: Add emit functions in a header. This adds a radv-style check_space functions + emit functions. Also puts them in a header as a bunch of inlines, so (1) we can use them from meta code. (2) they are inline for performance as these are common and small. Did not put them in tu_private.h as a bunch of inlines only clutters up that huge headerfile. Precise error propagation for memory allocation failures is still todo.	2019-03-11 10:01:41 -07:00
Chia-I Wu	2e684cb800	turnip: preliminary support for tu_QueueWaitIdle This creates a new fd on each queue submit. I do not go with DRM_IOCTL_MSM_WAIT_FENCE solely because the path is marked legacy. Otherwise, we can use the fence id rather than requesting a fence fd until external fences are supported and enabled.	2019-03-11 10:01:41 -07:00
Chia-I Wu	b7a6a80e6c	turnip: constify tu_device in tu_gem_*	2019-03-11 10:01:41 -07:00
Chia-I Wu	3809e6cf63	turnip: add wrappers around DRM_MSM_SUBMITQUEUE_* Add tu_drm_submitqueue_new and tu_drm_submitqueue_close.	2019-03-11 10:01:41 -07:00
Chia-I Wu	fcf24f47aa	turnip: add wrappers around DRM_MSM_GET_PARAM Add tu_drm_get_gpu_id and tu_drm_get_gmem_size.	2019-03-11 10:01:41 -07:00
Chia-I Wu	a25a803127	turnip: remove unnecessary libfreedreno_drm dep Remove libfreedreno_drm dep and unused fd_device.	2019-03-11 10:01:41 -07:00
Chia-I Wu	91232c52fe	turnip: use msm_drm.h from inc_freedreno The recent change to msm_drm.h changed the APIs in an incompatible way.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	4f32869e3d	turnip: Shorten primary_cmd_stream name. It really is too long.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	26261847cf	turnip: Fill command buffer	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	abe352525d	turnip: Implement submission.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	abf0792bbe	turnip: Make bo_list functions not static	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	65e0e79054	turnip: Add msm queue support.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	8713499657	turnip: Add a command stream.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	e3a9b07923	turnip: Implement a slow bo list	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	48b65201a6	turnip: Implement some UUIDs.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	7ae005f037	turnip: clean up TODO. ./deqp-vk -n dEQP-VK.info.* Writing test log into TestResults.qpa dEQP Core unknown (0xcafebabe) starting.. target implementation = 'Surfaceless' WARNING: tu is not a conformant vulkan implementation, testing use only. WARNING: tu is not a conformant vulkan implementation, testing use only. Test case 'dEQP-VK.info.build'.. Pass (Not validated) Test case 'dEQP-VK.info.device'.. Pass (Not validated) Test case 'dEQP-VK.info.platform'.. Pass (Not validated) Test case 'dEQP-VK.info.memory_limits'.. Pass (Pass) DONE! Test run totals: Passed: 4/4 (100.0%) Failed: 0/4 (0.0%) Not supported: 0/4 (0.0%) Warnings: 0/4 (0.0%)	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	06602bf77f	turnip: Remove some radv leftovers.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	c72e6085e7	turnip: Implement some format properties for RGBA8. Just to get some tests to not skip. This is neither complete nor completely correct.	2019-03-11 10:01:41 -07:00
Chia-I Wu	d30baaaba6	turnip: add .clang-format Add and apply .clang-format.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	6401ad389e	turnip: Implement pipe-less param query.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	b0562e272f	turnip: move tu_gem.c to tu_drm.c	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	3d99dd55a0	turnip: Stop hardcoding the msm version check.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	d9c3dc8ec8	turnip: Add image layout calculations.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	603354cffa	turnip: Fix memory mapping.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	426f6e46a8	turnip: Fix bo allocation after we stopped using libdrm_freedreno ... All this figuring out new errors is why I don't like reinventing the wheel.	2019-03-11 10:01:41 -07:00
Bas Nieuwenhuizen	f0a24e123f	turnip: Add 630 to the list.	2019-03-11 10:01:41 -07:00
Chad Versace	c3b5eea2cc	turnip: Don't return from tu_stub funcs Since the macros are lowercase and look like normal functions, that they change control flow with a hidden return is surprising.	2019-03-11 10:01:41 -07:00
Chad Versace	bf709dfe3f	turnip: Fix 'unused' warnings Now turnip builds without warnings on my machine.	2019-03-11 10:01:41 -07:00
Chad Versace	471f2d8409	turnip: Add TODO file	2019-03-11 10:01:41 -07:00
Chad Versace	359e9016c5	turnip: Replace fd_bo with tu_bo (olv, after rebase) remove inc_drm_uapi	2019-03-11 10:01:33 -07:00
Chad Versace	eb16ec715f	turnip: Use vk_errorf() for initialization error messages This small cleanup better prepares turnip for VK_EXT_debug_report.	2019-03-11 10:01:33 -07:00
Chad Versace	1372c95ad2	turnip: Add TODO for Android logging	2019-03-11 10:01:33 -07:00
Chad Versace	cca208a033	turnip: Require DRM device version >= 1.3 Because the driver will require support for iova.	2019-03-11 10:01:33 -07:00
Chad Versace	5486943ed9	turnip: Fix indentation	2019-03-11 10:01:33 -07:00
Chad Versace	99a5de14cb	turnip: Fix a real -Wmaybe-uninitialized	2019-03-11 10:01:33 -07:00
Chad Versace	75f2c8458b	turnip: Use vk_outarray in all relevant public functions	2019-03-11 10:01:33 -07:00
Chad Versace	3ec87d56bd	turnip: Fix result of vkEnumerate*ExtensionProperties Given an unsupported layer name, the functions must return VK_ERROR_LAYER_NOT_PRESENT.	2019-03-11 10:01:33 -07:00
Chad Versace	ee835c7790	turnip: Fix result of vkEnumerateLayerProperties The functions must not return VK_ERROR_LAYER_NOT_PRESENT. The spec reserves that error for vkEnumerateExtensionProperties.	2019-03-11 10:01:33 -07:00
Chad Versace	daffb01704	turnip: Fix indentation in function signatures Due to s/anv/tu/, in many function signatures the indentation of parameters was off-by-one.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	b4f3e0d549	turnip: Disable more features.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	a01edd9c86	turnip: Initialize memory type in requirements.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	7be2e1fc37	turnip: Cargo cult the Intel heap size functionality.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	462b693d94	turnip: Report a memory type and heap.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	8e52e8183c	turnip: Add buffer allocation & mapping support.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	a0d62e4337	turnip: Fix newly introduced warning.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	bcd15ab34e	turnip: Remove abort.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	13ff7ffbcb	turnip: Gather some device info.	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	7922d50bd4	turnip: Fix up detection of device.	2019-03-11 10:01:33 -07:00
Chad Versace	c63cb15745	turnip: Drop Makefile.am and Android.mk The Makefile.am doesn't work. I tried fixing it but gave up because I don't understand Autotools. I strongly suspect the Android.mk also doesn't work. Rather than maintain the broken build files, let's delete them and re-add working build files if-and-when we need them. (Maybe we'll be lucky and turnip will never need to support Autotools!).	2019-03-11 10:01:33 -07:00
Bas Nieuwenhuizen	26380b3a9f	turnip: Add driver skeleton (v2) meson files have been updated, autotools and android still need updating. Only build tested. v2 (chadv): - Rebase onto master. - Fix build breakage in Python scripts. - Drop the WSI code. The internal WSI apis have changed recently, and will likely change again before the driver goes upstream. To avoid unnecessary rebase work, let's drop the WSI code and re-add it when we're ready to really use WSI. (olv, after rebase) do not enable freedreno by default on ARM	2019-03-11 10:01:15 -07:00
Connor Abbott	d086d16b81	nir/serialize: Prevent writing uninitialized state_slot data The nir_state_slot struct had some padding that was never initialized. Serializing the individual parts of the struct is more robust and avoids the overhead of zeroing it at creation, so just do that. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 15:17:41 +01:00
Tapani Pälli	47fc359822	anv: release memory allocated by glsl types during spirv_to_nir Fixes leaks for each glsl_type generated: ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18 ==32470== at 0x483880B: malloc (vg_replace_malloc.c:309) ==32470== by 0x4C43F4A: ralloc_size (ralloc.c:119) ==32470== by 0x4C44014: rzalloc_size (ralloc.c:151) ==32470== by 0x4C44258: rzalloc_array_size (ralloc.c:215) ==32470== by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:114) ==32470== by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:1146) ==32470== by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501) ==32470== by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269) ==32470== by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018) ==32470== by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365) ==32470== by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490) ==32470== by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173) v2: move release call to vkDestroyInstance Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-11 13:14:45 +02:00
Eric Engestrom	f9a6460bbf	wsi/x11: use WSI_FROM_HANDLE() instead of pointer casts Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-11 10:49:36 +00:00
Eric Engestrom	f2e24dd81d	wsi/wayland: fix pointer casting warning on 32bit Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-11 10:49:36 +00:00
Eric Engestrom	687babc045	wsi/display: s/#if/#ifdef/ to fix -Wundef Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-11 10:49:36 +00:00
Eric Engestrom	1ee01d91c7	wsi: deduplicate get_current_time() functions between display and x11 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-11 10:49:36 +00:00
Tapani Pälli	7bb34ecff9	anv: release memory allocated by bo_heap when descriptor pool is destroyed Fixes following leak: ==21853== 32 bytes in 1 blocks are definitely lost in loss record 2 of 20 ==21853== at 0x483AB1A: calloc (vg_replace_malloc.c:762) ==21853== by 0x4C4DD7F: util_vma_heap_free (vma.c:221) ==21853== by 0x4C4D647: util_vma_heap_init (vma.c:46) ==21853== by 0x4957B9F: anv_CreateDescriptorPool (anv_descriptor_set.c:578) Fixes: `c520f4dec9` ("anv: Add a concept of a descriptor buffer") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 08:13:27 +02:00
Tapani Pälli	105002bd2d	anv: destroy descriptor sets when pool gets destroyed Patch maintains a list of sets in the pool and destroys possible remaining sets when pool is destroyed. As stated in Vulkan spec: "When a pool is destroyed, all descriptor sets allocated from the pool are implicitly freed and become invalid." This fixes memory leaks spotted with valgrind: ==19622== 96 bytes in 1 blocks are definitely lost in loss record 2 of 3 ==19622== at 0x483880B: malloc (vg_replace_malloc.c:309) ==19622== by 0x495B67E: default_alloc_func (anv_device.c:547) ==19622== by 0x4955E05: vk_alloc (vk_alloc.h:36) ==19622== by 0x4956A8F: anv_multialloc_alloc (anv_private.h:538) ==19622== by 0x4956A8F: anv_CreateDescriptorSetLayout (anv_descriptor_set.c:217) Fixes: `14f6275c92` ("anv/descriptor_set: add reference counting for descriptor set layouts") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 08:13:01 +02:00
Timothy Arceri	051b4064da	anv: add support for dumping shader info via VK_EXT_debug_report This information will be used by the vkpipeline-db tool. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 16:16:04 +11:00
Kenneth Graunke	f36794d1f0	iris: Fix backface stencil write condition A bit too much search and replace here.	2019-03-10 14:52:53 -07:00
Alyssa Rosenzweig	ea2cd73625	panfrost/drm: Cast pointer to u64 to fix warning Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-10 19:16:56 +00:00
Tomeu Vizoso	756f7b9989	panfrost: Add backend targeting the DRM driver This backend interacts with the new DRM driver for Midgard GPUs which is currently in development. When using this backend, Panfrost has roughly on-par functionality as when using the non-DRM driver from Arm. Alyssa Rosenzweig: To do so, we implement additional routines for runtime GPU version detection and fencing. We cleanup some duplicate code interfering with the new driver. We fix a long-standing memory leak which is aggravated on the new driver. Finally, we implement BO import/export in a way compatible with the new driver. These changes are squashed to preserve bisectability given the hard-to-track ABI shifts in the nondrm module Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-10 19:09:23 +00:00
Tomeu Vizoso	d4dc79df72	panfrost: Add gem_handle to panfrost_memory and panfrost_bo It will be used by the DRM backend to store GEM handles from the kernel. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-10 18:56:56 +00:00
Rob Clark	941adcef03	freedreno/a6xx: more bcolor fixes Non-zero offset wasn't working, which breaks a bunch of dEQP-GLES31.functional.texture.border_clamp.formats.* when doing sharded deqp runs (because order of tests changes, resulting in different texture state bound.. deqp doesn't really clean up its gl state between tests very well) Previously, if additional textures were bound, due to using too small of a bcolor_entry size, the last 32bytes of the bcolor_entry would be overwritten. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-10 11:40:06 -04:00
Eric Engestrom	db944999a1	gitlab-ci: add panfrost to the gallium drivers build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-09 23:25:12 +00:00
Eric Engestrom	e6ba67dd65	panfrost: move #include to fix compilation In standalone.h, the struct gl_context type is not declared by #includ'ing mtypes.h: In file included from src/gallium/drivers/panfrost/midgard/cmdline.c:24: src/compiler/glsl/standalone.h:46:14: warning: ‘struct gl_context’ declared inside parameter list will not be visible outside of this definition or declaration struct gl_context ctx); ^~~~~~~~~~ This causes the following compilation failure: src/gallium/drivers/panfrost/midgard/cmdline.c: In function ‘compile_shader’: src/gallium/drivers/panfrost/midgard/cmdline.c:58:61: error: passing argument 4 of ‘standalone_compile_shader’ from incompatible pointer type [-Werror=incompatible-pointer-types] prog = standalone_compile_shader(&options, 2, argv, &local_ctx); ^~~~~~~~~~ In file included from src/gallium/drivers/panfrost/midgard/cmdline.c:24: src/compiler/glsl/standalone.h:43:28: note: expected ‘struct gl_context ’ but argument is of type ‘struct gl_context ’ struct gl_shader_program standalone_compile_shader( ^~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: `e67e072637` "panfrost: Implement Midgard shader toolchain" Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-09 22:37:40 +00:00
Eric Engestrom	d4d29c0455	panfrost: fix tgsi_to_nir() call Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109945 Fixes: `7da251fc72` "panfrost: Check in sources for command stream" Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-09 22:06:19 +00:00
Axel Davy	5475434fa6	Revert "d3dadapter9: Support software renderer on any DRI device" This reverts commit `0d08476593`. It makes gitlab's travis fail. Revert until patch is fixed. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-03-09 14:29:43 +01:00
Axel Davy	597b5e27fa	st/nine: Change a few advertised caps Most hw on the native platform advertise these caps this way. D3DCAPS_READ_SCANLINE: We don't really have hardware support for that, but many games don't even check the flag, and expect GetRasterStatus to work, which is why we emulated it with a timer (like wine). So we may as well advertise the cap. D3DCURSORCAPS_LOWRES: I don't know what is the status of this on X11, but I don't know of any dx9 game running at height < 400 either. D3DPTEXTURECAPS_TEXREPEATNOTSCALEDBYSIZE: The cap should correspond to what the current generation of hw is doing. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2019-03-09 13:57:49 +01:00
Axel Davy	0d3c37e2f9	st/nine: Do not advertise CANMANAGERESOURCE It doesn't seem the main vendors advertise it. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2019-03-09 13:57:49 +01:00
Axel Davy	a8583e75d6	st/nine: Do not advertise support for D15S1 and D24X4S4 The former is supported on Matrox cards but no other hw. The latter isn't supported anywhere. It is fine to not advertise them as supported, and it could prevent apps to trigger weird rendering paths. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-09 13:57:49 +01:00
Patrick Rudolph	0d08476593	d3dadapter9: Support software renderer on any DRI device If D3D_ALWAYS_SOFTWARE is set for debugging purposes, run on any DRI enabled platform. Instead of probing for a compatible gallium driver (which might fail if there's none) always use the KMS DRI software renderer. Allows to run nine on i915 when D3D_ALWAYS_SOFTWARE=1. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2019-03-09 13:57:49 +01:00
Axel Davy	f7b9c09c7c	st/nine: Disable depth write when nothing gets updated I do not see any perf impact on radeonsi, but it seems iris needs this. It seems something sensible to do. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Andre Heider <a.heider@gmail.com>	2019-03-09 13:57:49 +01:00
Elie Tournier	d7b3196976	virgl: Return an error if we use fp64 on top of GLES Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: <Gurchetan Singh gurchetansingh@chromium.org>	2019-03-09 11:33:20 +01:00
Elie Tournier	1f1514e1aa	virgl: Set PIPE_CAP_DOUBLES when running on GLES This is a lie but no known app uses fp64. Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: <Gurchetan Singh gurchetansingh@chromium.org>	2019-03-09 11:33:14 +01:00
Elie Tournier	8ad1e86bb0	virgl: Add a caps to advertise GLES backend Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: <Gurchetan Singh gurchetansingh@chromium.org>	2019-03-09 11:32:30 +01:00
Kenneth Graunke	da51e3f1b0	Revert MR 369 (Fix extract_i8 and extract_u8 for 64-bit integers) This broke piles of image load store tests (179 failures on CI, mesa_master build #15546, previous build right before this landed was green). I'd rather not leave the tree on fire over the weekend, so let's revert for now, and we can figure out what happened next week.	2019-03-09 01:42:16 -08:00
Ian Romanick	18e4bf65de	nir/algebraic: Add missing 16-bit extract_[iu]8 patterns No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-08 22:24:19 -08:00
Ian Romanick	55c1ac4b75	nir/algebraic: Add missing 64-bit extract_[iu]8 patterns No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-08 22:24:19 -08:00
Ian Romanick	9aaaac6080	nir/algebraic: Remove redundant extract_[iu]8 patterns No shader-db changes on any Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-08 22:24:19 -08:00
Ian Romanick	37ee462e03	nir/algebraic: Fix up extract_[iu]8 after loop unrolling Skylake, Broadwell, and Haswell had similar results. (Skylake shown) total instructions in shared programs: 15256840 -> 15256837 (<.01%) instructions in affected programs: 4713 -> 4710 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.06% max: 0.08% x̄: 0.06% x̃: 0.06% total cycles in shared programs: 372286583 -> 372286583 (0.00%) cycles in affected programs: 198516 -> 198516 (0.00%) helped: 1 HURT: 1 helped stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.01% max: 0.01% x̄: 0.01% x̃: 0.01% No changes on any other Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-08 22:24:19 -08:00
Jason Ekstrand	8fdee457a4	anv/pipeline: Move lower_explicit_io much later Now that nir_opt_copy_prop_vars can properly handle array derefs on vectors, it's safe to move UBO and SSBO lowering to late in the pipeline. This should allow NIR to actually start optimizing SSBO access. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-08 22:03:34 -06:00
Jason Ekstrand	179d254cba	intel/nir: Move lower_mem_access_bit_sizes to postprocess_nir It doesn't really matter where this pass goes as long as it's after we call nir_lower_explicit_io and before we go into the back-end. Putting it brw_postprocess_nir lets us move nir_lower_explicit_io significantly later in the pipeline. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-08 22:03:14 -06:00
Rob Clark	ad25948261	freedreno/ir3: turn on [iu]mul_high Which also requires uadd_carry lowering Until recently this was lowered in glsl ir so it went unnoticed that we weren't lowering it. Fixes: `1d8994a63b` glsl: [u/i]mulExtended optimization for GLSL Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-08 18:44:57 -05:00
Rob Clark	53083e4fbc	freedreno/ir3: fix ir3_cmdline harder Fixes: `45271702ec` freedreno: fix ir3_cmdline build Fixes: `7530d4abfc` glsl/freedreno/panfrost: pass gl_context to the standalone compiler Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-08 18:44:57 -05:00
Eric Anholt	fafead7b62	st/dri: Set the PIPE_BIND_SHARED flag on create_image_with_modifiers. With createImage(), the caller was expected to set a SHARED flag if they needed the ability to get a GEM handle. DRI3, wayland, and gbm all set it, EGL_MESA_drm_image passes it through, and surfaceless doesn't need it because there's no way to request a handle. With the new createImageWithModifiers() DRI method to replace it, the expectation is that you'll always be able to share the buffer, so the flag is unnecessary in its arguments. However, we do need to tell gallium about this expectation. Without this, kmscube's modifiers path using gbm_bo_create_with_modifiers(&modifier, 1) instead of gbm_bo_create(SCANOUT \| SHARED) will call the driver's resource_create() function with PIPE_BIND_SHARED unset, so the driver (particularly renderonly drivers) may allocate in such a way that it can't return an answer from gbm_bo_get_handle(). I used to have a hack in v3d using count==1 && modifier==LINEAR to indicate that you wanted SHARED anyway, but that was dropped recently. Fixes: `59527a36e9` ("v3d: Restructure RO allocations using resource_from_handle.") Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-08 15:33:35 -08:00
Kenneth Graunke	9d1334d2a0	iris: Use copy_region and staging resources to avoid transfer stalls This is similar to intel_miptree_map_blit and intel_buffer_object.c's temporary blits in i965. Improves performance of DiRT Rally by 20-25% by eliminating stalls. Breaks piglit's spec/arb_shader_image_load_store/host-mem-barrier, by using the GPU to do uploads, exposing a st/mesa issue where it doesn't give us memory_barrier() calls. This is a pre-existing issue and will be fixed by a later patch (currently out for review).	2019-03-08 13:29:39 -08:00
Eric Engestrom	f67c870179	android: fix missing backspace for line continuation Reported-by: Clayton Craft <clayton.a.craft@intel.com> Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109944 Fixes: `e1d81decf7` "build: make passing an incorrect pointer type a hard error" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-08 21:14:24 +00:00
Karol Herbst	8a8742d327	prog_to_nir: fix write from vps to FOG for fragment programs we already treat fog as a single component value, but for vp we didn't. Fixes fog related piglit tests with my out of tree Nouveau nir patches. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-08 10:35:12 -08:00
Sagar Ghuge	bca28deb46	iris: Track last VS URB entry size Return immediately if last VS URB entry size is good enough for BLORP operation v2: Fix comments (Caio) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Kenneth Graunke<kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-08 10:01:39 -08:00
Sagar Ghuge	d0a8fba69a	iris: Refactor code to share 3DSTATE_URB_* packet v2: 1) Set IRIS_DIRTY_URB bit (Caio) 2) Get rid of unnecessary function (Caio) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-08 10:01:39 -08:00
Eric Engestrom	6e3d3f5b2c	glx/meson: use full include path for dri_interface.h Everything else uses `#include "GL/internal/dri_interface.h"` instead, and this full path was even already used in other parts of GLX. While at it, nothing uses `inc_gl_internal` anymore so let's remove it as well. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-03-08 18:00:19 +00:00
Eric Engestrom	b1218d8cf7	hgl/meson: drop unused include directory Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-03-08 18:00:19 +00:00
Brian Paul	0de83bacf0	intel/compiler: silence uninitialized variable warning in opt_vector_float() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-08 10:23:11 -07:00
Brian Paul	b5ea56e411	intel/decoders: silence uninitialized variable warnings in gen_print_batch() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-08 10:23:11 -07:00
Brian Paul	e5e2be3c73	st/mesa: init hash keys with memset(), not designated initializers Since the compiler may not zero-out padding in the object. Add a couple comments about this to prevent misunderstandings in the future. Fixes: `67d96816ff` ("st/mesa: move, clean-up shader variant key decls/inits") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-03-08 10:23:11 -07:00
Eric Engestrom	d2cff164cd	gitlab-ci: fix llvm version (7 doesn't have a ".0") Fixes: `85ee157283` "gitlab-ci: autotools needs to be told which llvm version to use" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-08 17:03:06 +00:00
Eric Engestrom	e1d81decf7	build: make passing an incorrect pointer type a hard error More or less any of this issue pointed out by the compiler is a coding error. Make sure we flag it and bail loudly. v2: - apply the change to autotools and scons as well (Emil) - C++ doesn't need this, it's already an error and the flag doesn't exist (Gert) v3: - drop scons, flags are not checked so until someone adds that functionality we can't have this. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> # v1 Reviewed-by: Emil Velikov <emil.velikov@collabora.com> # v1 [Emil: apply the same change to autotools and scons] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-08 16:24:06 +00:00
Eric Engestrom	598f10eacc	r600: cast pointer to expected type Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-03-08 16:24:06 +00:00
Eric Engestrom	85ee157283	gitlab-ci: autotools needs to be told which llvm version to use Fixes: 45d58cd91567b39f51af "gitlab-ci: only build the default (=latest) and oldest llvm versions" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-08 16:03:04 +00:00
Eric Engestrom	3006f9d8c0	gitlab-ci: only build the default (=latest) and oldest llvm versions Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-08 15:59:27 +00:00
Eric Engestrom	08b70e1c2b	travis: clean up Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-08 15:33:39 +00:00
Eric Engestrom	e2f528bf21	travis: drop unused vars Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-08 15:17:42 +00:00
Eric Engestrom	44c420aa1b	travis: fix meson build by letting `auto` do its job Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-08 15:17:42 +00:00
Eric Engestrom	9cf85d3b78	autotools: don't build libGLES*.so with GLVND GLVND already provides these, so distro packagers have been deleting them all along. Let's save ourselves the trouble and not build them in the first place. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-08 15:13:36 +00:00
Eric Engestrom	b01524fff0	meson: don't build libGLES*.so with GLVND GLVND already provides these, so distro packagers have been deleting them all along. Let's save ourselves the trouble and not build them in the first place. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-08 15:13:36 +00:00
Brian Paul	2c387819f4	pipebuffer: s/PB_ALL_USAGE_FLAGS/PB_USAGE_ALL/ To fix build failure. I guess my meson configuration has assertions disabled for some reason. Trivial fix.	2019-03-08 08:07:24 -07:00
Brian Paul	d4381cf593	svga: remove SVGA_RELOC_READ flag in SVGA3D_BindGBSurface() This fixes a rendering issue where UBO updates aren't always picked up by drawing calls. This issue affected the Webots robotics simulator. VMware bug 2175527. Testing Done: Webots replay, piglit, misc Linux games Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2019-03-08 07:40:35 -07:00
Brian Paul	07e8a31e49	svga: refactor draw_vgpu10() function The draw_vgpu10() function was huge. Move the code for preparing the vertex buffers and the index buffer into separate functions. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-08 07:40:35 -07:00
Brian Paul	53acd4c688	st/mesa: whitespace, formatting fixes in st_cb_flush.c Trivial.	2019-03-08 07:40:35 -07:00
Brian Paul	67d96816ff	st/mesa: move, clean-up shader variant key decls/inits Move the variant key declarations inside the scope they're used. Use designated initializers instead of memset() calls. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-08 07:40:35 -07:00
Brian Paul	76a10fc89e	winsys/svga: use new pb_usage_flags enum type And add a comment that we're implicitly converting PIPE_TRANSFER_ flags to PB_USAGE_ flags in one place. And statically assert that the enum values match. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2019-03-08 07:40:35 -07:00
Brian Paul	b5f2b0d6b6	pipebuffer: whitespace fixes in pb_buffer.h Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2019-03-08 07:40:35 -07:00
Brian Paul	b286e74df6	pipebuffer: use new pb_usage_flags enum type Use a new enum type instead of 'unsigned' to make things a bit more understandable. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2019-03-08 07:40:35 -07:00
Charmaine Lee	daf567f797	svga: add svga shader type in the shader variant With this patch, the svga shader type will be saved in the shader variant, and there is no need to pass in the shader type to the define/destroy variant functions. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-08 07:40:34 -07:00
Brian Paul	ac6b33a50d	gallium/util: add some const qualifiers in u_bitmask.c And add/update comments. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-03-08 07:40:34 -07:00
Brian Paul	b5a3a90c0c	gallium/util: whitespace cleanups in u_bitmask.[ch] Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-03-08 07:40:34 -07:00
Alejandro Piñeiro	686b7b1d48	nir/linker: fix ARRAY_SIZE query with xfb varyings For a non-array varying, it is expecting ARRAY_SIZE as 1, instead of 0. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Antia Puentes	de31fb2f4f	nir/linker: Fix TRANSFORM_FEEDBACK_BUFFER_INDEX From the ARB_enhanced_layouts specification: "For the property TRANSFORM_FEEDBACK_BUFFER_INDEX, a single integer identifying the index of the active transform feedback buffer associated with an active variable is written to <params>. For variables corresponding to the special names "gl_NextBuffer", "gl_SkipComponents1", "gl_SkipComponents2", "gl_SkipComponents3", and "gl_SkipComponents4", -1 is written to <params>." We were storing the xfb_buffer value, instead of the value corresponding to GL_TRANSFORM_FEEDBACK_BUFFER_INDEX. Note that the implementation assumes that varyings would be sorted by offset and buffer. Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Alejandro Piñeiro	7c0f411c27	nir/linker: use nir_gather_xfb_info Instead of a custom ARB_gl_spirv xfb gather info pass. In fact, this is not only about reusing code, but the current custom code was not handling properly how many varyings are enumerated from some complex types. So this change is also about fixing some corner cases. v2: Use util_bitcount, simplify current stage check (Kenneth) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Alejandro Piñeiro	b2a212ac2e	nir/xfb: handle arrays and AoA of basic types On OpenGL, an array of a simple type adds just one varying. So gl_transform_feedback_varying_info struct defined at mtypes.h includes the parameters Type (base_type) and Size (number of elements). This commit checks this when the recursive add_var_xfb_outputs call handles arrays, to ensure that just one is added. We also need to take into account AoA here. v2: use glsl_type_is_leaf from nir_types (Timothy Arceri) v3: simplified aoa check, without the need to use glsl_type_is_leaf, using glsl_type_is_struct (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Alejandro Piñeiro	2b65fecd85	nir_types: add glsl_type_is_struct helper Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Alejandro Piñeiro	8d693746e9	nir/xfb: sort varyings too Right now we are only re-sorting outputs. But it is better to sort varyings too, as the linker expects them to be sorted (as was done on GLSL). For varyings, and to make it easier to compute buffer_index, we also sort by buffer. We could do the same for outputs, but we lack a reason for that, so we leave it as it is (just offset). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Alejandro Piñeiro	cf0b2ad486	nir/xfb: adding varyings on nir_xfb_info and gather_info In order to be used for OpenGL (right now for ARB_gl_spirv). This commit adds two new structures: * nir_xfb_varying_info: that identifies each individual varying. For each one, we need to know the type, buffer and xfb_offset * nir_xfb_buffer_info: as now for each buffer, in addition to the stride, we need to know how many varyings are assigned to it. For this patch, the only case where num_outputs != num_varyings is with the case of doubles, that for dvec3/4 could require more than one output. There are more cases though (like aoa), that will be handled on following patches. v2: updated after new nir general XFB support introduced for "anv: Add support for VK_EXT_transform_feedback" v3: compute num_varyings beforehand for allocating, instead of relying on num_outputs as approximate value (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Alejandro Piñeiro	9f68b9ac71	nir_types: add glsl_varying_count helper Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Alejandro Piñeiro	b62a8149ab	nir/xfb: add component_offset at nir_xfb_info Where component_offset here is the offset when accessing components of a packed variable. Or in other words, location_frac on nir.h. Different places of mesa use different names for it. Technically a nir_xfb_info consumer can get the same from the component_mask, but it seems somewhat forced to make it compute it, instead of providing it. v2: rename local location_frac for comp_offset, more similar to the intended use (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Samuel Pitoiset	e72daf3e70	Revert "radv: execute external subpass barriers after ending subpasses" This change is actually wrong because we have to sync before doing image layout transitions. This fixes rendering issues in Batman, Path of Exile and probably more titles. This reverts commit `76c17cfd8d`. Fixes: `76c17cfd8d` ("radv: execute external subpass barriers after ending subpasses") Cc: 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-08 14:59:26 +01:00
Lionel Landwerlin	7271808df8	intel/error2aub: support older style engine names Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	a036eac029	intel/error2aub: deal with GuC log buffer When GuC is enabled, the error state will contain a "global" buffer for the GuC log buffer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	c619ea945d	intel/error2aub: add a verbose option Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	ca0161f890	intel/error2aub: write GGTT buffers into the aub file Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	9b5dc2124f	intel/error2aub: store engine last ring buffer head/tail pointers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	cdab19fa57	intel/error2aub: annotate buffer with their address space Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	630a72827a	intel/error2aub: parse other buffer types We don't write them in the aub file yet. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	c0ea043888	intel/error2aub: strengthen batchbuffer identifier marker Found out that some base64 data matched the '---' identifier. We can avoid this by adding the surrounding spaces. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	650e6e5d33	intel/error2aub: identify buffers by engine Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	a07f5262f0	intel/error2aub: build a list of BOs before writing them The error state contains several kinds of BOs, including the context image which we will want to write in a later commit. Because it can come later in the error state than the user buffers and because we need to write it first in the aub file, we have to first build a list of BOs and then write them in the appropriate order. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Chris Wilson	04ddff1aa4	iris: Wire up EGL_IMG_context_priority Add the missing PIPE_CAP_CONTEXT_PRIORITY_MASK and parsing of the context construction flags. Testcase: piglit/egl-context-priority Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-07 20:27:10 -08:00
Kenneth Graunke	2993088500	iris: Export a copy_region helper that doesn't flush I'll want to use this for transfer maps, which already do their own flushing. This lets us avoid a double flush, and also gives us more control over the batch which is selected.	2019-03-07 17:08:19 -08:00
Kenneth Graunke	335726fdac	iris: Spruce up "are we using this engine?" checks for flushing We were using batch->contains_draw as a proxy for "are we even using this engine?" That isn't quite right, because it only counts regular draws. BLORP operations may have also rendered to a resource, which needs to trigger flushing. To check for this, we also see if the render and sometimes depth caches are non-empty. We can also drop the "but there might already be stale data in the cache even if we haven't emitted any commands yet" concern in the comments. The kernel flushes caches between batches. This may not be great but it's at least better than what was there.	2019-03-07 17:08:07 -08:00
Timur Kristóf	b0c214ccee	radeonsi/nir: Only set window_space_position for vertex shaders. By mistake, this was previously set for all shaders. It is a vertex shader property so only makes sense to set it for vertex shaders. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-By: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-03-08 00:39:45 +00:00
Jason Ekstrand	1664de5924	nir/builder: Add a build_deref_array_imm helper Unlike most of the cases in which we do this by hand, the new helper properly handles non-32-bit pointers. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-07 21:20:30 +00:00
Jason Ekstrand	fcf2a0122e	nir/builder: Cast array indices in build_deref_follower There's no guarantee when build_deref_follower is called that the two derefs have the same bit size destination. Insert a cast on the array index in case we have differing bit sizes. While we're here, insert some asserts in build_deref_array and build_deref_ptr_as_array. The validator will catch violations here but they're easier to debug if we catch them while building. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-07 21:20:30 +00:00
Jason Ekstrand	cd4c1458ba	nir/builder: Emit better code for iadd/imul_imm Because we already know the immediate right-hand parameter, we can potentially save the optimizer a bit of work. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-07 21:20:30 +00:00
Rob Clark	ebbb6b8eaa	freedreno/a6xx: perfcntrs Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-07 15:33:42 -05:00
Rob Clark	40d8ed5ef3	freedreno/a6xx: fix border-color swizzles Fixes nearly all of the remaining dEQP-GLES31.functional.texture.border_clamp.formats.* fails Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-07 15:33:42 -05:00
Rob Clark	f5d80ff2db	freedreno/a6xx: refactor fd6_tex_swiz() We need a version of fd6_tex_swiz() that just returns the composed swizzle without building part of the TEX_CONST_0 state. So just refactor the existing function to build more of the TEX_CONST_0 state, and leave fd6_tex_swiz() simply composing swizzles. The small IBO state change (to use LINEAR for smaller sizes/levels) is to match the state in fd6_tex_const_0(). It seems like maybe tiled actually works at the smaller sizes but not if minification is in play, so best just to make images match what we do for textures. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-07 15:33:42 -05:00
Rob Clark	8dc47490c8	freedreno/a6xx: remove astc_srgb workaround Not used on a6xx, so remove some of the related plumbing that was copied over from older gens. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-07 15:33:42 -05:00
Rob Clark	45271702ec	freedreno: fix ir3_cmdline build Fixes: `7530d4abfc` glsl/freedreno/panfrost: pass gl_context to the standalone compiler Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-07 15:33:20 -05:00
Kenneth Graunke	d53b1b6215	iris: Drop PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY This cap is mainly for working around a r600 texture swizzle issue, but it also controls whether ARB_texture_buffer_object (with legacy formats) is enabled. I suspect the missing I/L/A/LA faking is why I had it set in the first place. Thanks to Ilia for pointing out that I shouldn't be setting this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Kenneth Graunke	809a81ec3a	iris: Properly support alpha and luminance-alpha formats For texturing, we map alpha formats to the corresponding red format, as many alpha formats are outright missing, and red is more efficient when sampling anyway. When rendering to A8_UNORM, we use that format directly, so the image gets the shader output's .a/.w channel, rather than the .r/.x channel. All other A* formats are non-renderable, so we can't do much and just mark them as unsupported for rendering. Fortunately, GL only requires rendering to A8_UNORM, so that works out. According to Andre Heider and Timur Kristóf, this fixes font rendering in Witcher 1 (via nine). Andre also reported that it fixes Unigine Heaven (presumably via nine). v2: Use the same swizzle for both sampler views and "render targets". BLORP expects the read swizzle, and will take the inverse when setting up the destination swizzle (and actually applying it in the shaders). We ignore the format swizzle when setting up normal rendering SURFACE_STATEs, which is necessary because it would be an illegal shader channel select combination. Thanks to Jason Ekstrand for pointing out that BLORP took an inverse swizzle. Tested-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Kenneth Graunke	fbc51c4c95	iris: Defer uploading sampler state tables until draw time Gallium might call us multiple times to bind subsets of the samplers, at which point we'd recreate the table a bunch of times. It doesn't really buy us anything to do it here - even if we defer to draw time, the dirty tracking ensures we'll only do it on the first draw after a bind_sampler_states() call. We now use the number of samplers specified by the shader instead of the binding count. If this number changes, we flag sampler state as dirty so we re-upload a table with the right number of entries. This also fixes a bug where ice->state.need_border_colors was never unset, so once something needed border colors, the pool would always be pinned in all future batches. v2: Explicitly flag sampler states as dirty, rather than assuming that bind_sampler_states() will be called if the program texture count changes. While this may be true for st/mesa, it isn't the case for Gallium HUD. Tested-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Kenneth Graunke	9caabd6c5f	iris: Plumb through ISL_SWIZZLE_IDENTITY in buffer surface emitters Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Kenneth Graunke	4787bc944a	isl: Add a swizzle parameter to isl_buffer_fill_state() This is necessary for legacy texture buffer object formats, where we'll need to use a swizzle to fake e.g. luminance. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Lionel Landwerlin	575f8e8b60	iris: fix decode_get_bo callback Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `acb50d6b1f` ("intel/decoders: handle decoding MI_BBS from ring") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-07 17:39:07 +00:00
Erik Faye-Lund	55e4759c8d	virgl: remove unused variable This variable is now unused, so let's remove it. Fixes: `9c4930946a` (virgl: add encoder functions for new protocol) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-03-07 17:24:54 +00:00
Erik Faye-Lund	44620d4ef7	virgl: remove unused variable This variable is now unused, so let's remove it. Fixes: `db77573d7b` (virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-03-07 17:24:54 +00:00
Erik Faye-Lund	524934586b	virgl: remove unused variable This variable is now unused, so let's remove it. Fixes: `c19aedcf1a` (virgl: don't mark unclean after a flush) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-03-07 17:24:54 +00:00
Erik Faye-Lund	af29c93f22	virgl: remove unused variables These variables are now unused, let's remove them to get rid of a few warnings. Fixes: `f0e71b1088` (virgl: use transfer queue) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-03-07 17:24:54 +00:00
Lionel Landwerlin	0e269c0ac2	iris: fix decoder call Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `acb50d6b1f` ("intel/decoders: handle decoding MI_BBS from ring")	2019-03-07 16:15:03 +00:00
Lionel Landwerlin	0b3871bc7f	intel/aub_write: factorize context image/pphwsp/ring creation We allocate GGTT entries and physical addresses as we create engines rather than having a fixed layout. Context images now receive a parameter argument which is used to set up pml4 & ring buffer addresses. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	c1a2c72e76	intel/aub_write: turn context images arrays into functions We'll make them more parameterized in a later commit. As this is just a transitional commit, we allow ourselves to leak the context images allocated in get_context_init(). We'll fix this in the next commit. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	8e14c9b7db	intel/aub_write: store the physical page allocator in struct We want to use this allocator in the next commit for GGTT pages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	0343a3b42b	intel/aub_write: log mmio writes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	6ef46972d9	intel/aub_write: switch to use i915_drm engine classes Prepare aub write to deal with multiple engine instances. We don't pass the instance number yet this could be done in the future by having a 2 dimensional array of struct engine. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	8a81f5c255	intel/aub_write: break execlist write in 2 We want to reuse the execlist submission, but won't need the ring buffer update. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	69ee5bde4e	intel/aub_write: write header in init Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	01443f34b4	intel/aub_write: split comment section from HW setup In the future we'll want error2aub to reuse the context image saved by i915 instead of the default one we write in intel_dump_gpu. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	2b42adff14	intel/aub_read: reuse defines from gen_context Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	bf93084f44	intel/decoders: limit number of decoded batchbuffers IGT has a test to hang the GPU that works by having a batch buffer jump back into itself, triggering an infinite loop on the command stream. As our implementation of the decoding is "perfectly" mimicking the hardware, our decoder also "hangs". This change limits the number of batch buffers we'll decode before we bail to 100. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	acb50d6b1f	intel/decoders: handle decoding MI_BBS from ring An MI_BATCH_BUFFER_START in the ring buffer acts as a second level batchbuffer (aka jump back to ring buffer when running into a MI_BATCH_BUFFER_END). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	ec526d6ba0	intel/decoders: add address space indicator to get BOs Some commands like MI_BATCH_BUFFER_START have this indicator. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Eric Engestrom	3e8d5b5ed4	vulkan/overlay: fix missing var rename in previous commit Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-07 13:45:14 +00:00
Eric Engestrom	d141472d0e	vulkan/util: use the platform defines in vk.xml instead of hard-coding them See also: `3d4238d26c` "anv: use the platform defines in vk.xml instead of hard-coding them" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-07 11:49:44 +00:00
Andre Heider	a4324dcefb	iris: add support for tgsi_to_nir The Gallium Nine state tracker now works on iris. Also tested with GALLIUM_HUD and Star Wars: Knights of the Old Republic on WINE (GL_ATI_fragment_shader). Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-07 00:38:13 -08:00
Tapani Pälli	8b010f3557	nir: free dead_ctx in case of no progress Fixes a leak: ==7576== 320 (48 direct, 272 indirect) bytes in 1 blocks are definitely lost in loss record 26 of 26 ==7576== at 0x4C2EE3B: malloc (vg_replace_malloc.c:309) ==7576== by 0x53EF0E4: ralloc_size (ralloc.c:119) ==7576== by 0x53EF0C2: ralloc_context (ralloc.c:113) ==7576== by 0x5471F64: nir_split_per_member_structs (nir_split_per_member_structs.c:176) ==7576== by 0x51288CF: anv_shader_compile_to_nir (anv_pipeline.c:216) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-07 07:40:19 +02:00
Tapani Pälli	4900c0cff4	anv: call blob_finish when done with it Fixes leaks from anv_device_upload_nir: ==7345== 8,192 bytes in 2 blocks are definitely lost in loss record 24 of 24 ==7345== at 0x4C2ED78: malloc (vg_replace_malloc.c:308) ==7345== by 0x4C31393: realloc (vg_replace_malloc.c:836) ==7345== by 0x54E0848: grow_to_fit (blob.c:67) ==7345== by 0x54E0BE5: blob_reserve_bytes (blob.c:166) ==7345== by 0x54E0C7C: blob_reserve_intptr (blob.c:186) ==7345== by 0x54704A7: nir_serialize (nir_serialize.c:1091) ==7345== by 0x512F97D: anv_device_upload_nir (anv_pipeline_cache.c:756) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-07 07:39:48 +02:00
Tapani Pälli	a9555f37d5	anv: use anv_gem_munmap in block pool cleanup Use anv_gem_munmap for unmap when softpin is in use, this corresponds to anv_gem_mmap used in anv_block_pool_expand_range. This fixes valgrind errors seen for each pool when softpin is in use: ==25581== 262,144 bytes in 1 blocks are definitely lost in loss record 31 of 31 ==25581== at 0x50E77E8: anv_gem_mmap (anv_gem.c:96) ==25581== by 0x50EEE2B: anv_block_pool_expand_range (anv_allocator.c:543) ==25581== by 0x50EEB51: anv_block_pool_init (anv_allocator.c:477) ==25581== by 0x50EF7EF: anv_state_pool_init (anv_allocator.c:920) ==25581== by 0x510B8EB: anv_CreateDevice (anv_device.c:2031) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-07 07:36:28 +02:00
Kenneth Graunke	744b8e1c12	iris: Fix MOCS for blits and clears I915_MOCS_CACHED is the wrong value. Expose mocs() and use that.	2019-03-06 18:04:53 -08:00
Timothy Arceri	ecceb076e5	st/glsl: start spilling out common st glsl conversion code The NIR and TGSI paths are currently intertwined which makes it not only hard to follow but also makes it hard to take advantage of the differences in IR. Here we take the first step to splitting that path apart. With this we take the opportunity to no longer call the GLSL IR optimisation passes after the final lowering calls for NIR. We can instead just use the NIR passes which can produce better code and should also result in faster compile times. The speed-up can be measured in some dolphin uber shaders due to no longer calling lower_if_to_cond_assign(), for example dolphin/ubershaders/120.shader_test goes from ~1.63 -> ~1.53 seconds on my machine. There are some code changes as a result of not calling lower_if_to_cond_assign(), this is because it flattens ifs that contain UBOs whereas NIR's peephole select doesn't. This is where most of the regressions in Max Waves happen with shader-db. shader-db results (VEGA): Totals from affected shaders: SGPRS: 2349056 -> 2349640 (0.02 %) VGPRS: 1322160 -> 1323300 (0.09 %) Spilled SGPRs: 21190 -> 21527 (1.59 %) Spilled VGPRs: 99 -> 99 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 72 -> 72 (0.00 %) dwords per thread Code Size: 57260904 -> 57270932 (0.02 %) bytes Compile Time: 1107186 -> 1022942 (-7.61 %) milliseconds LDS: 786 -> 786 (0.00 %) blocks Max Waves: 391932 -> 391619 (-0.08 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-06 23:05:20 +00:00
Timothy Arceri	e2fd96a563	radeonsi/nir: stop calling nir_lower_returns() We now call this for all drivers in glsl_to_nir() instead. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-06 23:05:20 +00:00
Timothy Arceri	673f4f69a8	i965: stop calling nir_lower_returns() We now call this for all drivers in glsl_to_nir() instead. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-06 23:05:20 +00:00
Timothy Arceri	7e60d5a501	glsl: use NIR function inlining for drivers that use glsl_to_nir() glsl_to_nir() is still missing support for converting certain functions to NIR, so for those we use the GLSL IR optimisations to remove the functions. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-06 23:05:20 +00:00
Timothy Arceri	7530d4abfc	glsl/freedreno/panfrost: pass gl_context to the standalone compiler This allows us to use the ctx with glsl_to_nir() in a following patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-06 23:05:20 +00:00
Lionel Landwerlin	15b83b3af9	vulkan/overlay: drop dependency on validation layer headers v2: reimplement layer chain info getters (Eric) v3: make it compile.. (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-06 22:46:37 +00:00
Lionel Landwerlin	530927d3f6	vulkan/util: generate instance/device dispatch tables This will be used by the overlay instead of system installed validation layers helpers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-06 22:46:37 +00:00
Lionel Landwerlin	ee491a4987	vulkan/util: make header available from c++ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-06 22:46:37 +00:00
Jose Maria Casanova Crespo	ffa9082c40	iris: setup EdgeFlag Vertex Element when needed. If the Vertex Shader uses EdgeFlag, the hardware requests that it is set up as the last VERTEX_ELEMENT_STATE. If SGVS are added at draw time we also need to reconfigure the last 3DSTATE_VF_INSTANCING so its VertexElementIndex points to the new Vertex Element that contains the EdgeFlag. So if draw parameters or edgeflag are not used, the CSO generated at iris_create_vertex_element is sent directly in the batches. But if edge flag is used, we adjust the last VERTEX_ELEMENT_STATE and last 3DSTATE_VF_INSTANCING using the alternative edge flag versions we generate at iris_create_vertex_element and store in the CSO. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 22:19:08 +00:00
Eric Anholt	c4d2da1f14	v3d: Include a count of register pressure in the RA failure dumps. You usually want to go find the highest pressure and figure out why you couldn't spill or what pattern led to a bunch of pressure leading to that point.	2019-03-06 14:13:45 -08:00
Samuel Pitoiset	71ffa00fc6	radv: enable lower_mul_2x32_64 Fixes: `58bcebd987` ("spirv: Allow [i/u]mulExtended to use new nir opcode") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-06 22:41:20 +01:00
Jason Ekstrand	9ab1b1d022	st/nir: Move 64-bit lowering later Now that we have a loop unrolling cost function and loop unrolling isn't going to kill us the moment we have a 64-bit op in a loop, we can go ahead and move 64-bit lowering later. This gives us the opportunity to do more optimizations and actually let the full optimizer run even on 64-bit ops rather than hoping one round of opt_algebraic will fix everything. This substantially reduces both fp64 shader compile times and the resulting code size. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	656ace3dd8	intel/nir: Move 64-bit lowering later Now that we have a loop unrolling cost function and loop unrolling isn't going to kill us the moment we have a 64-bit op in a loop, we can go ahead and move 64-bit lowering later. This gives us the opportunity to do more optimizations and actually let the full optimizer run even on 64-bit ops rather than hoping one round of opt_algebraic will fix everything. This substantially reduces both fp64 shader compile times and the resulting code size. On the vs-isnan-dvec test from piglit: Before this commit: 1684.63s user 17.29s system 99% cpu 28:28.24 total 101479 instructions. 0 loops. 802452 cycles. 79:369 spills:fills. Peak memory usage (according to massif): 1.435 GB After this commit: 179.64s user 7.75s system 99% cpu 3:07.92 total 57316 instructions. 0 loops. 459287 cycles. 0:0 spills:fills. Peak memory usage (according to massif): 531.0 MB Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	e02959f442	nir/lower_doubles: Inline functions directly in lower_doubles Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ourselves. This means that there's no more nasty functions lying around that the caller needs to worry about cleaning up. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	f25ca337b4	nir/deref: Expose nir_opt_deref_impl Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	de8d80f9cc	nir/inline_functions: Break inlining into a builder helper This pulls the guts of function inlining into a builder helper so that it can be used elsewhere. The rest of the infrastructure is still needed for most inlining cases to ensure that everything gets inlined and only ever once. However, there are use-cases where you just want to inline one little thing. This new helper also has a neat trick where it can seamlessly inline a function from one nir_shader into another. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	0a6b1d0580	glsl/nir: Inline functions in float64_funcs_to_nir This doesn't really change anything as the functions will all get inlined anyway. However it does let us do a bit of the work earlier and in a common place. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	82d9a37a59	glsl/nir: Add a shared helper for building float64 shaders Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	8993e0973f	intel/nir: Drop an unneeded lower_constant_initializers call Even though this is technically a step in the function inlining process as laid out in nir_inline_functions.c, it's not really needed. We already have constant initializers lowered here and no new ones are added by appending the softfp64 functions. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	fa4824c1db	intel/debug: Add a debug flag to force software fp64 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	0ce1aea88b	i965: Compile the fp64 program based on nir options Instead of looking at the devinfo directly, look at the lowering options we provided to NIR. This is more accurate as it's now checking for "do we need full software lowering" rather than a hardware bit. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	9314084237	nir: Teach loop unrolling about 64-bit instruction lowering The lowering we do for 64-bit instructions can cause a single NIR ALU instruction to blow up into hundreds or thousands of instructions potentially with control flow. If loop unrolling isn't aware of this, it can unroll a loop 20 times which contains a nir_op_fsqrt which we then lower to a full software implementation based on integer math. Those 20 invocations suddenly get a lot more expensive than NIR loop unrolling currently expects. By giving it an approximate estimate function, we can prevent loop unrolling from going to town when it shouldn't. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Jason Ekstrand	ebb3695376	nir: Expose double and int64 op_to_options_mask helpers We already have one internally for int64 but we don't have a similar one for doubles so we'll have to make one. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Iago Toral Quiroga	ca2b5e9069	compiler/nir: add an is_conversion field to nir_op_info This is set to True only for numeric conversion opcodes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 17:24:57 +00:00
Ian Romanick	55e6454d5e	intel/fs: Fix extract_u8 of an odd byte from a 64-bit integer In the old code, we would generate the exact same instruction for extract_u8(some_u64, 0) and extract_u8(some_u64, 1). The mask-a-word trick only works for even numbered bytes. This fixes the (new) piglit test tests/spec/arb_gpu_shader_int64/execution/fs-ushr-and-mask.shader_test. v2: Use a SHR instead of an AND. This saves an instruction compared to using two moves. Suggested by Jason. Fixes: `6ac2d16901` ("i965/fs: Fix extract_i8/u8 to a 64-bit destination") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:45 -08:00
Ian Romanick	4aaf139ea4	intel/fs: nir_op_extract_i8 extracts a byte, not a word Fixes: `6ac2d16901` ("i965/fs: Fix extract_i8/u8 to a 64-bit destination") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:42 -08:00
Ian Romanick	bbf20a1ca3	intel/compiler: Silence unused parameter warning in brw_interpolation_map.c The parameter is never used, and it's not part of a common interface idiom. Remove it. src/intel/compiler/brw_interpolation_map.c: In function ‘brw_setup_vue_interpolation’: src/intel/compiler/brw_interpolation_map.c:62:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo) ^~~~~~~ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:36 -08:00
Ian Romanick	dea19138dd	intel/compiler: Silence many unused parameter warnings in brw_eu.h In file included from src/intel/compiler/brw_eu_util.c:34:0: src/intel/compiler/brw_eu.h: In function ‘brw_message_desc_header_present’: src/intel/compiler/brw_eu.h:288:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_desc_header_present(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc’: src/intel/compiler/brw_eu.h:296:51: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc_ex_mlen’: src/intel/compiler/brw_eu.h:303:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc_ex_mlen(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_binding_table_index’: src/intel/compiler/brw_eu.h:337:68: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_binding_table_index(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_sampler’: src/intel/compiler/brw_eu.h:344:56: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_sampler(const struct gen_device_info *devinfo, uint32_t desc) ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_return_format’: src/intel/compiler/brw_eu.h:371:62: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_return_format(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_desc_binding_table_index’: src/intel/compiler/brw_eu.h:405:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_dp_desc_binding_table_index(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_desc’: src/intel/compiler/brw_eu.h:754:41: warning: unused parameter ‘exec_size’ [-Wunused-parameter] unsigned exec_size, /**< 0 for SIMD4x2 */ ^~~~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_float_desc’: src/intel/compiler/brw_eu.h:775:47: warning: unused parameter ‘exec_size’ [-Wunused-parameter] unsigned exec_size, ^~~~~~~~~ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-06 08:35:31 -08:00
Eric Engestrom	89241eeafc	meson: remove unused include_directories(vulkan) The correct include path is "vulkan/…". Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-03-06 12:46:11 +00:00
Eric Engestrom	ad862c36e5	meson: fix with_dri2 definition for GNU Hurd Suggested-by: Dylan Baker <dylan@pnwbakers.com> Cc: Timo Aaltonen <tjaalton@debian.org> Cc: James Clarke <jrtc27@debian.org> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-03-06 12:40:06 +00:00
Lionel Landwerlin	b49726afd4	radv: set num_components on vulkan_resource_index intrinsic In `61e009d2c4` we changed the number of components in the vulkan_resource_index intrinsic and forgot to update Radv's code for it. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `61e009d2c4` ("spirv: Use the same types for resource indices as pointers") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-06 11:56:21 +00:00
Timothy Arceri	54522d0506	nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc() Replace done using: find ./src -type f -exec sed -i -- \ 's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \; Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Timothy Arceri	e16a27fcf8	glsl: rename record_types -> struct_types Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Timothy Arceri	8294295dbd	glsl: rename record_location_offset() -> struct_location_offset() Replace done using: find ./src -type f -exec sed -i -- \ 's/record_location_offset(/struct_location_offset(/g' {} \; Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Timothy Arceri	88d8c4e290	glsl: rename get_record_instance() -> get_struct_instance() Replace done using: find ./src -type f -exec sed -i -- \ 's/get_record_instance(/get_struct_instance(/g' {} \; Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Timothy Arceri	81ee2cd8ba	glsl: rename is_record() -> is_struct() Replace was done using: find ./src -type f -exec sed -i -- \ 's/is_record(/is_struct(/g' {} \; Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Karol Herbst	272e927d0e	nir/spirv: initial handling of OpenCL.std extension opcodes Not complete, mostly just adding things as I encounter them in CTS. But not getting far enough yet to hit most of the OpenCL.std instructions. Anyway, this is better than nothing and covers the most common builtins. v2: add hadd proof from Jason move some of the lowering into opt_algebraic and create new nir opcodes simplify nextafter lowering fix normalize lowering for inf rework upsample to use nir_pack_bits add missing files to build systems v3: split lines of iadd/sub_sat expressions Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 22:28:29 +01:00
Karol Herbst	d0b47ec4df	nir/vtn: add support for SpvBuiltInGlobalLinearId v2: use formula with fewer operations Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-05 22:28:29 +01:00
Karol Herbst	f48c672965	nir: add support for address bit sized system values v2: add assert in else clause make local group intrinsics 32 bit wide v3: always use 32 bit constant for local_size v4: add comment by Jason Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 22:28:29 +01:00
Karol Herbst	5f8257fb0b	nir/spirv: improve parsing of the memory model v2: add some vtn_fail_ifs Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 22:28:29 +01:00
Karol Herbst	5d48359a2c	nir: replace magic numbers with M_PI we define it inside 'include/c99_math.h' so it is safe to use. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 22:28:29 +01:00
Caio Marcelo de Oliveira Filho	69cc6272fb	anv: Implement VK_EXT_external_memory_host v2: Ignore the import if handleType == 0. (Jason) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 12:59:50 -08:00
Eric Anholt	5c655c47db	v3d: Drop the V3D 3.x vpm read dead code elimination. We now have NIR dead code eliminating our VPM reads, so this shouldn't be necessary.	2019-03-05 12:57:39 -08:00
Eric Anholt	e8ee1f8eaf	v3d: Eliminate the TLB and TLBU files. We can just use the magic register file like we do for other magic waddrs.	2019-03-05 12:57:39 -08:00
Eric Anholt	110f14d4b4	v3d: Use ldunif instructions for uniforms. The idea is that for repeated use of the same uniform, we could avoid loading it on each consumer. The results look pretty good. total instructions in shared programs: 6413571 -> 6521464 (1.68%) total threads in shared programs: 154214 -> 154000 (-0.14%) total uniforms in shared programs: 2393604 -> 2119629 (-11.45%) total spills in shared programs: 4960 -> 4984 (0.48%) total fills in shared programs: 6350 -> 6418 (1.07%) Once we do scheduling at the NIR level, the register pressure (and thus also instructions) issues we see here will drop back down.	2019-03-05 12:57:39 -08:00
Eric Anholt	4036fce8fd	v3d: Add support for register-allocating a ldunif to a QFILE_TEMP. On V3D 4.x, we can use ldunifrf to load uniforms to any register, and this will let us schedule the ldunif wherever we want in the program.	2019-03-05 12:57:39 -08:00
Eric Anholt	70df388219	v3d: Drop the old class bits splitting up the accumulators. This seems to be left over from vc4, and I don't use them any more.	2019-03-05 12:57:39 -08:00
Eric Anholt	dff1fc04e0	v3d: Add support for vir-to-qpu of ldunif instructions to a temp. We can load a uniform to any register, so add support for non-ALU instructions with sig.ldunif to a temp.	2019-03-05 12:57:39 -08:00
Eric Anholt	4739181a16	v3d: Switch implicit uniforms over to being any qinst->uniform != ~0. I'm not sure why I didn't do this before -- it's clearly much simpler to add dumping of the extra thing than to have it as another implicit source.	2019-03-05 12:57:39 -08:00
Eric Anholt	1e98f02d88	v3d: Do uniform rematerialization spilling before dropping threadcount This feels like the right tradeoff for threads vs uniforms, particularly given that we often have very short thread segments right now: total instructions in shared programs: 6411504 -> 6413571 (0.03%) total threads in shared programs: 153946 -> 154214 (0.17%) total uniforms in shared programs: 2387665 -> 2393604 (0.25%)	2019-03-05 12:57:39 -08:00
Eric Anholt	060979a380	v3d: Fix temporary leaks of temp_registers and when spilling. On each iteration of successfully spilling a reg, we'd allocate another copy of temp_registers, and when decrementing thread count we'd allocate another copy of the graph. These all got cleaned up on freeing the compile.	2019-03-05 12:57:39 -08:00
Eric Engestrom	faf9e40f35	gitlab-ci: drop job prefixes It is already obvious whether the job is building a container or running a mesa build, so let's drop that prefix so that we can see more information on the screen (eg. in the jobs list on a pipeline page). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-03-05 20:49:42 +00:00
Timur Kristóf	45809bcb33	tgsi_to_nir: Set correct location for uniforms. Previously, only the driver_location was set for all variables, but constants need to use the location field instead. This change is necessary because the nine state tracker can produce non-packed constants whose location needs to be explicitly set. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-05 19:13:27 +00:00
Timur Kristóf	770faf546d	tgsi_to_nir: Improve interpolation modes. This patch extracts the interpolation mode translation into a separate function called ttn_translate_interp_mode, adds support for TGSI_INTERPOLATE_COLOR which was missing, and also sets the proper interpolation mode to output variables, which were not set previously. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Kenneth Graunke	2fb800fd1d	tgsi_to_nir: use sampler variables and derefs v2: fix is_shadow, is_array and txq Some drivers (eg. iris) need the presence of sampler variables and derefs so that they can count them to determine the number of samplers used. This change also makes the output NIR closer to what glsl_to_nir outputs. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	674045d04b	tgsi_to_nir: Support FACE and POSITION properly. Previously, FACE was hard-coded as a sysval, but TTN emulated it incorrectly. Also, POSITION was not supported when it was a sysval. This patch fixes these by allowing both of them to be sysvals or inputs, based on driver capabilities. It also fixes the TGSI FACE emulation based on the TGSI spec. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	f748fa47f8	tgsi_to_nir: Extract ttn_emulate_tgsi_front_face into its own function. We'll need to use the same logic in other places, so it makes sense to have a separate function for this. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	840c7d1ebd	tgsi_to_nir: Restructure system value loads. Minor cleanup to the way system value loads work in tgsi_to_nir. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	9a834447d6	tgsi_to_nir: Produce optimized NIR for a given pipe_screen. With this patch, tgsi_to_nir will output NIR that is tailored to the given pipe, by reading its capabilities and adjusting the NIR code to those capabilities similarly to how glsl_to_nir works. It also adds an optimization loop that brings the output NIR in line with what glsl_to_nir outputs. This is necessary for the same reason why glsl_to_nir has its own optimization loop: currently not every driver does these optimizations yet. For uses which cannot pass a pipe_screen we also keep a variant called tgsi_to_nir_noscreen which keeps the old behavior. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Acked-By: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	e582e761b7	freedreno: Plumb pipe_screen through to irX_tgsi_to_nir. This patch makes it possible for freedreno to pass a pipe_screen to tgsi_to_nir. This will be needed when tgsi_to_nir supports reading pipe capabilities. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-05 19:13:27 +00:00
Timur Kristóf	6684e039eb	nir: Add multiplier argument to nir_lower_uniforms_to_ubo. Note that locations can be set in different units, and the multiplier argument caters to supporting these different units. For example, st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4, while tgsi_to_nir uses vec4s (16 bytes), so the multiplier should be 16. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	909d1f50f3	nir: Move nir_lower_uniforms_to_ubo to compiler/nir. The nir_lower_uniforms_to_ubo function is useful outside of mesa/state_tracker, and in fact is needed to produce NIR for drivers that have the PIPE_CAP_PACKED_UNIFORMS capability. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	4dba72c4b3	tgsi_to_nir: Split to smaller functions. Previously, tgsi_to_nir was a single big function, and this patch intends to make the code easier to understand by splitting it up into multiple smaller pieces. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-By: Tested-by: Rob Clark <robdclark@gmail.com>	2019-03-05 19:13:27 +00:00
Timur Kristóf	950aebbc53	tgsi_to_nir: Make the TGSI IF translation code more readable. This patch is a minor cleanup that only intends to make the TGSI IF translation a bit easier to read. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	fa076acbc0	tgsi_to_nir: Fix TGSI LIT translation by using flt. TGSI spec says LIT needs a "greater than" comparison. NIR doesn't have that, so let's use "less than" and swap the arguments. Previously "greater than or equal" was used by tgsi_to_nir which is incorrect. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Timur Kristóf	28be7b33b9	tgsi_to_nir: Fix the TGSI ARR translation by converting the result to int. According to the TGSI spec, ARR needs to do a rounding and then a float-to-integer conversion which was missing. This patch also makes the rounding a bit more efficient by using nir_fround_even instead of the previous nir_ffloor+nir_fadd trick. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-05 19:13:27 +00:00
Timur Kristóf	317f10bf40	nir: Add ability for shaders to use window space coordinates. This patch adds a shader_info field that tells the driver to use window space coordinates for a given vertex shader. It also enables this feature in radeonsi (the only NIR-capable driver that supported it in TGSI), and makes tgsi_to_nir aware of it. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-05 19:13:27 +00:00
Eric Anholt	2780a99ff8	v3d: Move the stores for fixed function VS output reads into NIR. This lets us emit the VPM_WRITEs directly from nir_intrinsic_store_output() (useful once NIR scheduling is in place so that we can reduce register pressure), and lets future NIR scheduling schedule the math to generate them. Even in the meantime, it looks like this lets NIR DCE some more code and make better decisions. total instructions in shared programs: 6429246 -> 6412976 (-0.25%) total threads in shared programs: 153924 -> 153934 (<.01%) total loops in shared programs: 486 -> 483 (-0.62%) total uniforms in shared programs: 2385436 -> 2388195 (0.12%) Acked-by: Ian Romanick <ian.d.romanick@intel.com> (nir)	2019-03-05 10:59:40 -08:00
Eric Anholt	a9dd227a47	v3d: Translate f2i(fround_even) as FTOIN. This appears to be just what the opcode does. Needed for equivalence when moving FF VPM stores into NIR.	2019-03-05 10:59:40 -08:00
Eric Anholt	a4f612b4cf	nir: Improve printing of load_input/store_output variable names. We were printing only when the channel was exactly the start channel, so scalarized loads/stores would be missing the name on the rest. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-05 10:59:40 -08:00
Jason Ekstrand	43f40dc7cb	anv: Implement VK_EXT_inline_uniform_block Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	61e009d2c4	spirv: Use the same types for resource indices as pointers We need more space than just a 32-bit scalar and we have to burn all that space anyway so we may as well expose it to the driver. This also fixes a subtle bug when UBOs and SSBOs have different pointer types. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	9f7ee4f8e5	spirv: Use the generic dereference function for OpArrayLength With the new deref changes, the old pointer_offset version may not be the right one to call. Just call the generic one and let it sort it out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	f1dbc7e97d	spirv: Pull offset/stride from the pointer for OpArrayLength We can't pull it from the variable type because it might be an array of blocks and not just the one block. While we're here, throw in some error checking. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2019-03-05 10:06:50 -06:00
Jason Ekstrand	c520f4dec9	anv: Add a concept of a descriptor buffer This buffer goes alongside the CPU data structure and may contain pointers, bindless handles, or any other descriptor information. Currently, all descriptors are size zero and nothing goes in the buffer but this commit sets up the framework we will need later. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	5c30fffeec	anv: Take references to push descriptor set layouts Technically, descriptor set layouts aren't required to survive past the function they're passed into so we need to reference them. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	8ab95b849e	anv: Refactor descriptor pushing a bit Pull the common code out of the two entrypoints into the helper which fetches the push descriptor set for us. Now that it does more than just get a thing, call it anv_cmd_buffer_push_descriptor_set. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	cab064bc10	anv: drop add_var_binding from anv_nir_apply_pipeline_layout.c It has exactly one caller. Just inline it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	49cf61c6aa	anv: Clean up descriptor set layouts The descriptor set layout code in our driver has undergone many changes over the years. Some of the fields which were once essential are now useless or nearly so. The has_dynamic_offsets field was completely unused except for the code to set and hash it. The per-stage indices were only being used to determine if a particular binding had images, samplers, etc. The fact that it's per-stage also doesn't matter because that binding should never be accessed by a shader of the wrong stage. This commit deletes a pile of cruft and replaces it all with a descriptive bitfield which states what a particular descriptor contains. This merely describes the data available and doesn't necessarily dictate how it will be lowered in anv_nir_apply_pipeline_layout. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	4c50b7c92c	anv: Count image param entries rather than images This is what we're actually storing in the descriptor set and consuming when we bind surface states. This commit renames image_count to image_param_count a few places and moves the decision to not count image params on gen9+ into anv_descriptor_set.c when we build the layout. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	3822c7495a	anv: Stop allocating buffer views for dynamic buffers We emit the surface states for those on-the-fly so we don't need the buffer view. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	8c6d410a50	anv: Rework arguments to anv_descriptor_set_write_* Make them all take a device followed by a set. This is consistent with how the actual Vulkan entrypoint parameters are laid out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	5b7a9e7398	anv/descriptor_set: Refactor alloc/free of descriptor sets This commit just puts the free list code together as part of the pool instead of having it inlined into the descriptor set create code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Eric Anholt	fd1d22b92e	v3d: Stop treating exec masking specially. In our backend, the successor edges from the blocks only point to where QPU control flow goes, not where the notional control flow goes from a "break" or "continue" modifying the execution mask to resume writing to some channels later. As a result, this attempt at restricting live ranges ended up missing the live range of a value where a conditional break/continue was present in a loop before the later def of a variable. The previous commit ended up fixing the problem that the flag tried to solve. Fixes glsl-vs-loop-continue.shader_test and/or glsl-vs-loop-redundant-condition.shader_test based on register allocation results.	2019-03-05 07:36:24 -08:00
Eric Anholt	c6ae666cf5	v3d: Restrict live intervals to the blocks reachable from any def. In the backend, we often have condition codes on writes to variables, such that there's no screening def anywhere and the previous live ranges algorithm would conclude that the start of the range extends to the start of the program. However, we do know that the live range can only extend as early as you can reach from all blocks writing to the variable. The motivation was that, while we have a couple of hacks to try to promote conditional writes up to being a def within the block, the exec_mask one was broken and needed a replacement. Based on `c3c1aa5aeb` ("intel/fs: Restrict live intervals to the subset possibly reachable from any definition.").	2019-03-05 07:36:24 -08:00
Andres Gomez	cf79d62f90	gitlab-ci: install distro's ninja Ubuntu Bionic is shipping ninja 1.8.2. Therefore, we do not need to download v1.6.0 manually any more. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-05 14:05:24 +00:00
Samuel Pitoiset	c2a148692b	radv: properly align the fence and EOP bug VA on GFX9 If alignment is 0, offsets returned by radv_cmd_buffer_upload_alloc() are always 0. These two virtual addresses were pointing at the same location. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-05 15:00:20 +01:00
Samuel Pitoiset	2eb0905ffa	radv: allocate enough space in cmdbuf when starting a subpass This fixes some CTS crashes with: dEQP-VK.renderpass2.suballocation.attachment_write_mask.attachment_count_8.start_index_* Ideally, we should check cmd_buffer->cs->max_dw because there is likely enough space (the internal clear draws allocate space), but keep that way for consistency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-05 15:00:10 +01:00
Eric Engestrom	31d302ae51	vulkan: import vk_layer.h from Khronos Instead of relying on the system having it (and the right version). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 13:24:14 +00:00
Eric Engestrom	bcc4bfc8e8	egl: fix libdrm-less builds This function was never used, and isn't properly guarded by HAVE_LIBDRM, breaking the build on systems that don't have libdrm. Let's just remove it. Fixes: `7552fcb7b9` "egl: add base EGL_EXT_device_base implementation" Reported-by: Timo Aaltonen <tjaalton@debian.org> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-05 13:04:06 +00:00
Eric Engestrom	e37ea1e0d3	vulkan: import missing file from Khronos Fixes: `114c4aa0c8` "vulkan: update headers/registry to 1.1.102" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 12:52:31 +00:00
Eric Engestrom	91cc6fcbb0	util: #define PATH_MAX when undefined (eg. Hurd) Cc: Timo Aaltonen <tjaalton@debian.org> Cc: James Clarke <jrtc27@debian.org> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-05 12:27:35 +00:00
Eric Engestrom	fe205818c2	radv: use the platform defines in vk.xml instead of hard-coding them Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 11:57:10 +00:00
Eric Engestrom	3d4238d26c	anv: use the platform defines in vk.xml instead of hard-coding them Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 11:57:10 +00:00
Lionel Landwerlin	e21c201c96	anv: update supported patch version Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-05 10:39:17 +00:00
Tapani Pälli	3bb8768b9d	anv: toggle on support for VK_EXT_ycbcr_image_arrays We already propagate coord_components correctly and did not have layer restrictions for ycbcr formats. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:39:17 +00:00
Lionel Landwerlin	114c4aa0c8	vulkan: update headers/registry to 1.1.102 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-05 10:39:11 +00:00
Tapani Pälli	33bf3d510c	anv: retain the is_array state in create_plane_tex_instr_implicit This does not seem to fix anything ATM but is the right thing to do. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Fixes: `f3e91e78a3` ("anv: add nir lowering pass for ycbcr textures") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:38:31 +00:00
Eric Engestrom	e1ee4ab3dc	meson: avoid going back up the tree with include_directories() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-03-05 10:02:47 +00:00
Kenneth Graunke	dca36d5516	i965: Implement threaded GL support. Now i965 supports mesa_glthread=true like Gallium drivers do. According to Markus (degasus), the Citra emulator now runs ~30% faster. Emmanuel (linkmauve) also reported that the Dolphin emulator improved by 2.8x on one game. (Both of those still need to be added to drirc.) An Intel Mesa CI run with mesa_glthread=true appears to be happy. Bioshock Infinite's benchmark mode seems to be around 15-20% faster on my Skylake GT4 at 1920x1080. Tested-by: Markus Wick <markus@selfnet.de> Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr> Tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-05 00:49:05 -08:00
Jason Ekstrand	0010d0348a	anv/pipeline: Drop anv_fill_binding_table We zero out the prog data anyway and, now that bias is always zero, this function is accomplishing nothing. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-04 23:56:40 +00:00
Jason Ekstrand	65ee5cc0da	anv: Use an actual binding for gl_NumWorkgroups This commit moves our handling of gl_NumWorkgroups over to work like our handling of other special bindings in the Vulkan driver. We give it a magic descriptor set number and teach emit_binding_tables to handle it. This is better than the bias mechanism we were using because it allows us to do proper accounting through the bind map mechanism. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-04 23:56:40 +00:00
Jason Ekstrand	5c96120b5c	intel,nir: Lower TXD with min_lod when the sampler index is not < 16 When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Fixes: `cb98e0755f` "intel/fs: Support min_lod parameters on texture..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-04 23:56:39 +00:00
Jason Ekstrand	ca295ddbfb	spirv: OpImageQueryLod requires a sampler No idea how this fell through the cracks besides the fact that the sampler bound at 0 almost always works and the CTS isn't amazing. In any case, this appears to have been broken for almost forever. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2019-03-04 23:56:39 +00:00
Jason Ekstrand	5049fbddb4	anv: Count surfaces for non-YCbCr images in GetDescriptorSetLayoutSupport We were accidentally not counting those surfaces Fixes: `ddc4069122` "anv: Implement VK_KHR_maintenance3" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-04 23:56:39 +00:00
Sagar Ghuge	58bcebd987	spirv: Allow [i/u]mulExtended to use new nir opcode Use new nir opcode nir_[i/u]mul_2x32_64 and extract lower and higher 32 bits as needed instead of emitting mul and mul_high. v2: Surround the switch case with curly braces (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-04 15:50:25 -08:00
Sagar Ghuge	47ec9bdc60	nir/algebraic: Optimize low 32 bit extraction Optimize a situation where we only need lower 32 bits from 64 bit result. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-04 15:50:25 -08:00
Sagar Ghuge	1d8994a63b	glsl: [u/i]mulExtended optimization for GLSL Optimize mulExtended to use 32x32->64 multiplication. Drivers which are not based on NIR can set the MUL64_TO_MUL_AND_MUL_HIGH lowering flag in order to keep the old behavior. v2: Add missing condition check (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-04 15:50:25 -08:00
Sagar Ghuge	e551040c60	nir/glsl: Add another way of doing lower_imul64 for gen8+ On Gen 8 and 9, "mul" instruction supports 64 bit destination type. We can reduce our 64x64 int multiplication from 4 instructions to 3. Also instead of emitting two mul instructions, we can emit single mul instruction and extract low/high 32 bits from 64 bit result for [i/u]mulExtended v2: 1) Allow lower_mul_high64 to use new opcode (Jason Ekstrand) 2) Add lower_mul_2x32_64 flag (Matt Turner) 3) Remove associative property as bit size is different (Connor Abbott) v3: Fix indentation and variable naming convention (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-04 15:50:25 -08:00
Axel Davy	1d363d440f	st/nine: Ignore multisample quality level if no ms Apparently instead of returning error when passing a quality level different than 0 for D3DMULTISAMPLE_NONE, we should pass. Fixes: https://github.com/iXit/Mesa-3D/issues/340 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-03-04 21:52:15 +01:00
Axel Davy	86666f051e	st/nine: Ignore window size if error Check GetWindowInfo and ignore the computed sizes if there is an error. Fixes a regression caused by earlier commit when using old wine gallium nine patches. Should also address a crash at window destruction. Related issues: https://github.com/iXit/Mesa-3D/issues/331 https://github.com/iXit/Mesa-3D/issues/332 Cc: mesa-stable@lists.freedesktop.org Fixes: `2318ca68bb` ("st/nine: Handle window resize when a presentation buffer is used") Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-03-04 21:52:15 +01:00
Mauro Rossi	ec0f465bc5	android: anv: fix libexpat shared dependency Fixes undefined reference building errors for XML_* functions Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org>	2019-03-04 20:53:59 +01:00
Mauro Rossi	14e7e26a09	android: anv: fix generated files dependencies (v2) Fix anv_entrypoints.{c,h} and anv_extensions.{c,h} missing dependencies Rename the variable labels according to targets and python scripts Align the building rules as per Automake for simplification Fixes building errors during rebuilds due to missing dependencies (v2) Fixed a missing $(VULKAN_API_XML) reference Fixes: `9a508b7` ("android: anv/extensions: fix generated sources build") Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org>	2019-03-04 20:53:51 +01:00
Brian Paul	e2369e133c	st/wgl: init a variable to silence MinGW warning MinGW release build says 'value' may be used before being initialized. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-04 11:48:48 -07:00
Brian Paul	66ba12973b	svga: silence array out of bounds warning MinGW release build complains about a possible out-of-bounds array access. Test i < 4 to silence it. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-04 11:48:47 -07:00
Brian Paul	999db9ac51	svga: init fill variable to avoid compiler warning MinGW release builds warn about use of a possibly uninitialized variable here. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-04 11:48:47 -07:00
Brian Paul	9b07a221a4	st/mesa: whitespace fixes in st_texture.h Trivial.	2019-03-04 11:48:47 -07:00
Brian Paul	d74932dfea	st/mesa: line wrapping, whitespace fixes in st_cb_texture.c Trivial.	2019-03-04 11:48:36 -07:00
Brian Paul	fc91c2698e	st/mesa: whitespace fixes in st_sampler_view.c Replace tabs w/ spaces. 80-column wrapping. Trivial.	2019-03-04 11:42:49 -07:00
Gurchetan Singh	610758d3e5	egl/sl: also allow virtgpu to fallback to kms_swrast virtio-gpu falls back to software rendering when 3D features are unavailable since 6c5ab, and kms_swrast is more feature complete than swrast. v2: Add comment (Emil) Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-03-04 17:33:17 +00:00
Mathias Fröhlich	904a0552aa	st/mesa: Invalidate the gallium array atom only if needed. Now that the buffer object usage history tracks if it is being used as vertex buffer object, we can restrict setting the ST_NEW_VERTEX_ARRAYS bit to dirty on glBufferData calls to buffers that are potentially used as vertex buffer object. Also put a note that the same could be done for index arrays used in indexed draws. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-04 17:03:06 +01:00
Mathias Fröhlich	e727f8c8b8	mesa: Track buffer object use also for VAO usage. We already track the usage history for buffer objects in a lot of aspects. Add GL_ARRAY_BUFFER and GL_ELEMENT_ARRAY_BUFFER to gl_buffer_object::UsageHistory. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-03-04 17:03:06 +01:00
Samuel Pitoiset	9e787904d0	radv: use 32_AR instead of 32_ABGR when alpha coverage is required This export format is faster. Seems to improve performance in Wreckfest. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-04 12:02:01 +01:00
Alyssa Rosenzweig	72981c92ce	panfrost: List primitive restart enable bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-04 05:04:14 +00:00
Alyssa Rosenzweig	2b5cda137f	panfrost/midgard: Preview for data hazards If a selected unit causes a data hazard, the whole block gets cut short. So, we preview for data hazards _while_ selecting units. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-04 05:03:48 +00:00
Alyssa Rosenzweig	93eeba623b	panfrost/midgard: Promote smul to vmul smul comes first in the pipeline, before vmul. Until we have a full instruction scheduler, it's better to have vmul prioritized to maximize bundle size. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-04 05:02:58 +00:00
Alyssa Rosenzweig	25bbb44dce	panfrost: Flush with offscreen rendering This special-case was needlessly added and breaks purely offscreen rendering (when there is no scanout involved) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-04 05:01:45 +00:00
Alyssa Rosenzweig	4f7460297b	panfrost/midgard: Don't force constant on VLUT Previously, we forced a #0 inline constant tacked on for the lut instructions to mirror the blob's behaviour, which caused some suboptimal codegen due to our constant inlining implementation. Instead, just don't force a constant at all. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-04 04:59:58 +00:00
Alyssa Rosenzweig	c351cc4e94	panfrost: Cleanup cruft related to clears Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-04 04:59:12 +00:00
Alyssa Rosenzweig	40ffee4448	panfrost: Decouple Gallium clear from FBD clear The operations of gallium->clear() and the hardware callbacks are fundamentally independent. This routine decouples them by routing shared information via panfrost_job, allowing the hardware half to be deferred to the fragment job generation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-04 04:58:55 +00:00
Alyssa Rosenzweig	59c9623d0a	panfrost: Import job data structures from v3d At the moment, Panfrost state is ad hoc, which creates issues for FBOs. This commit imports the skeleton of the v3d_job structure as panfrost_job, in preparation for refactors to organize this state. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-03-04 04:58:15 +00:00
Ilia Mirkin	4eec3a2a36	glsl: fix recording of variables for XFB in TCS shaders This is purely for conformance, since it's not actually possible to do XFB on TCS output varyings. However we do have to make sure we record the names correctly, and this removes an extra level of array-ness from the names in question. Fixes KHR-GL45.tessellation_shader.single.xfb_captures_data_from_correct_stage v2: Add comment to the new program_resource_visitor::process function. (Ilia Mirkin) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108457 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-04 01:55:00 +01:00
Jose Maria Casanova Crespo	bf1f49482d	glsl: TCS outputs can not be transform feedback candidates on GLES Avoids regression on: KHR-GLES*.core.tessellation_shader.single.xfb_captures_data_from_correct_stage that is uncovered by the following patch. "glsl: fix recording of variables for XFB in TCS shaders" v2: Rebased over glsl: fix recording of variables for XFB in TCS shaders v3: Move this patch before "glsl: fix recording of variables for XFB in TCS shaders" to avoid temporal regressions. (Ilia Mirkin) Cc: 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-04 01:55:00 +01:00
Jose Maria Casanova Crespo	cc7173b438	glsl: fix typos in comments "transfor" -> "transform" Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-04 01:55:00 +01:00
Gert Wollny	3214f20914	mesa: Expose EXT_texture_query_lod and add support for its use in shaders EXT_texture_query_lod provides the same functionality for GLES as the ARB extension with the same name does for GL. v2: Set ES 3.0 as minimum GLES version as required by the extension Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-03 21:50:42 +01:00
Greg V	7dc2f47882	util: emulate futex on FreeBSD using umtx Obtained from: FreeBSD ports Acked-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-03 19:48:49 +00:00
Rob Clark	00f838fa73	freedreno/ir3: track register pressure in sched Not a perfect solution, and the "pressure" target is hard-coded. But it doesn't really seem to much in the common case, and avoids exploding register usage in dEQP ssbo tests. So this should serve as a stop-gap solution until I have time to rewrite the scheduler. Hurts slightly in instruction count, but gains (reduces) slightly the register usage in shader-db. Fixes ~150 dEQP-GLES31.functional.ssbo.* that were failing due to RA fail. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-03 13:27:50 -05:00
Rob Clark	8a5f2d9444	freedreno/ir3: add Sethi–Ullman numbering pass Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-03 13:27:50 -05:00
Rob Clark	c8e351ee3a	freedreno/ir3: include nopN in expanded instruction count Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-03 13:27:50 -05:00
Dave Airlie	cb4e3e3ef6	st/mesa: add support for lowering fp64/int64 for nir drivers This might be enough for iris and possibly r600 (when it gets NIR) This appears to work for iris. v2: * change cap return so DOUBLES == 2 means sw emu v3: * Refactor using int64/doubles lowering options which were added into nir options * Remove DOUBLES == 2 added in v2 [jordan: Remove "2" value on PIPE_CAP_DOUBLES] [jordan: Use lowering options added to nir options] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-02 14:33:44 -08:00
Jordan Justen	7de056e1a9	scons: Generate float64_glsl.h for glsl_to_nir fp64 lowering Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-02 14:33:44 -08:00
Jordan Justen	10c5579921	intel/compiler: Move int64/doubles lowering options Instead of calculating the int64 and doubles lowering options each time a shader is preprocessed, save and use the values in nir_shader_compiler_options. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-02 14:33:44 -08:00
Jordan Justen	31b35916dd	nir: Add int64/doubles options into nir_shader_compiler_options This will allow the options to be visible under nir_shader->options, which will allow the gallium state_tracker to use the driver preferred settings during glsl_to_nir. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-02 14:33:41 -08:00
Ian Romanick	bae0c36751	nir/algebraic: Optimize away an fsat of a b2f The b2f can only produce 0.0 or 1.0, so the fsat does nothing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-02 13:58:56 -08:00
Ian Romanick	d1d56f5f9a	intel/fs: Don't assert on b2f with a saturate modifier This ran afoul of Iris's use of nir_lower_clamp_color_outputs which applies fsat() before writes to vertex shader color outputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `7725d60938` ("intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))")	2019-03-02 13:58:50 -08:00
Lionel Landwerlin	32ffd90002	anv: add support for INTEL_DEBUG=bat As requested by Ken ;) v2: Also decode simple batches (Caio) Fix u_vector usage issues (Lionel) v3: Make binding/instruction/state/surface available (Lionel) v4: Going through device pools for simple batches (Lionel) Centralize search BO callbacks into anv_device.c (Lionel) v5: Clear decoded batch buffer var after use (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-02 12:53:21 +00:00
Eric Anholt	f1122f78b7	v3d: Fix build of NEON code with Mesa's cflags not targeting NEON. v3d may be built as part of a set of drivers in a system not requiring NEON, but we know V3D devices will be paired with CPUs with NEON so we should be able to use this asm. Fixes: `0c05198d6b` ("v3d: Always enable the NEON utile load/store code.")	2019-03-01 14:21:49 -08:00
Matt Turner	e0148bbcfd	intel/compiler: Add commas on final values of compaction table arrays Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-03-01 13:56:25 -08:00
Ian Romanick	ecc9ffa778	nir/algebraic: Replace a-fract(a) with floor(a) I noticed this while looking at a shader that was affected by Tim's "more loop unrolling" series. In review, Tim Arceri asked: > Why the hurt on Gen6+ is this something that should be in the late > optimisations pass? As far as I can tell, it's just because our scheduler is terrible. In all the fragment shaders that I looked at (some hurt shaders were from other stages), only one of the SIMD8 or SIMD16 version would be hurt. In many of those cases, the other SIMD width is improved (e.g., shaders/closed/steam/brutal-legend/3990.shader_test). Often it looks like the scheduler decides to differently schedule a SEND that occurs somewhere early in the shader. Once that happens, everything is different. I looked at one vertex shader that was hurt (from Goat Simulator). In that case, both the floor and fract are used. The optimization eliminates the add, and it should allow better scheduling. In the area of the FRC and RNDD instructions, the scheduler does the right thing. However, later in the shader a MAD and an ADD get scheduled differently, and that makes it slightly worse. In light of this, I tried adding some "is_used_once" mark-up, and that did not fix all the cycles regressions. It also did a lot more harm than good on SKL (helped 82 vs. hurt 241). All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15437001 -> 15435259 (-0.01%) instructions in affected programs: 213651 -> 211909 (-0.82%) helped: 988 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 1.76 x̃: 1 helped stats (rel) min: 0.15% max: 11.54% x̄: 1.14% x̃: 0.59% 95% mean confidence interval for instructions value: -1.89 -1.63 95% mean confidence interval for instructions %-change: -1.23% -1.05% Instructions are helped.
total cycles in shared programs: 383007378 -> 382997063 (<.01%) cycles in affected programs: 1650825 -> 1640510 (-0.62%) helped: 679 HURT: 302 helped stats (abs) min: 1 max: 348 x̄: 23.39 x̃: 14 helped stats (rel) min: 0.04% max: 28.77% x̄: 1.61% x̃: 0.98% HURT stats (abs) min: 1 max: 250 x̄: 18.43 x̃: 7 HURT stats (rel) min: 0.04% max: 25.86% x̄: 1.41% x̃: 0.53% 95% mean confidence interval for cycles value: -13.05 -7.98 95% mean confidence interval for cycles %-change: -0.86% -0.50% Cycles are helped. Iron Lake and GM45 had similar results. (GM45 shown) total instructions in shared programs: 5043616 -> 5043010 (-0.01%) instructions in affected programs: 119691 -> 119085 (-0.51%) helped: 432 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.10% max: 8.11% x̄: 0.66% x̃: 0.39% 95% mean confidence interval for instructions value: -1.58 -1.23 95% mean confidence interval for instructions %-change: -0.72% -0.59% Instructions are helped. total cycles in shared programs: 128139812 -> 128135762 (<.01%) cycles in affected programs: 3829724 -> 3825674 (-0.11%) helped: 602 HURT: 0 helped stats (abs) min: 2 max: 486 x̄: 6.73 x̃: 6 helped stats (rel) min: 0.02% max: 4.85% x̄: 0.19% x̃: 0.10% 95% mean confidence interval for cycles value: -8.40 -5.05 95% mean confidence interval for cycles %-change: -0.22% -0.16% Cycles are helped. Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-03-01 12:43:25 -08:00
Ian Romanick	1edf67fc3f	intel/fs: Generate if instructions with inverted conditions Per-platform results were all over the place, so I have included all the results here. There is an important note at the bottom of the commit message. Skylake total instructions in shared programs: 15184683 -> 15184679 (<.01%) instructions in affected programs: 2786 -> 2782 (-0.14%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.05% max: 0.84% x̄: 0.44% x̃: 0.44% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.96% 0.07% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 370961367 -> 370961173 (<.01%) cycles in affected programs: 205867 -> 205673 (-0.09%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 149 x̄: 39.60 x̃: 16 helped stats (rel) min: 0.02% max: 1.05% x̄: 0.45% x̃: 0.55% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -93.01 28.34 95% mean confidence interval for cycles %-change: -0.82% 0.08% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 15465366 -> 15465362 (<.01%) instructions in affected programs: 2799 -> 2795 (-0.14%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.04% max: 0.84% x̄: 0.44% x̃: 0.44% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.96% 0.07% Inconclusive result (%-change mean confidence interval includes 0). 
total cycles in shared programs: 410938419 -> 410938531 (<.01%) cycles in affected programs: 566028 -> 566140 (0.02%) helped: 18 HURT: 17 helped stats (abs) min: 1 max: 16 x̄: 3.50 x̃: 1 helped stats (rel) min: <.01% max: 1.05% x̄: 0.13% x̃: <.01% HURT stats (abs) min: 1 max: 12 x̄: 10.29 x̃: 12 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.09% 95% mean confidence interval for cycles value: 0.31 6.09 95% mean confidence interval for cycles %-change: -0.10% 0.05% Inconclusive result (%-change mean confidence interval includes 0). Haswell total instructions in shared programs: 13749760 -> 13749759 (<.01%) instructions in affected programs: 2241 -> 2240 (-0.04%) helped: 1 HURT: 0 total cycles in shared programs: 385398913 -> 385398363 (<.01%) cycles in affected programs: 554914 -> 554364 (-0.10%) helped: 31 HURT: 1 helped stats (abs) min: 1 max: 453 x̄: 18.00 x̃: 6 helped stats (rel) min: <.01% max: 0.25% x̄: 0.03% x̃: 0.05% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06% 95% mean confidence interval for cycles value: -45.88 11.51 95% mean confidence interval for cycles %-change: -0.05% -0.02% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total cycles in shared programs: 180663626 -> 180663881 (<.01%) cycles in affected programs: 472350 -> 472605 (0.05%) helped: 15 HURT: 30 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 8 max: 10 x̄: 9.00 x̃: 9 HURT stats (rel) min: 0.06% max: 0.14% x̄: 0.10% x̃: 0.10% 95% mean confidence interval for cycles value: 4.21 7.12 95% mean confidence interval for cycles %-change: 0.05% 0.08% Cycles are HURT. 
Sandy Bridge total cycles in shared programs: 154568664 -> 154569225 (<.01%) cycles in affected programs: 356486 -> 357047 (0.16%) helped: 1 HURT: 31 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02% HURT stats (abs) min: 4 max: 33 x̄: 18.16 x̃: 8 HURT stats (rel) min: 0.05% max: 0.23% x̄: 0.14% x̃: 0.10% 95% mean confidence interval for cycles value: 12.19 22.87 95% mean confidence interval for cycles %-change: 0.10% 0.16% Cycles are HURT. Iron Lake total instructions in shared programs: 8206589 -> 8206565 (<.01%) instructions in affected programs: 3024 -> 3000 (-0.79%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.75% max: 0.83% x̄: 0.80% x̃: 0.80% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.82% -0.77% Instructions are helped. total cycles in shared programs: 187657428 -> 187656228 (<.01%) cycles in affected programs: 95748 -> 94548 (-1.25%) helped: 12 HURT: 0 helped stats (abs) min: 80 max: 120 x̄: 100.00 x̃: 100 helped stats (rel) min: 1.00% max: 1.66% x̄: 1.27% x̃: 1.21% 95% mean confidence interval for cycles value: -113.27 -86.73 95% mean confidence interval for cycles %-change: -1.43% -1.11% Cycles are helped. GM45 total instructions in shared programs: 5037569 -> 5037557 (<.01%) instructions in affected programs: 1521 -> 1509 (-0.79%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.75% max: 0.83% x̄: 0.79% x̃: 0.79% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.83% -0.75% Instructions are helped. 
total cycles in shared programs: 128101478 -> 128100758 (<.01%) cycles in affected programs: 52746 -> 52026 (-1.37%) helped: 6 HURT: 0 helped stats (abs) min: 120 max: 120 x̄: 120.00 x̃: 120 helped stats (rel) min: 1.16% max: 1.66% x̄: 1.41% x̃: 1.41% 95% mean confidence interval for cycles value: -120.00 -120.00 95% mean confidence interval for cycles %-change: -1.70% -1.12% Cycles are helped. This change has almost no effect right now. However, removing this patch (but leaving the patch "nir/algebraic: Replace a bcsel of a b2f with a b2f(!(a \|\| b))") after adding a patch that removes !(a < b) -> (a >= b) optimizations (like https://patchwork.freedesktop.org/patch/264787/) has the following results on Skylake: Skylake total instructions in shared programs: 15071022 -> 15089710 (0.12%) instructions in affected programs: 1022219 -> 1040907 (1.83%) helped: 1 HURT: 3937 helped stats (abs) min: 41 max: 41 x̄: 41.00 x̃: 41 helped stats (rel) min: 1.01% max: 1.01% x̄: 1.01% x̃: 1.01% HURT stats (abs) min: 1 max: 256 x̄: 4.76 x̃: 4 HURT stats (rel) min: 0.05% max: 11.18% x̄: 2.59% x̃: 2.60% 95% mean confidence interval for instructions value: 4.56 4.93 95% mean confidence interval for instructions %-change: 2.54% 2.64% Instructions are HURT. total cycles in shared programs: 369777134 -> 370092923 (0.09%) cycles in affected programs: 17516573 -> 17832362 (1.80%) helped: 115 HURT: 3624 helped stats (abs) min: 1 max: 1721 x̄: 81.18 x̃: 28 helped stats (rel) min: <.01% max: 10.74% x̄: 1.24% x̃: 0.65% HURT stats (abs) min: 1 max: 12640 x̄: 89.71 x̃: 54 HURT stats (rel) min: <.01% max: 28.24% x̄: 4.72% x̃: 4.52% 95% mean confidence interval for cycles value: 75.21 93.71 95% mean confidence interval for cycles %-change: 4.43% 4.64% Cycles are HURT. 
total spills in shared programs: 9450 -> 9442 (-0.08%) spills in affected programs: 166 -> 158 (-4.82%) helped: 2 HURT: 0 total fills in shared programs: 21115 -> 21094 (-0.10%) fills in affected programs: 438 -> 417 (-4.79%) helped: 2 HURT: 0 LOST: 1 GAINED: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	d40640efe8	nir/algebraic: Replace a bcsel of a b2f sources with a b2f(!(a \|\| b)) I have not investigated the result of doing this during code generation. That should be possible, but it would be a bit more effort. All Gen6+ platforms had nearly identical results. (Skylake shown) total cycles in shared programs: 370961508 -> 370961367 (<.01%) cycles in affected programs: 5174 -> 5033 (-2.73%) helped: 2 HURT: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8206587 -> 8206589 (<.01%) instructions in affected programs: 1325 -> 1327 (0.15%) helped: 0 HURT: 2 total cycles in shared programs: 187657422 -> 187657428 (<.01%) cycles in affected programs: 11566 -> 11572 (0.05%) helped: 0 HURT: 2 This change has almost no effect right now. However, removing this patch (but leaving the patch "intel/fs: Generate if instructions with inverted conditions") after adding a patch that removes !(a < b) -> (a >= b) optimizations (like https://patchwork.freedesktop.org/patch/264787/) has the following results on Skylake: Skylake total instructions in shared programs: 15071804 -> 15071806 (<.01%) instructions in affected programs: 640 -> 642 (0.31%) helped: 0 HURT: 2 total cycles in shared programs: 369914348 -> 369916569 (<.01%) cycles in affected programs: 27900 -> 30121 (7.96%) helped: 4 HURT: 15 helped stats (abs) min: 2 max: 112 x̄: 30.00 x̃: 3 helped stats (rel) min: 0.28% max: 12.28% x̄: 3.34% x̃: 0.40% HURT stats (abs) min: 2 max: 758 x̄: 156.07 x̃: 81 HURT stats (rel) min: 0.20% max: 74.30% x̄: 16.29% x̃: 16.91% 95% mean confidence interval for cycles value: 12.68 221.11 95% mean confidence interval for cycles %-change: 3.09% 21.23% Cycles are HURT. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	7725d60938	intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a)) Since Boolean values are either -1 (true) or 0 (false), b2f(inot(a)) maps -1 => 0.0 and 0 => 1.0. This is equivalent to 1.0 + float(boolBitsToInt(a)). On Intel GPUs, ADD is one of the few instructions that can type-convert during write to destination, so we can achieve this in a single instruction: add g47F, g26D, 1D v2: Fix swizzles. v3: Fix typos in comments. Noticed by Ken. All Gen6+ platforms had similar results. (Skylake shown) Skylake total instructions in shared programs: 15185583 -> 15184683 (<.01%) instructions in affected programs: 239389 -> 238489 (-0.38%) helped: 899 HURT: 1 helped stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.15% max: 1.85% x̄: 0.49% x̃: 0.44% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.09% max: 0.09% x̄: 0.09% x̃: 0.09% 95% mean confidence interval for instructions value: -1.01 -0.99 95% mean confidence interval for instructions %-change: -0.51% -0.48% Instructions are helped. total cycles in shared programs: 370964249 -> 370961508 (<.01%) cycles in affected programs: 1487586 -> 1484845 (-0.18%) helped: 420 HURT: 268 helped stats (abs) min: 1 max: 232 x̄: 22.41 x̃: 6 helped stats (rel) min: 0.05% max: 22.60% x̄: 1.30% x̃: 0.41% HURT stats (abs) min: 1 max: 230 x̄: 24.90 x̃: 10 HURT stats (rel) min: <.01% max: 21.60% x̄: 1.45% x̃: 0.52% 95% mean confidence interval for cycles value: -7.61 -0.36 95% mean confidence interval for cycles %-change: -0.44% -0.02% Cycles are helped. No changes on Iron Lake or GM45. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
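The identity behind the single-ADD form can be checked on the CPU. The following is a minimal sketch, assuming 32-bit Booleans encoded as -1 (true) and 0 (false) as the commit message states. The function names are hypothetical and are not taken from the Mesa source.

```c
#include <stdint.h>

/* Reference form: invert the Boolean, then convert it to float.
 * Assumes the encoding -1 (all ones) == true, 0 == false. */
static float b2f_inot_reference(int32_t a)
{
    int32_t not_a = ~a;               /* inot: -1 -> 0, 0 -> -1 */
    return not_a != 0 ? 1.0f : 0.0f;  /* b2f */
}

/* Single-ADD form: an integer add whose result is converted to float
 * when written, mirroring "add g47F, g26D, 1D". */
static float b2f_inot_add(int32_t a)
{
    return (float)(a + 1);            /* -1 -> 0.0, 0 -> 1.0 */
}
```

Both functions agree only for the two valid Boolean encodings; any other integer input is outside the assumption.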
Ian Romanick	cb3e21cd19	intel/fs: Use De Morgan's laws to avoid logical-not of a logic result on Gen8+ Instead of emitting ~(a & b), emit (~a \| ~b) since logical-not of operands is free on Gen8+. v2: Fix swizzles. Fix types for cmod propagation. v3: Simplify logic for inverting source of inot(ixor(a, b)). Suggested by Ken. Skylake and Broadwell had similar results. (Skylake shown) Skylake total instructions in shared programs: 15185593 -> 15185583 (<.01%) instructions in affected programs: 5673 -> 5663 (-0.18%) helped: 12 HURT: 1 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.30% max: 5.88% x̄: 1.50% x̃: 0.70% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% 95% mean confidence interval for instructions value: -1.66 0.13 95% mean confidence interval for instructions %-change: -2.60% -0.15% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 370977726 -> 370964249 (<.01%) cycles in affected programs: 869987 -> 856510 (-1.55%) helped: 15 HURT: 2 helped stats (abs) min: 2 max: 6640 x̄: 902.20 x̃: 16 helped stats (rel) min: <.01% max: 4.92% x̄: 1.71% x̃: 1.53% HURT stats (abs) min: 14 max: 42 x̄: 28.00 x̃: 28 HURT stats (rel) min: 1.08% max: 3.18% x̄: 2.13% x̃: 2.13% 95% mean confidence interval for cycles value: -1654.87 69.34 95% mean confidence interval for cycles %-change: -2.29% -0.23% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
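The rewrites rely on standard bitwise identities, which can be sketched in plain C. The function names below are hypothetical, and the XOR case is an assumption about what inverting a single source of inot(ixor(a, b)) amounts to.

```c
#include <stdint.h>

/* A logic op followed by a separate NOT of the result. */
static uint32_t not_and(uint32_t a, uint32_t b) { return ~(a & b); }
static uint32_t not_or(uint32_t a, uint32_t b)  { return ~(a | b); }
static uint32_t not_xor(uint32_t a, uint32_t b) { return ~(a ^ b); }

/* Equivalent forms with the NOT moved onto the operands, where the
 * commit message says it is free on Gen8+. */
static uint32_t or_of_nots(uint32_t a, uint32_t b)  { return ~a | ~b; }
static uint32_t and_of_nots(uint32_t a, uint32_t b) { return ~a & ~b; }
static uint32_t xor_one_not(uint32_t a, uint32_t b) { return ~a ^ b; }
```

For XOR, inverting exactly one operand is enough, because flipping every bit of one input flips every bit of the result.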
Ian Romanick	8eb36c9129	intel/fs: Emit logical-not of operands on Gen8+ On Gen8+ specifying negation of a logical operation such as AND actually performs a logical-not. Take advantage of this to generate fewer instructions. v2: Major rebase. Use nir_src_as_alu_instr. Fix swizzle handling. No changes on any pre-Gen8 platform. Skylake and Broadwell had similar results. (Broadwell shown) total instructions in shared programs: 15466902 -> 15466274 (<.01%) instructions in affected programs: 1262953 -> 1262325 (-0.05%) helped: 682 HURT: 4 helped stats (abs) min: 1 max: 5 x̄: 1.02 x̃: 1 helped stats (rel) min: 0.03% max: 2.40% x̄: 0.18% x̃: 0.04% HURT stats (abs) min: 1 max: 62 x̄: 17.50 x̃: 3 HURT stats (rel) min: 0.03% max: 1.89% x̄: 0.53% x̃: 0.10% 95% mean confidence interval for instructions value: -1.10 -0.73 95% mean confidence interval for instructions %-change: -0.19% -0.15% Instructions are helped. total cycles in shared programs: 410996093 -> 410950440 (-0.01%) cycles in affected programs: 144389048 -> 144343395 (-0.03%) helped: 519 HURT: 51 helped stats (abs) min: 1 max: 1060 x̄: 104.46 x̃: 140 helped stats (rel) min: 0.01% max: 10.98% x̄: 0.34% x̃: 0.03% HURT stats (abs) min: 1 max: 4060 x̄: 167.90 x̃: 22 HURT stats (rel) min: <.01% max: 8.20% x̄: 0.96% x̃: 0.25% 95% mean confidence interval for cycles value: -97.16 -63.02 95% mean confidence interval for cycles %-change: -0.32% -0.13% Cycles are helped. total spills in shared programs: 95311 -> 95329 (0.02%) spills in affected programs: 881 -> 899 (2.04%) helped: 0 HURT: 4 total fills in shared programs: 93629 -> 93634 (<.01%) fills in affected programs: 794 -> 799 (0.63%) helped: 1 HURT: 2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	06eaaf2de9	intel/fs: Refactor ALU source and destination handling to a separate function Other places will need to do this soon to properly handle source swizzles. The patch looks a little odd, but the change is pretty straightforward. All of the swizzle and mask handling is moved out, but the code for handling move instructions and vecN instructions remains in nir_emit_alu. I'm not terribly pleased with the "need_dest" parameter, but get_nir_dest is (somewhat surprisingly) destructive. I am open to suggestions of alternatives. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	fb3ca9109c	intel/fs: Handle OR source modifiers in algebraic optimization Found by inspection. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	c9d5bd050c	intel/fs: Relax type matching rules in cmod propagation from MOV instructions To allow cmod propagation from a MOV in a sequence like: and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD mov.nz.f0(16) null<1>F g31<8,8,1>D A similar change to the vec4 backend had no effect. Somewhere between `c1ec582059` and `40fc4b5acd` (1,094 commits) the effectiveness of this patch diminished, and as of commit `d7e0d47b9d` (nir: Add a bunch of b2[if] optimizations) this optimization no longer has any effect on any platform. A later patch "intel/fs: Use De Morgan's laws to avoid logical-not of a logic result on Gen8+," generates some instruction sequences that require this change in order for cmod propagation to make progress. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	eae19f5f19	nir/algebraic: Replace i2b used by bcsel or if-statement with comparison All of the helped shaders are in Deus Ex. I looked at a couple shaders, and they have a pattern like: vec1 32 ssa_373 = i2b32 ssa_345.w vec1 32 ssa_374 = bcsel ssa_373, ssa_20, ssa_0 ... vec1 32 ssa_377 = ine ssa_345.w, ssa_0 if ssa_377 { ... vec1 32 ssa_416 = i2b32 ssa_385.w vec1 32 ssa_417 = bcsel ssa_416, ssa_386, ssa_374 ... } The massive help occurs because the i2b32 is removed, then other passes determine that ssa_374 must be ssa_20 inside the if-statement allowing the first bcsel to also be deleted. v2: Rebase on 1-bit Boolean changes. v3: Fix i2b32 vs ine problem in if-statement replacement. Noticed by Bas. Skylake total instructions in shared programs: 15241394 -> 15186287 (-0.36%) instructions in affected programs: 890583 -> 835476 (-6.19%) helped: 355 HURT: 0 helped stats (abs) min: 1 max: 497 x̄: 155.23 x̃: 149 helped stats (rel) min: 0.09% max: 16.49% x̄: 6.10% x̃: 6.59% 95% mean confidence interval for instructions value: -165.07 -145.39 95% mean confidence interval for instructions %-change: -6.42% -5.77% Instructions are helped. total cycles in shared programs: 373846583 -> 371023357 (-0.76%) cycles in affected programs: 118972102 -> 116148876 (-2.37%) helped: 343 HURT: 14 helped stats (abs) min: 45 max: 118284 x̄: 8332.32 x̃: 6089 helped stats (rel) min: 0.03% max: 38.19% x̄: 2.48% x̃: 1.77% HURT stats (abs) min: 120 max: 4126 x̄: 2482.79 x̃: 3019 HURT stats (rel) min: 0.16% max: 17.37% x̄: 2.13% x̃: 1.11% 95% mean confidence interval for cycles value: -8723.28 -7093.12 95% mean confidence interval for cycles %-change: -2.57% -2.02% Cycles are helped. total spills in shared programs: 32401 -> 23465 (-27.58%) spills in affected programs: 24457 -> 15521 (-36.54%) helped: 343 HURT: 0 total fills in shared programs: 37866 -> 31765 (-16.11%) fills in affected programs: 18889 -> 12788 (-32.30%) helped: 343 HURT: 0 Broadwell and Haswell had similar results. 
(Haswell shown) Haswell total instructions in shared programs: 13764783 -> 13750679 (-0.10%) instructions in affected programs: 1176256 -> 1162152 (-1.20%) helped: 334 HURT: 21 helped stats (abs) min: 1 max: 358 x̄: 42.59 x̃: 47 helped stats (rel) min: 0.09% max: 11.81% x̄: 1.30% x̃: 1.37% HURT stats (abs) min: 1 max: 61 x̄: 5.76 x̃: 1 HURT stats (rel) min: 0.03% max: 1.84% x̄: 0.17% x̃: 0.03% 95% mean confidence interval for instructions value: -43.99 -35.47 95% mean confidence interval for instructions %-change: -1.35% -1.08% Instructions are helped. total cycles in shared programs: 386511910 -> 385402528 (-0.29%) cycles in affected programs: 143831110 -> 142721728 (-0.77%) helped: 327 HURT: 39 helped stats (abs) min: 16 max: 25219 x̄: 3519.74 x̃: 3570 helped stats (rel) min: <.01% max: 10.26% x̄: 0.95% x̃: 0.96% HURT stats (abs) min: 16 max: 4881 x̄: 1065.95 x̃: 997 HURT stats (rel) min: <.01% max: 16.67% x̄: 0.70% x̃: 0.24% 95% mean confidence interval for cycles value: -3375.59 -2686.60 95% mean confidence interval for cycles %-change: -0.92% -0.64% Cycles are helped. total spills in shared programs: 100480 -> 97846 (-2.62%) spills in affected programs: 84702 -> 82068 (-3.11%) helped: 316 HURT: 21 total fills in shared programs: 96877 -> 94369 (-2.59%) fills in affected programs: 69167 -> 66659 (-3.63%) helped: 316 HURT: 9 No changes on Ivy Bridge or earlier platforms. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	d2056ab993	intel/vec4: Emit constants for some ALU sources as immediate values In some cases of flow control, the constant propagation is not able to determine that the source of an instruction must be a constant value. When we still have NIR SSA values, we can easily determine this. Emit the immediate value during code generation to possibly avoid spurious loads of constants into registers. I wrote this patch to prevent a couple trivial regressions in vec4 shaders caused by "nir/algebraic: Replace i2b used by bcsel or if-statement with comparison". The final result was quite a bit better than that... No shader-db changes on any Gen8+ platform. v2: Assert that we never get a negation source modifier on Gen8+. Suggested by Ken. This should never happen because we don't normally use vec4 for Gen8+ (requires an environment variable to force it), and there's no code to generate these negations. Still, erring on the side of caution is better. Haswell total instructions in shared programs: 13776218 -> 13764783 (-0.08%) instructions in affected programs: 663931 -> 652496 (-1.72%) helped: 3495 HURT: 1 helped stats (abs) min: 1 max: 30 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.21% max: 10.00% x̄: 1.79% x̃: 1.49% HURT stats (abs) min: 24 max: 24 x̄: 24.00 x̃: 24 HURT stats (rel) min: 12.24% max: 12.24% x̄: 12.24% x̃: 12.24% 95% mean confidence interval for instructions value: -3.39 -3.15 95% mean confidence interval for instructions %-change: -1.84% -1.75% Instructions are helped. 
total cycles in shared programs: 386818984 -> 386511910 (-0.08%) cycles in affected programs: 20379636 -> 20072562 (-1.51%) helped: 3052 HURT: 476 helped stats (abs) min: 2 max: 12516 x̄: 110.40 x̃: 6 helped stats (rel) min: 0.05% max: 24.68% x̄: 1.58% x̃: 0.69% HURT stats (abs) min: 2 max: 416 x̄: 62.76 x̃: 24 HURT stats (rel) min: 0.10% max: 10.75% x̄: 4.03% x̃: 2.18% 95% mean confidence interval for cycles value: -115.57 -58.51 95% mean confidence interval for cycles %-change: -0.93% -0.73% Cycles are helped. total spills in shared programs: 100482 -> 100480 (<.01%) spills in affected programs: 79 -> 77 (-2.53%) helped: 3 HURT: 1 total fills in shared programs: 96883 -> 96877 (<.01%) fills in affected programs: 85 -> 79 (-7.06%) helped: 4 HURT: 0 Ivy Bridge total instructions in shared programs: 12000562 -> 11990113 (-0.09%) instructions in affected programs: 572581 -> 562132 (-1.82%) helped: 3106 HURT: 0 helped stats (abs) min: 1 max: 30 x̄: 3.36 x̃: 2 helped stats (rel) min: 0.21% max: 10.00% x̄: 1.86% x̃: 1.49% 95% mean confidence interval for instructions value: -3.49 -3.23 95% mean confidence interval for instructions %-change: -1.91% -1.81% Instructions are helped. total cycles in shared programs: 180958504 -> 180664500 (-0.16%) cycles in affected programs: 19991810 -> 19697806 (-1.47%) helped: 2654 HURT: 486 helped stats (abs) min: 2 max: 12516 x̄: 121.61 x̃: 6 helped stats (rel) min: 0.05% max: 20.66% x̄: 1.48% x̃: 0.68% HURT stats (abs) min: 2 max: 396 x̄: 59.18 x̃: 24 HURT stats (rel) min: 0.05% max: 9.62% x̄: 3.82% x̃: 2.16% 95% mean confidence interval for cycles value: -125.62 -61.64 95% mean confidence interval for cycles %-change: -0.76% -0.56% Cycles are helped. 
Sandy Bridge total instructions in shared programs: 10842336 -> 10835438 (-0.06%) instructions in affected programs: 395340 -> 388442 (-1.74%) helped: 1926 HURT: 0 helped stats (abs) min: 1 max: 22 x̄: 3.58 x̃: 2 helped stats (rel) min: 0.10% max: 9.68% x̄: 1.78% x̃: 1.42% 95% mean confidence interval for instructions value: -3.73 -3.43 95% mean confidence interval for instructions %-change: -1.84% -1.72% Instructions are helped. total cycles in shared programs: 154590074 -> 154569050 (-0.01%) cycles in affected programs: 8159932 -> 8138908 (-0.26%) helped: 1670 HURT: 228 helped stats (abs) min: 2 max: 260 x̄: 18.13 x̃: 6 helped stats (rel) min: 0.02% max: 8.70% x̄: 0.74% x̃: 0.28% HURT stats (abs) min: 2 max: 1798 x̄: 40.58 x̃: 14 HURT stats (rel) min: 0.03% max: 12.97% x̄: 1.04% x̃: 0.31% 95% mean confidence interval for cycles value: -13.51 -8.64 95% mean confidence interval for cycles %-change: -0.60% -0.46% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8212357 -> 8206587 (-0.07%) instructions in affected programs: 323664 -> 317894 (-1.78%) helped: 1457 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 3.96 x̃: 3 helped stats (rel) min: 0.33% max: 11.49% x̄: 1.86% x̃: 1.44% 95% mean confidence interval for instructions value: -4.14 -3.78 95% mean confidence interval for instructions %-change: -1.93% -1.78% Instructions are helped. total cycles in shared programs: 187668016 -> 187657422 (<.01%) cycles in affected programs: 14856234 -> 14845640 (-0.07%) helped: 1372 HURT: 83 helped stats (abs) min: 2 max: 24 x̄: 7.92 x̃: 6 helped stats (rel) min: 0.02% max: 1.14% x̄: 0.12% x̃: 0.08% HURT stats (abs) min: 2 max: 14 x̄: 3.20 x̃: 2 HURT stats (rel) min: 0.03% max: 0.60% x̄: 0.12% x̃: 0.12% 95% mean confidence interval for cycles value: -7.65 -6.91 95% mean confidence interval for cycles %-change: -0.11% -0.10% Cycles are helped. 
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:41:46 -08:00
Eric Engestrom	fc82ea1350	Revert "swr/rast: Archrast codegen updates" This reverts the following commits: `71a76a47cc` "swr/codegen: fix autotools build" `7763e664ce` "meson/swr: replace hard-coded path with current_build_dir()" `773b3ceaca` "swr/rast: Fix autotools and scons codegen" `16e10b8c30` "swr/rast: Add general SWTag statistics" `b45a15a39f` "swr/rast: Add string handling to AR event framework" `8608a747aa` "swr/rast: Add initial SWTag proto definitions" `93cd9905c8` "swr/rast: Cleanup and generalize gen_archrast" The last one in this list broke all the build systems that can build this (meson, autotools & scons). See MR !304 for more details: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/304 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-01 16:46:32 +00:00
Fritz Koenig	12af6b30a3	freedreno/a6xx: Enable UBWC modifier Adding the supported_modifiers allows buffers to be created with UBWC	2019-03-01 15:51:16 +00:00
Fritz Koenig	4715e7a98a	freedreno: UBWC allocator UBWC requires space for a metadata or flag buffer that contains compression data. Each 16x4 tile of image data corresponds to a byte of compression data. This buffer needs to be stored before (at a lower address than) the image buffer in order to match what the display driver expects. This allows the display driver to scan out a UBWC buffer directly.	2019-03-01 15:51:16 +00:00
Fritz Koenig	3e6758a4e7	freedreno/a6xx: UBWC support Universal bandwidth compression (UBWC) reduces memory bandwidth by compressing buffers. This compression takes the form of a full-sized image buffer as well as a smaller metadata buffer.	2019-03-01 15:51:16 +00:00
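The sizing rule in the message (one metadata byte per 16x4 tile, metadata placed at the lower address) can be sketched as below. This is an illustration only. The struct, the function names, and the assumption that tiles are measured in pixels are hypothetical, and any alignment or padding a real driver needs is deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define UBWC_TILE_WIDTH  16u  /* tile width from the commit message */
#define UBWC_TILE_HEIGHT 4u   /* tile height from the commit message */

/* Hypothetical layout description: the metadata (flag) buffer comes
 * first, the image buffer follows at a higher address. */
struct ubwc_layout_sketch {
    size_t meta_offset;
    size_t meta_size;
    size_t image_offset;
    size_t image_size;
};

static size_t div_round_up(size_t n, size_t d)
{
    return (n + d - 1) / d;
}

/* One metadata byte per (partial) 16x4 tile; no alignment applied. */
static struct ubwc_layout_sketch
ubwc_layout(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    struct ubwc_layout_sketch l;
    l.meta_offset  = 0;
    l.meta_size    = div_round_up(width, UBWC_TILE_WIDTH) *
                     div_round_up(height, UBWC_TILE_HEIGHT);
    l.image_offset = l.meta_offset + l.meta_size;
    l.image_size   = (size_t)width * height * bytes_per_pixel;
    return l;
}
```

A 64x16 surface covers 4x4 tiles in this model, so it needs 16 metadata bytes ahead of the image data.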
Fritz Koenig	41082446db	freedreno: pass count to query_dmabuf_modifiers query_dmabuf_modifiers needs to know the max number of modifiers that the list will hold.	2019-03-01 15:51:16 +00:00
Eric Engestrom	2793417ec6	anv: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	258e463db5	anv: remove spaces around kwargs assignment pylint complains: > C0326: No space allowed around keyword argument assignment Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	7b704fd2fd	anv: drop unused parameter I'm guessing a previous version of this script used an index-based map of entrypoints, but that's not the case anymore. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	b503d4e458	anv: simplify chained comparison Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Caio Marcelo de Oliveira Filho	1458aa1f78	nir/copy_prop_vars: handle indirect vector elements Unlike the direct case, the indirect array derefs of vectors are handled like regular derefs, with the exception that we ignore any vector entry that has SSA values when performing a load. Such SSA values don't help loading of the indirect unless we emit an if-ladder. Copy_derefs are supported for indirects. Also enable two tests that now pass. v2: Remove unnecessary temporaries. Be clearer when identifying the case where copy_entry doesn't help when we are dealing with an indirect array_deref (of a vector). (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:55:31 -08:00
Caio Marcelo de Oliveira Filho	6c0de78cc2	nir/copy_prop_vars: prefer using entries from equal derefs When looking up an entry to use, always prefer an equal match, as it is more likely to contain reusable SSA or derefs to propagate. This will be necessary when adding entries with array derefs of vectors, because we don't want the vector if the equal entry (an array deref of that vector) is present. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:55:31 -08:00
Caio Marcelo de Oliveira Filho	61965afd00	nir/copy_prop_vars: add tests for indirect array deref Both on an actual array and on a vector, and an extra test on a vector mixing direct and indirect access. The vector tests are disabled and will be enabled by a later commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:55:31 -08:00
Caio Marcelo de Oliveira Filho	96c32d7776	nir/copy_prop_vars: handle load/store of vector elements When direct array deref is used on a vector type (for loads and stores), copy_prop_vars is now smart enough to propagate values it knows about. Given a 'vec4 v', storing to v[3] will update the copy entry for v and it is equivalent to a write to v.w. Loading from v[1] will try first to see if there's a known value for v.y -- and drop the load in that case. The copy entries still always refer to the entire vectors, so the operations happen on the parent deref (the 'vector') and the values are fixed accordingly. It might be the case now that certain entries have not only different SSA defs in each element but also those come from different components than they are set to, because stores to individual elements always come from a SSA definition with a single component. Tests related to these cases are now enabled. v2: Instead of asserting on invalid indices, "load" an undef and remove the store. (Jason) v3: Merge code path for the cases of is_array_deref_of_vector into the regular code path. Add a base_index parameter to value_set_from_value. (code changes by Jason) v4: Removed the get_entry_for_deref helper, now being used only once. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:50:05 -08:00
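The per-element bookkeeping described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the pass's real data structure: the entry type, the plain integer values standing in for SSA definitions, and the function names are all hypothetical.

```c
#include <stdbool.h>

#define SKETCH_VEC_COMPONENTS 4  /* models a 'vec4 v' */

/* Hypothetical copy entry for a whole vector: one known value per
 * component, with plain ints standing in for SSA definitions. */
struct vec_entry_sketch {
    bool known[SKETCH_VEC_COMPONENTS];
    int  value[SKETCH_VEC_COMPONENTS];
};

/* Storing to v[index] updates the entry of the parent vector, so a
 * store to v[3] is recorded as a write to v.w. Out-of-range stores
 * are dropped instead of asserting. */
static void store_element(struct vec_entry_sketch *e, unsigned index, int value)
{
    if (index >= SKETCH_VEC_COMPONENTS)
        return;
    e->known[index] = true;
    e->value[index] = value;
}

/* Loading v[index] first checks for a known value for that component.
 * Returns true when the load can be replaced by the known value. */
static bool load_element(const struct vec_entry_sketch *e, unsigned index, int *out)
{
    if (index >= SKETCH_VEC_COMPONENTS || !e->known[index])
        return false;
    *out = e->value[index];
    return true;
}
```

Keeping one entry per vector, rather than one per element, matches the message's note that copy entries always refer to entire vectors.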
Caio Marcelo de Oliveira Filho	33dafdc024	nir/copy_prop_vars: use NIR_MAX_VEC_COMPONENTS Also replace uses of 0xf with the appropriate full mask created from the number of components. Note that an increase of MAX might make us change how the data is stored later on, but for now at least we make sure the pass is not hardcoded. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:50:05 -08:00
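Replacing a hardcoded 0xf with a mask derived from the component count can be sketched as follows. The helper name is hypothetical, and the 32-bit mask width is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with the low num_components bits set, so a
 * four-component vector yields 0xf without hardcoding it. */
static uint32_t full_component_mask(unsigned num_components)
{
    /* Shifting a 32-bit value by 32 or more is undefined in C. */
    assert(num_components >= 1 && num_components < 32);
    return (UINT32_C(1) << num_components) - 1;
}
```

With this form, raising the maximum component count changes the mask automatically instead of requiring every 0xf literal to be found and updated.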
Caio Marcelo de Oliveira Filho	e84c841fb0	nir/copy_prop_vars: rename/refactor store_to_entry helper The name reflected this function's role back when the pass also did dead write elimination. So rename it to what it does now, which is setting a value using another value; and narrow the argument list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:50:05 -08:00
Christian Gmeiner	6c61449251	etnaviv: fix compile warnings Fixes the following compile warnings: [591/629] Compiling C object 'src/gallium/drivers/etnaviv/df32d18@@etnaviv@sta/etnaviv_context.c.o'. ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_context.c: In function 'etna_cmd_stream_reset_notify': ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_context.c:334:22: warning: unused variable 'entry' [-Wunused-variable] struct set_entry entry; ^~~~~ [604/629] Compiling C object 'src/gallium/drivers/etnaviv/df32d18@@etnaviv@sta/etnaviv_resource.c.o'. ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_resource.c: In function 'etna_resource_used': ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_resource.c:649:22: warning: unused variable 'entry' [-Wunused-variable] struct set_entry entry; ^~~~~ Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-03-01 08:45:05 +01:00
Christian Gmeiner	64813541d5	etnaviv: fix resource usage tracking across different pipe_context's A pipe_resource can be shared by all the pipe_context's hanging off the same pipe_screen. Changes from v2 -> v3: - add locking with mtx_*() to resource and screen (Marek) Changes from v3 -> v4: - drop rsc->lock, just use screen->lock for the entire serialization (Marek) - simplify etna_resource_used() flush condition, which also prevents potentially flushing resources twice (Marek) - don't remove resources from screen->used_resources in etna_cmd_stream_reset_notify(), they may still be used in other contexts and may need flushing there later on (Marek) Changes from v4 -> v5: - Fix coding style issues reported by Guido Changes from v5 -> v6: - Add missing locking in etna_transfer_map(..) (Boris) Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Marek Vasut <marex@denx.de> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-03-01 08:08:56 +01:00
Christian Gmeiner	f1061fa577	etnaviv: enable ETC2 texture compression support for HALTI0 GPUs Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Christian Gmeiner	5d09325c1c	etnaviv: hook-up etc2 patching Changes v1 -> v2: - Avoid the GPU sampling from the resource that gets mutated by the transfer map by setting DRM_ETNA_PREP_WRITE. Changes v2 -> v3: - make use of likely(..) - drop minor optimization regarding rsc->layout == ETNA_LAYOUT_LINEAR - better documentation why DRM_ETNA_PREP_WRITE is needed Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Christian Gmeiner	d8177f6233	etnaviv: keep track of mapped bo address Saves us from calling etna_bo_map(..) and saves us from doing the same offset calcs for map() and unmap() operations. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Christian Gmeiner	5bb4e6956d	etnaviv: implement ETC2 block patching for HALTI0 ETC2 is supported with HALTI0, however that implementation is buggy in hardware. The blob driver does per-block patching to work around this. We need to swap colors for t-mode etc2 blocks. Changes v2 -> v3: - Drop redundant format check Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Jason Ekstrand	e8f863e718	intel/compiler: Re-prefix non-logical surface opcodes with VEC4 The scalar back-end uses SHADER_OPCODE_SEND for all surface messages so we no longer need the non-logical opcodes there. Prefix them VEC4 so it's clear that they're only used by the vec4 back-end. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	95ae400abc	intel/schedule_instructions: Move some comments Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	aeaba24fcb	intel/compiler: Drop unused surface opcodes The unused typed surface read/write support in the vec4 back-end has been dropped and the fs back-end now uses SHADER_OPCODE_SEND for all image and buffer ops. There's no reason to keep these opcodes around anymore. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	a04c737215	intel/fs: Get rid of the IMAGE_SIZE opcode Since switching to SHADER_OPCODE_SEND for image operations, we no longer need the non-logical opcode. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	10b7d14c31	intel/vec4: Drop dead code for handling typed surface messages Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	9d437f9482	intel/fs: Drop the fs_surface_builder All of the actual abstraction (except possibly setting size_written) happens as part of the logical opcodes. The only thing that the surface builder is providing at this point is extra levels of functions to call through. I'm going to be adding bindless image support soon and all the extra abstraction here is just getting in the way. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	494a0543e6	intel/fs: Re-order logical surface arguments It makes more sense to start at the surface then move on to the address and then the data. Also, this is a really good test of whether or not we got all the places that use the sources by explicit integer number. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	94f8fd9a0c	intel/fs: Add an enum type for logical sampler inst sources Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jose Fonseca	838c0485e0	scons: Workaround failures with MSVC when using SCons 3.0.[2-4]. This change applies the workaround suggested by Bill Deegan on the affected SCons versions. It also adds a comment with the URL explaining why we were customizing the decider and max_drift in the first place, as I had forgotten all about it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109443 Tested-by: liviuprodea@yahoo.com Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-02-28 21:26:15 +00:00
Kristian H. Kristensen	87c2e8cbc9	freedreno: Fix a couple of warnings Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-28 10:43:53 -08:00
Kristian H. Kristensen	a5a19d1bc8	freedreno/a6xx: Don't zero SO buffer addresses Just disable SO in VPC_SO_BUF_CNTL. Less noise in dumps. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-28 10:43:53 -08:00
Kristian H. Kristensen	7dee916105	freedreno/a6xx: Only output MRT control for used framebuffers Not much of an optimization, but makes for less noise in the command buffer dumps. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-28 10:43:53 -08:00
Eric Engestrom	df5cd51259	gitlab-ci: install xmllint to validate 00-mesa-defaults.conf Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-28 17:30:48 +00:00
Eric Engestrom	bb6b691c57	driconf: add DTD to allow the drirc xml (00-mesa-defaults.conf) to be validated This DTD can be used to validate the drirc xml: $ xmllint --noout --valid 00-mesa-defaults.conf Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-28 17:30:44 +00:00
Eric Engestrom	4c3b293242	vulkan: use VkBase{In,Out}Structure instead of a custom struct VkBaseInStructure and VkBaseOutStructure are part of vulkan_core.h (which is part of vulkan.h) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-28 16:25:59 +00:00
Lionel Landwerlin	add4b8930a	vulkan/overlay: add support for fps output in file Also make the sampling period configurable. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-28 12:40:57 +00:00
Lionel Landwerlin	b6b275212d	vulkan/overlay: rework option parsing Makes adding new options easier. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-28 12:40:57 +00:00
Lionel Landwerlin	4e29a1d36a	vulkan/overlay: fix min/max computations This shouldn't be conditional on the acquire time being visible. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-28 12:40:57 +00:00
Emil Velikov	7ad1a05c83	egl/sl: use kms_swrast with vgem instead of a random GPU VGEM and kms_swrast were introduced to work with one another. All we do is CPU rendering to dumb buffers. There is no reason to carve out GPU memory, increasing the memory pressure on a device that could make a better use of it. Note: - The original code did not work out of the box, since the dumb buffer ioctls are not exposed to render nodes. - This requires libdrm commit 3df8a7f0 ("xf86drm: fallback to MODALIAS for OF less platform devices") - The non-kms, swrast is unaffected by this change. v2: - elaborate what and how is/isn't working (Eric) - simplify driver_name handling (Eric) v3: - move node_type outside of the loop (Eric) - kill no longer needed DRM_RENDER_DEV_NAME define Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-02-28 12:05:03 +00:00
Emil Velikov	218c7b5aca	egl/sl: use drmDevice API to enumerate available devices This provides for a more comprehensive iteration and slightly more straight-forward codebase. v2: - s/dpy/disp/ - keep original 64 devices (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-02-28 12:02:38 +00:00
Emil Velikov	893421f315	egl/sl: split out swrast probe into separate function Make the code a bit easier to read. As a bonus point this makes it obvious that we forgot to call _eglAddDevice() for the device - do so. v2: - s/dpy/disp/ (Eric) - free(driver_name) on dri2_load_driver_swrast() failure (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-02-28 12:02:19 +00:00
Juan A. Suarez Romero	b43b55d461	nir/spirv: return after emitting a branch in block When emitting a branch in a block, it does not make sense to continue processing further instructions, as they will not be reachable. This fixes a nasty case with a loop with a branch whose then-part and else-part both exit the loop: %1 = OpLabel OpLoopMerge %2 %3 None OpBranchConditional %false %2 %2 %3 = OpLabel OpBranch %1 %2 = OpLabel [...] We know that block %1 will always branch to block %2, which is the merge block for the loop. And thus a break is emitted. If we keep processing further instructions, we will be processing the branch conditional and thus emitting the proper NIR conditional, which leads to instructions after the break. This fixes dEQP-VK.graphicsfuzz.continue-and-merge. CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 09:47:06 +01:00
Eric Engestrom	0c3287e94d	egl/android: replace magic 0=CbCr,1=CrCb with simple enum Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-28 07:44:46 +00:00
Caio Marcelo de Oliveira Filho	6a553bedcc	st/nir: count num_uniforms for FS builtin shader Usually the uniforms will be assigned locations and have their slots counted automatically, but for builtin shaders the location assignment is manual. So count them too otherwise we get num_uniforms == 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-27 22:18:24 -08:00
Ray Zhang	b344e32cdf	glx: fix shared memory leak in X11 call XShmDetach to allow X server to free shared memory Fixes: `bcd80be49a` "drisw/glx: use XShm if possible" Signed-off-by: Ray Zhang <zhanglei002@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-02-28 14:23:02 +10:00
Timothy Arceri	e907337fad	radeonsi/nir: move si_lower_nir() call into compiler thread This helps improve compile times. For example the shader-db dolphin shader shaders/dolphin/ubershaders/120.shader_test goes from ~1.69 -> ~1.57 seconds on my machine with this change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-28 11:54:06 +11:00
Timothy Arceri	7536af670b	glsl: fix shader cache for packed param list Some types of params such as some builtins are always padded. We need to keep track of this so we can restore the list correctly. Here we also remove a couple of cache entries that are not actually required as they get rebuilt by the _mesa_add_parameter() calls. This patch fixes a bunch of arb_texture_multisample and arb_sample_shading piglit tests for the radeonsi NIR backend. Fixes: `edded12376` ("mesa: rework ParameterList to allow packing") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-28 11:47:37 +11:00
Yevhenii Kolesnikov	07f4b4e403	i965: Fix allow_higher_compat_version workaround limited by OpenGL 3.0 Added check for higher compat profile being allowed before assigning certain extensions. Fixes: `272fe94942` (mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile) Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107052	2019-02-28 10:25:16 +11:00
Lionel Landwerlin	6e184147dd	intel/compiler: use correct swizzle for replacement The optimization in `4cd1a0be76` introduced a replacement of : cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D ... cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D By : cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D ... mov(8) vgrf11.y:D, vgrf15.yyyy:D The first cmp instruction is storing in x while the second mov is sourcing from y. We need to take into account where the replacement on the scan_inst destination is going to store things so that the replacement mov can source things from the correct location. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4cd1a0be76` ("i965/vec4: Propagate conditional modifiers from more compares to other compares") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-27 20:06:42 +00:00
Jonathan Marek	61e3188633	freedreno: catch failing fd_blit and fallback to software blit Fixes cases where the fd_blit fails and never happens (ex: blit to etc1) Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-02-27 18:46:28 +00:00
Jonathan Marek	e3591b0339	freedreno: use renderonly path for buffers allocated with modifiers Now that freedreno has create_with_modifiers(), this "hack" is needed to make some cases work. Copied from vc4. Fixes: `41ddf1d1` Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-02-27 18:46:28 +00:00
Jonathan Marek	6c0fefb448	freedreno: a2xx: fix mipmapping for NPOT textures Fixes: `3a273a4a` Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-02-27 18:46:28 +00:00
Jonathan Marek	4f23767590	freedreno: a2xx: fix fast clear for some gmem configurations In freedreno_gmem.c, gmem_align of 0x8000 is used. Alignment used here should be the same. Fixes: `912a9c8d` Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-02-27 18:46:28 +00:00
Jonathan Marek	8eca6df5ed	freedreno: a2xx: add use_hw_binning function Fixes: `cb2322c7` Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-02-27 18:46:28 +00:00
Jonathan Marek	357313ab0f	freedreno: a2xx: don't write 4th vertex in mem2gmem There is only room for 3 vertices now (RECT has 3 vertices). Fixes: `6ef7700a` Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-02-27 18:46:28 +00:00
Erik Faye-Lund	71a76a47cc	swr/codegen: fix autotools build When the output directory was changed, the BUILT_SOURCES and build-rule target-path was no longer correct, leading to races to generate the sources and compiling them. Fix this by updating both sets of paths, so automake sees what's going on here. Fixes: `773b3ceaca` ("swr/rast: Fix autotools and scons codegen") Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-02-27 17:59:06 +00:00
Timo Aaltonen	738626daca	util/os_misc: Add check for PIPE_OS_HURD Fix build on Hurd. Signed-off-by: Timo Aaltonen <tjaalton@debian.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-27 14:56:48 +00:00
Lionel Landwerlin	2fff5966d6	vulkan/overlay: install layer binary in libdir This will allow multilib. v2: Drop path from json file, dlopen should be able to locate the lib in libdir v3: Switch from configure_file to install_data (Dylan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109788 Tested-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-27 11:45:42 +00:00
Eric Engestrom	7763e664ce	meson/swr: replace hard-coded path with current_build_dir() Fixes: `93cd9905c8` "swr/rast: Cleanup and generalize gen_archrast" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alok Hota <alok.hota@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-02-27 11:13:05 +00:00
Gert Wollny	b7201a468d	nir: Add possibility to not lower to source mod 'abs' for ops with three sources This is useful for r600 since there the abs source modifier is not supported for ops with three sources v2: Use correct logic to enable lowering to abs source mod (Eric Anholt) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-27 11:04:06 +00:00
Gurchetan Singh	ce112fcc87	virgl/vtest: deprecate protocol version 1 This is a partial revert of 9d81cd ("virgl: Pass resource size and transfer offsets"). The adjustments made in the client code mean there are various mismatches when transferring data. Let's fall back to protocol version 0 and deprecate protocol version 1. We can still use the protocol version 1 slots for a shared memory transfer mechanism later. Fixes: dEQP-GLES31.functional.copy_image.mixed.viewclass_128_bits_mixed.*_renderbuffer Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-02-27 11:02:29 +00:00
Tapani Pälli	b9acfef337	util: fix a warning when building against clang7 headers Header xmmintrin.h conditionally includes emmintrin.h that defines _MM_DENORMALS_ZERO_MASK, add ifndef to fix this warning. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-27 08:57:41 +02:00
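The guard described in the util commit above (b9acfef337) can be sketched as follows. This is a minimal illustration, not the actual Mesa source: the helper name `denormals_zero_mask` is hypothetical, and the fallback value 0x0040 is assumed to be the denormals-are-zero bit of the SSE control register.

```c
#include <assert.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* may pull in a header that already defines the mask */
#endif

/* Only define the mask when no included header has done so already;
 * an unconditional #define triggers a redefinition warning. */
#ifndef _MM_DENORMALS_ZERO_MASK
#define _MM_DENORMALS_ZERO_MASK 0x0040   /* assumed value of the DAZ bit */
#endif

/* Hypothetical helper that returns whichever definition is in effect. */
static unsigned denormals_zero_mask(void)
{
    return (unsigned)_MM_DENORMALS_ZERO_MASK;
}
```

With the `#ifndef` in place, the same source builds without the warning whether or not the toolchain headers provide the macro.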
Tapani Pälli	d1af8115f8	iris: add libmesa_iris_gen8 library to the build Patch fixes iris build on Android. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-27 08:57:41 +02:00
Tapani Pälli	5e52184f72	android: make libbacktrace optional on USE_LIBBACKTRACE Otherwise with VNDK enabled we fail linking: src/gallium/targets/dri/Android.mk: error: gallium_dri (native:vendor) should not link to libbacktrace.vendor (native:vndk_private) Option makes it possible to use libbacktrace only when VNDK is not enabled. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-27 08:56:46 +02:00
Tapani Pälli	a3c366c4b2	android: add liblog to libmesa_intel_common build Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-27 08:53:09 +02:00
Alyssa Rosenzweig	b7a5b81d14	panfrost/midgard: Allow flt to run on most units Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-27 03:56:56 +00:00
Alyssa Rosenzweig	4c82abb9b6	panfrost: Expose perf counters in environment Previously, we were guarded by an #ifdef, which is generally bad form. This patch instead guards them behind an environment variable. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-27 03:56:38 +00:00
Alyssa Rosenzweig	60270c83b5	panfrost: Identify 4-bit channel texture formats Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-27 03:56:17 +00:00
Alyssa Rosenzweig	90fd82c540	panfrost: Add RGB565, RGB5A1 texture formats Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-27 03:55:19 +00:00
Jose Maria Casanova Crespo	4122665dd9	iris: Enable ARB_shader_draw_parameters support Additional VERTEX_ELEMENT_STATE are used to store basevertex and baseinstance and drawid updating the DWordLength of the 3DSTATE_VERTEX_ELEMENTS command. This passes all piglit tests for spec.draw_parameters. tests and VK-GL-CTS KHR-GL45.shader_draw_parameters_tests.* tests. Now we only mark a dirty_update when parameters are changed or when we have an indirect draw. We enable PIPE_CAP_DRAW_PARAMETERS on Iris. There is no edge flag support in the Vertex Elements setup. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-26 13:28:38 -08:00
Pierre Moreau	1c9fdcefd4	clover: Fix indentation issues Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Pierre Moreau	5285fff5f9	clover: Only use devices supporting IR_NATIVE Currently clover will advertise any device that advertises PIPE_CAP_COMPUTE, even if they do not support PIPE_SHADER_IR_NATIVE, which is the IR used internally by clover. This avoids clover advertising devices as available even though they actually are not supported. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Pierre Moreau	8f9b4a2be6	clover: Move platform extensions definitions to clover/platform.cpp Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Aaron Watry <awatry@gmail.com>	2019-02-26 21:02:07 +01:00
Pierre Moreau	b033620abf	clover: Move device extensions definitions to core/device.cpp Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Aaron Watry <awatry@gmail.com>	2019-02-26 21:02:07 +01:00
Pierre Moreau	d42f5896c5	clover: Validate program and library linking options Program linking options are only valid if the library was created with the `-enable-link-options` option, which itself is only valid when creating a library, and only when creating an executable. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Pierre Moreau	fccc6ecb52	clover: Disallow creating libraries from other libraries If creating a library, do not allow non-compiled object in it, as executables are not allowed, and libraries would make it really hard to enforce the "-enable-link-options" flag. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Aaron Watry <awatry@gmail.com>	2019-02-26 21:02:07 +01:00
Pierre Moreau	bad161c894	clover/api: Fail if trying to build a non-executable binary From the OpenCL 1.2 Specification, Section 5.6.2 (about clBuildProgram): > If program is created with clCreateProgramWithBinary, then the > program binary must be an executable binary (not a compiled binary or > library). Reviewed-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Pierre Moreau	25d4e65eb7	clover/api: Rework the validation of devices for building Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Pierre Moreau	505ec3a530	clover: Add a helper for checking if an IR is supported Reviewed-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Pierre Moreau	67769c913f	clover: Remove the TGSI backend as unused Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Pierre Moreau	669d00ba4c	clover: Avoid warnings from new OpenCL headers * Avoid warnings from references to deprecated CL 1.0, 1.2, 2.0 and 2.1 APIs. * Avoid warnings from not defining CL_TARGET_OPENCL_VERSION. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-02-26 21:02:07 +01:00
Karol Herbst	ba8d21a8d3	clover: update ICD table to support everything up to 2.2 Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-02-26 21:02:07 +01:00
Pierre Moreau	dddc5649bf	include/CL: Update to the latest OpenCL 2.2 headers Acked-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-02-26 21:02:07 +01:00
Marek Olšák	2ae07830e7	gallium/u_tests: use a compute-only context to test GCN compute ring	2019-02-26 14:58:55 -05:00
Marek Olšák	a1378639ab	radeonsi: always use compute rings for clover on CI and newer (v2) initialize all non-compute context functions to NULL. v2: fix SI	2019-02-26 14:58:55 -05:00
Bas Nieuwenhuizen	c0110477b5	radv: Interpolate less aggressively. Seems like dxvk used integer builtins without setting the flat interpolation decoration. I believe in the current spec the app is required to set these, but in the meantime to avoid breaking things in stable releases (and so close to release for 19.0), only expand the interpolation to float16 and struct (which cannot be builtins as our spirv parser lowers the builtin block). Fixes: `f324784104` "radv: Allow interpolation on non-float types." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-26 18:51:35 +00:00
Drew Davenport	1fd79b4b6d	util: Don't block SIGSYS for new threads SIGSYS is needed for programs using seccomp for sandboxing. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-26 19:39:14 +01:00
Rob Clark	64206102fc	freedreno/ir3: gsampler2DMSArray fixes Array index should come before sample-id. And exclude all isam variants (which take integer texel coords) from adding of offset. Fixes dEQP-GLES31.functional.texture.multisample.samples_1.use_texture_*_2d_array Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Rob Clark	a06bb486b0	freedreno/ir3/a6xx: fix atomic shader outputs We also need to put in the output mov. Possibly we could just fixup the output register to read it directly from the dummy, but that is more work and I guess dEQP is probably the only time you encounter this. Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.const_literal_fragment Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Rob Clark	db1fa21374	freedreno/a6xx: vertex_id is not _zero_based Fixes dEQP-GLES31.functional.draw_base_vertex.draw_elements_base_vertex.builtin_variable.vertex_id Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Rob Clark	79180a0566	freedreno/a6xx: fix DRAW_IDX_INDIRECT max_indicies The indirect offset does not affect the index buffer size. Fixes all of dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_100x100_drawcount_* with drawcount > 1. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Rob Clark	cabe55a2e7	freedreno/ir3/a6xx: fix non-ssa atomic dst We weren't propagating the array info for cases where result of atomic is array/reg. This can happen, for example, if result is part of a phi web lowered to regs. Fixes dEQP-GLES31.functional.ssbo.atomic.compswap.* Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Rob Clark	edd5b3126d	freedreno/a6xx: fix ssbo alignment Fixes a bunch of deqp ssbo tests that use multiple ssbo blocks packed into a single buffer. Note the a5xx value seems suspicious, but this is what blob seems to advertise. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Rob Clark	cb884d8ab2	freedreno/ir3: use nopN encoding when possible Use the (nopN) encoding for slightly denser shaders.. this lets us fold nop instructions into the previous alu instruction in certain cases. Shouldn't change the # of cycles a shader takes to execute, but reduces the size. (ex: glmark2 refract goes from 168 to 116 instructions) Currently only enabled for a6xx, but I think we could enable this for a5xx and possibly a4xx. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Rob Clark	04c2520d91	freedreno/a6xx: fix hangs with large shaders We were overflowing instrlen (which is # of groups of 16 instructions) in a couple dEQP tests, causing gpu hangs: dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20 Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-26 13:19:44 -05:00
Brian Paul	6dabcb5bcf	mesa: fix display list corner case assertion This fixes a failed assertion in glDeleteLists() for the following case: list = glGenLists(1); glDeleteLists(list, 1); when those are the first display list commands issued by the application. When we generate display lists, we plug in empty lists created with the make_list() helper. This function uses the OPCODE_END_OF_LIST opcode but does not call dlist_alloc() which would set the InstSize[OPCODE_END_OF_LIST] element to non-zero. When the empty list was deleted, we failed the InstSize[opcode] > 0 assertion. Typically, display lists are created with glNewList/glEndList so we set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc(). That's why this bug wasn't found before. To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST] element in make_list(). The game oolite was hitting this. Fixes: https://github.com/OoliteProject/oolite/issues/325 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-26 09:56:45 -07:00
Brian Paul	cb52d4482d	svga: fix dma.pending > 0 test The dma.pending field is boolean, so testing for > 0 isn't right. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-02-26 09:56:45 -07:00
Brian Paul	96ea977c79	svga: assorted whitespace and formatting fixes Remove trailing whitespace, etc. Trivial.	2019-02-26 09:56:45 -07:00
Brian Paul	a81eebf9bc	st/mesa: whitespace/formatting fixes in st_cb_texture.c Remove trailing whitespace, replace tabs w/ spaces, etc. Trivial.	2019-02-26 09:56:45 -07:00
Eleni Maria Stea	fd37a19ac4	i965: fixed clamping in set_scissor_bits when the y is flipped Calculating the scissor rectangle fields with the y flipped (0 on top) can generate negative values that will cause assertion failure later on as the scissor fields are all unsigned. We must clamp the bbox values again to make sure they don't exceed the fb_height. Also fixed a calculation error. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 https://bugs.freedesktop.org/show_bug.cgi?id=109594 v2: - I initially clamped the values inside the if (Y is flipped) case and I made a mistake in the calculation: the clamp of the bbox[2] should be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I shouldn't have changed the ScissorRectangleYMax calculation. As the fixed code is equivalent with using CLAMP instead of MAX2 at the top of the function when bbox[2] and bbox[3] are calculated, and the 2nd is more clear, I replaced it. (Nanley Chery) v3: - Reversed the CLAMP change in bbox[3] as the API guarantees that the viewport height is positive. (Nanley Chery) v4: - Added nomination for the mesa-stable branch and the link to the second bugzilla bug (Nanley Chery) CC: <mesa-stable@lists.freedesktop.org> Tested-by: Paul Chelombitko <qamonstergl@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-02-26 08:23:26 -08:00
Eduardo Lima Mitev	0bf667984b	freedreno/a6xx: Silence compiler warnings util_format_compose_swizzles() expects 'const unsigned char' and we are feeding it 'char'. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-02-26 14:15:33 +01:00
Kasireddy, Vivek	7cab8d3661	i965: Add support for sampling from XYUV images Add support to the i965 DRI driver to sample from XYUV8888 buffers. Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 13:08:52 +00:00
Kasireddy, Vivek	65600d0946	dri: Add XYUV8888 format In addition to adding this format to the dri_interface header, add an entry in the android and wayland backends as well. Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 13:08:52 +00:00
Vivek Kasireddy	ff14d06be5	drm-uapi: Update headers from drm-next Pull new updates from drm-next as of the following commit: commit a5f2fafece141ef3509e686cea576366d55cabb6 Merge: 71f4e45a4ed3 860433ed2a55 Author: Dave Airlie <airlied@redhat.com> Date: Wed Feb 20 12:16:30 2019 +1000 Merge https://gitlab.freedesktop.org/drm/msm into drm-next Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 13:08:51 +00:00
Kasireddy, Vivek	78fb3fd17e	nir/lower_tex: Add support for XYUV lowering The memory layout associated with this format would be: Byte: 0 1 2 3 Component: V U Y X Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 13:08:51 +00:00
Lionel Landwerlin	913d711e0f	imgui: update memory editor Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-26 12:49:07 +00:00
Lionel Landwerlin	ab9ae080ec	imgui: update commit In commit `3950e7c11e` ("imgui: bump copy") I forgot to update the README about what copy of imgui we carry. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-26 12:49:04 +00:00
Eric Engestrom	a213b927f2	driinfo: add DTD to allow the xml to be validated This DTD can be used to validate the output and make sure any parsers out there can handle it: $ xmllint --noout --valid driinfo.xml Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-26 12:48:28 +00:00
Lionel Landwerlin	9646750822	vulkan/overlay: fix includes The Loader/Validation-Layers repository allows the user to choose where header files are installed. On my system I choose /usr/include thinking it was the obvious "base" location, but it turns out the headers end up being installed right there rather than in a vulkan subdirectory. On Debian/Ubuntu the selected installation path is /usr/include/vulkan, so just go with that. Hopefully other distros don't choose another path. Note that the validation layer doesn't provide a .pc file so we have no way of querying where the headers are installed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 12:29:54 +00:00
Lionel Landwerlin	47ef52d333	vulkan/overlay: fix missing installation of layer Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 12:29:46 +00:00
Eric Engestrom	318e550549	dri_interface: add missing #include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-26 12:03:20 +00:00
Eric Engestrom	7f5d9c2757	gitlab-ci: always run the containers build If the first time a fork was created, the job creating the containers was manually cancelled, this would have left the fork unable to use the CI (until the next automatic regeneration of the container). Avoid this by always running the container-generation job, even though 99% of the time it will spin up, see that the container exists and shut down. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-02-26 12:02:14 +00:00
Emil Velikov	40a82e6463	docs: mention "Allow commits from members who can merge..." Mention the tick-box otherwise only the MR author can rebase the series. Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-26 11:27:10 +00:00
Emil Velikov	d9d1cb43d7	egl/android: bump the number of drmDevices to 64 It's the current maximum supported by the kernel. Stay consistent with the rest of Mesa and use the same number. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-26 11:07:23 +00:00
Emil Velikov	02344fe80b	loader: use loader_open_device() to handle O_CLOEXEC Some platforms lack O_CLOEXEC. The loader_open_device() handles those appropriately, so use the helper. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 11:07:23 +00:00
Emil Velikov	f0a7b463b5	meson: egl: correctly manage loader/xmlconfig Earlier commit introduced support for haiku yet did not properly annotate the loader/xmlconfig dependencies. Thus we ended up adding inc_loader for each !haiku platform - see `659910eda0` `9a96bf0ecd` `c731508b98` `ec6cb01e21`. One piece remained though - the wayland platform. Hence the following would fail: meson -Dgallium-drivers=etnaviv -Ddri-drivers=''\ -Dtools=etnaviv -Dplatforms=wayland -Dglx=disabled \ build/ Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Reported-by: Boris Brezillon <boris.brezillon@collabora.com> Fixes: `834d221512` ("meson: Add Haiku platform support v4") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-02-26 11:07:23 +00:00
Emil Velikov	9d84a922b8	egl/dri: de-duplicate dri2_load_driver* The difference between the three functions is the list of mandatory driver extensions. Pass that as an argument to the common helper. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-26 11:07:23 +00:00
Samuel Pitoiset	4924dfc851	radv: don't copy buffer descriptors list for samplers Sampler descriptors don't have a buffer list. This fixes some crashes with new CTS dEQP-VK.binding_model.descriptor_copy.*.sampler_*. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-26 11:22:28 +01:00
Samuel Pitoiset	9256e0a09d	radv: fix out-of-bounds access when copying descriptors BO list We shouldn't increment the buffer list pointers twice. This fixes some crashes with new CTS dEQP-VK.binding_model.descriptor_copy.*. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-26 11:22:22 +01:00
Tapani Pälli	1d5e5ec30a	nir: use nir_variable_create instead of open-coding the logic Fixes: `3d7611e9` "st/nir: use NIR for asm programs" Reported-by: Matthias Lorenz <oschowa@web.de> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-26 09:00:36 +02:00
Tapani Pälli	22267feff1	nir: initialize value in copy_prop_vars_block Fixes following valgrind warning: ==27561== Conditional jump or move depends on uninitialised value(s) ==27561== at 0x667856B: value_set_ssa_components (nir_opt_copy_prop_vars.c:78) ==27561== by 0x667A1C4: copy_prop_vars_block (nir_opt_copy_prop_vars.c:797) Fixes: `62332d139c` "nir: Add a local variable-based copy propagation pass" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-26 08:56:25 +02:00
Eric Anholt	97566efe5c	v3d: Rematerialize MOVs of uniforms instead of spilling them. If we have a MOV of a uniform value available to spill, that's one of our best choices. We can just not spill the value, and emit a new load of the uniform as the fill. This saves bothering the TMU and the thrsw, and is the same cost in uniforms (since the spill offset is a uniform anyway). This doesn't have a huge impact on shader-db, since there aren't a whole lot of spills and we usually copy-prop the uniforms at the VIR level such that the only uniform MOVs are from vir_lower_uniforms: total instructions in shared programs: 6430292 -> 6430279 (<.01%) total uniforms in shared programs: 2386023 -> 2385787 (<.01%) total spills in shared programs: 4961 -> 4960 (-0.02%) total fills in shared programs: 6352 -> 6350 (-0.03%) However, I'm interested in dropping the uniforms copy-prop in the backend, since it would be cheaper to not load repeated uniforms if we have the registers to spare. This also saves many spills on dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20, which is what motivated a bunch of my recent backend work in the first place: before: 46 spills, 106 fills, 3062 instructions after: 0 spills, 0 fills, 2611 instructions	2019-02-25 21:33:47 -08:00
Eric Anholt	e0fada983d	v3d: Dump the VIR after register spilling if we were forced to. Spilling is unusual, but one often has to debug it when it happens, so dump it.	2019-02-25 21:26:24 -08:00
Eric Anholt	2786d2161a	v3d: Fix vir_is_raw_mov() for input unpacks. There are no users at the moment, but I wanted to start using this in register spilling.	2019-02-25 21:26:24 -08:00
Mathias Fröhlich	1ab2159249	st/mesa: Reduce array updates due to current changes. Since using bitmasks we can easily check if we have any current value that is potentially uploaded on array setup. So check for any potential vertex program input that is not already a vao enabled array. Only flag array update if there is a potential overlap. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-02-26 05:42:04 +01:00
Dylan Baker	6f42303646	meson/iris: Use current coding style Just a few minor style things. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-25 23:37:27 +00:00
Timothy Arceri	603206d0a6	radeonsi: fix query buffer allocation Fix the logic for buffer full check on alloc. This patch just takes the fix Nicolai attached to the bug report and updates it to work on master. Fixes: `e0f0d3675d` ("radeonsi: factor si_query_buffer logic out of si_query_hw") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561	2019-02-26 09:55:41 +11:00
Eric Anholt	7c1bf075f3	nir: Just return when asked to rewrite uses of an SSA def to itself. The nir_builder swizzling improvement to not emit extra MOVs resulted in nir_lower_tex() trying to rewrite an SSA def to itself, triggering the assert on all texturing in v3d. There's no work to be done in this case, so just stop asserting. Fixes: `743700be1f` ("nir/builder: Don't emit no-op swizzles") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-25 21:25:24 +00:00
Samuel Pitoiset	5671f38085	radv: fix clearing attachments in secondary command buffers If no framebuffer is bound, get the number of samples and the image format from the render pass. This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-25 21:42:50 +01:00
Alok Hota	773b3ceaca	swr/rast: Fix autotools and scons codegen Use new input flags for gen_archrast.py Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-25 13:05:39 -06:00
Alok Hota	16e10b8c30	swr/rast: Add general SWTag statistics Update Archrast parser to use stats, used with an internal tool Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-25 13:05:36 -06:00
Alok Hota	b45a15a39f	swr/rast: Add string handling to AR event framework For use by an internal tool Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-25 13:05:31 -06:00
Alok Hota	8608a747aa	swr/rast: Add initial SWTag proto definitions Update gen_archrast.py to properly generate event IDs Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-25 13:05:17 -06:00
Alok Hota	93cd9905c8	swr/rast: Cleanup and generalize gen_archrast Update meson.build to accommodate Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-25 13:05:07 -06:00
Daniel Schürmann	0bd45f96b9	nir: Use SM5 properties to optimize shift(a@32, iand(31, b)) This is a common pattern from HLSL->SPIRV translation and supported in HW by all current NIR backends. vkpipeline-db results anv (SKL): total instructions in shared programs: 6403130 -> 6402380 (-0.01%) instructions in affected programs: 204084 -> 203334 (-0.37%) helped: 208 HURT: 0 total cycles in shared programs: 1915629582 -> 1918198408 (0.13%) cycles in affected programs: 1158892682 -> 1161461508 (0.22%) helped: 107 HURT: 86 shader-db results on i965 (KBL): total instructions in shared programs: 15284592 -> 15284568 (<.01%) instructions in affected programs: 81683 -> 81659 (-0.03%) helped: 24 HURT: 0 total cycles in shared programs: 375013622 -> 375013932 (<.01%) cycles in affected programs: 40169618 -> 40169928 (<.01%) helped: 13 HURT: 9 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-25 12:59:44 -06:00
Daniel Schürmann	0525bdc225	nir: Define shifts according to SM5 specification. SPIR-V shifts are undefined for values >= bitsize, but SM5 shifts are defined to only use the least significant bits. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-25 12:59:43 -06:00
Jason Ekstrand	c4fb6b0c81	intel/eu: Add an EOT parameter to send_indirect_[split]_message For split indirect sends we have to put the EOT parameter in the extended descriptor as well as the instruction itself so just calling brw_inst_set_eot is insufficient. Moving the EOT handling into the send_indirect_[split]_message helper lets us handle it properly.	2019-02-25 11:35:12 -06:00
Sergii Romantsov	dcc4866419	d3d: meson: do not prefix user provided d3d-drivers-path The user can select the location where the d3d drivers are installed by the d3d-drivers-path meson option. By default path will be $prefix/$libdir/d3d. Currently we add $prefix to the user provided path, resulting in an incorrect or even missing path. Based on logic of Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698 CC: Kenneth Graunke <kenneth@whitecape.org> CC: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-25 16:07:02 +00:00
Sergii Romantsov	f6556ec7d1	dri: meson: do not prefix user provided dri-drivers-path The user can select the location where the dri drivers are installed by the dri-drivers-path meson option. By default path will be $prefix/$libdir/dri. Currently we add $prefix to the user provided path, resulting in an incorrect or even missing path. v2: fixed dri_search_path by default, rebased to master v3: new commit-message (Emil Velikov), cc mesa-stable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698 CC: Rafael Antognolli <rafael.antognolli@intel.com> CC: Dylan Baker <dylan@pnwbakers.com> Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Fixes: `306914db92` (meson: Add dridriverdir variable to dri.pc.) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-25 16:07:02 +00:00
Lionel Landwerlin	30828f4646	intel/aub_viewer: silence more compiler warnings format not a string literal and no format arguments. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-25 13:11:16 +00:00
Lionel Landwerlin	91df8b1780	intel/aub_viewer: silence compiler warning buffer_addr may be used uninitialized. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-25 13:11:13 +00:00
Lionel Landwerlin	f1da10e0c5	intel/aub_viewer: printout 48bits addresses Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-25 13:11:05 +00:00
Gert Wollny	875942c059	mesa/core: Enable EXT_depth_clamp for GLES >= 2.0 The extension NV_depth_clamp is written against OpenGL 1.2.1, and since GLES 2.0 is based on GL 2.0 there is no reason not to enable this extension also for GLES >= 2.0. v2: Use EXT_depth_clamp that has been proposed to Khronos v3: - Fix check for extension availability (Erik Faye-Lund) - Also fix the test in is_enabled v4: - Test both, ARB and EXT extension (Erik) v5: - Fix white space errors (Erik) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-25 09:44:27 +00:00
Kenneth Graunke	b45186a6cd	iris: Properly allow rendering to RGBX formats. I was converting them at pipe_surface creation time, but not when answering queries about whether formats support rendering. This caused a lot of FBO incomplete errors for formats that ought to be supported. Fixes "Child of Light", which uses PIPE_FORMAT_R8G8B8X8_UNORM_SRGB. Also fixes Witcher 1 using wined3d (GL) according to Timur Kristóf. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109738	2019-02-25 01:11:27 -08:00
Kenneth Graunke	fce089c8a2	iris: Drop RGBX -> RGBA for storage image usages GLSL doesn't expose RGB/RGBX image formats, so this isn't needed.	2019-02-25 00:57:50 -08:00
Kenneth Graunke	6921588d54	mesa: Fix RGBBuffers for renderbuffers with sized internal formats For texture attachments, 'f' is texImg->_BaseFormat, but for renderbuffer attachments, 'f' is att->Renderbuffer->InternalFormat. InternalFormat may be something like GL_RGB8, which causes our (f == GL_RGB) check to fail. Switch to using a proper _BaseFormat, which drops the size. Fixes dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.15 on iris when combined with a driver fix. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>	2019-02-25 00:57:42 -08:00
Oscar Blumberg	da9c030763	glsl: Fix function return typechecking apply_implicit_conversion only converts and check base types but we need actual type equality for function returns, otherwise you can return a vec2 from a function declared as returning a float. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-25 08:49:06 +02:00
Jordan Justen	bd0ad651e0	iris: Always use in-tree i915_drm.h Ref: `f1374805a8` "drm-uapi: use local files, not system libdrm" Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-24 21:06:40 -08:00
Alyssa Rosenzweig	f943047e48	panfrost: Decode render target swizzle/channels On MRT-capable systems, the framebuffer format is encoded as a 64-bit word in the render target descriptor. Previously, the two 32-bit words were exposed as opaque hex values. This commit identifies a 12-bit Mali swizzle and a 2-bit channel counter, removing some of the magic. It also adds decoding support for the AFBC and MSAA enable bits, which were already known but otherwise ignored in pandecode. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-25 04:49:50 +00:00
Alyssa Rosenzweig	c6be9969d2	panfrost/midgard: Add fround(_even), ftrunc, ffma These ops were discovered by invoking the correspondingly named GLSL functions. The rounding ops here behave exactly as expected and are mapped to their corresponding NIR ops where applicable. The ffma behaves as a LUT instruction and requires some special argument packing (since Midgard normally only allows for 2 arguments); this quirk will be addressed in the future, but for now FMA is still lowered. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-25 02:36:26 +00:00
Alyssa Rosenzweig	4a4726af3c	panfrost/nondrm: Split out dump_counters Previously, this function was implied a part of the job submit. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-25 02:34:16 +00:00
Alyssa Rosenzweig	cdca103d43	panfrost/nondrm: Make COHERENT_LOCAL explicit This flag corresponds to what was MEM_COHERENT_LOCAL in the vendor driver, which seems to influence the cache policy, necessary for the varying temporary storage but nothing else. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-25 02:32:45 +00:00
Alyssa Rosenzweig	f44d4653a9	panfrost/nondrm: Flag CPU-invisible regions Potentially, the kernel could optimize these allocations, or perhaps we can save on mapping costs. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-25 02:31:09 +00:00
Alyssa Rosenzweig	10cc251842	panfrost/meson: Remove subdir for nondrm This change fixes cross builds with the (temporary) non-DRM overlay. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-25 02:27:26 +00:00
Alyssa Rosenzweig	77fea552f6	panfrost: Use tiler fast path (performance boost) For reasons that are still unclear (speculation included in the comment added in this patch), the tiler? metadata has a fast path that we were not enabling; there looks to be a possible time/memory tradeoff, but the details remain unclear. Regardless, this patch improves performance dramatically. Particular wins are for geometry-heavy scenes. For instance, glmark2-es2's Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious when vsync is disabled, as in glmark2-es2-wayland. With this patch, on GLES 2.0 samples not involving FBOs, it appears performance is converging with (and sometimes surpassing) the blob. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-25 02:25:50 +00:00
Jason Ekstrand	743700be1f	nir/builder: Don't emit no-op swizzles The nir_swizzle helper is used some on its own but it's also called by nir_channel and nir_channels which are used everywhere. It's pretty quick to check while we're walking the swizzle anyway whether or not it's an identity swizzle. If it is, we now don't bother emitting the instruction. Sure, copy-prop will clean it up for us but there's no sense making more work for the optimizer than we have to. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-24 20:01:27 -06:00
Jason Ekstrand	724371c6b9	nir/split_vars: Don't compact vectors unnecessarily Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-24 20:01:18 -06:00
Erik Faye-Lund	7a6a5d4bfa	st/mesa: remove unused header-file This header has been unused since `f8f2520e88` ("st/mesa: Remove unnecessary headers"). And in the more than 8 years since, this hasn't been useful. So let's just get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-24 20:53:37 +01:00
Maya Rashish	021c496135	configure: fix test portability From the bash manual: string1 == string2 string1 = string2 True if the strings are equal. = should be used with the test command for POSIX conformance.	2019-02-24 19:26:15 +00:00
David Shao	6fa923a65d	meson: ensure that xmlpool_options.h is generated for gallium targets that need it Fixes: `68076b8747` "meson: build gallium vdpau state tracker" Fixes: `22a817af8a` "meson: build gallium xvmc state tracker" Fixes: `5a785d51a6` "meson: build gallium va state tracker" Fixes: `0ba909f0f1` "meson: build gallium xa state tracker" Fixes: `1d36dc674d` "meson: build gallium omx state tracker" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-24 09:00:39 +00:00
Matthias Lorenz	f91654120b	vulkan/overlay: Add fps counter Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109747	2019-02-24 01:07:26 +00:00
Lionel Landwerlin	239b0d8570	Revert "anv: add support for INTEL_DEBUG=bat" This reverts commit `e4d88396d2`. Apologies, I pushed the wrong commit.	2019-02-24 01:06:39 +00:00
Lionel Landwerlin	e4d88396d2	anv: add support for INTEL_DEBUG=bat As requested by Ken ;) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-23 23:29:04 +00:00
Christian Gmeiner	c56e734496	etnaviv: blt: mark used src resource as read from Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-02-23 16:00:50 +01:00
Christian Gmeiner	7244e76804	etnaviv: rs: mark used src resource as read from Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-02-23 16:00:25 +01:00
Vinson Lee	2bd08b8b9d	gallium/auxiliary/vl: Fix duplicate symbol build errors. CXXLD gallium_dri.la duplicate symbol _compute_shader_video_buffer in: ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o) ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o) duplicate symbol _compute_shader_weave in: ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o) ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o) duplicate symbol _compute_shader_rgba in: ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o) ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o) Fixes: `9364d66cb7` ("gallium/auxiliary/vl: Add video compositor compute shader render") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: James Zhu <James.Zhu@amd.com>	2019-02-22 23:07:26 -08:00
Caio Marcelo de Oliveira Filho	4c160b6bd8	nir: fix MSVC build Zero initialize struct with {0} instead of {}.	2019-02-22 22:38:05 -08:00
Caio Marcelo de Oliveira Filho	eb13211997	nir/copy_prop_vars: add tests for load/store elements of vectors Test using array deref on vectors in loads and stores. These are marked DISABLED_ as this optimization is currently not done. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho	4f3809d389	nir: nir_build_deref_follower accept array derefs of vectors Code itself already supports it, just make sure we can use it for those cases. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho	c4beadd28e	nir/copy_prop_vars: change test helper to get intrinsics Replace find_next_intrinsic(intrinsic, after) with get_intrinsic(intrinsic, index). This makes slightly more convenient to check the resulting loads/stores/copies, since in most tests we know which one we care about. The cost is to perform more traversals, but for such tests this is not a problem. Added the ASSERT_EQ() on count to some tests missing it, so the indices queried are always expected to find something. Also, drop two nir_print_shader leftover calls in a test. v2: Remove redundant assertions. nir_src_comp_as_uint already assert what we need. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho	fdcb9779d9	nir/copy_prop_vars: keep track of components in copy_entry When a copy_entry is SSA, store not only the nir_ssa_def* for each component, but also the source component they come from. At the moment this is always a match (i.e. 'component[i] == i'), because all the operations for a copy_entry happen using definitions with the same size. This prepares the code for array_derefs of vectors, in which 'component[i] != i'. Also, extract setting all SSA components into a function of its own. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho	6624decbb5	nir/copy_prop_vars: add debug helpers Disabled by default, to be used during development. Adding those so I don't rewrite some ad-hoc version of them everytime I'm working with this pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 21:00:50 -08:00
Caio Marcelo de Oliveira Filho	60d9bb9ff5	nir/copy_prop_vars: don't get confused by array_deref of vectors For now these derefs are not handled, so don't let these get into the copies list -- which would cause wrong propagations. For load_derefs, do nothing. For store_derefs, invalidate whatever the store is writing to. For copy_derefs, invalidate whatever the copy is writing to. These cases will happen once derefs to SSBOs/UBOs are kept around long enough to get optimized by copy_prop_vars. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 21:00:50 -08:00
Timothy Arceri	f48527e51a	nir: allow nir_lower_phis_to_scalar() on more src types Rather than only lowering if all srcs are scalarizable we instead check that at least one src is scalarizable. We change undef type to return false otherwise it will cause regressions when it is the only scalarizable src. total instructions in shared programs: 13219105 -> 13024547 (-1.47%) instructions in affected programs: 1153797 -> 959239 (-16.86%) helped: 581 HURT: 74 total cycles in shared programs: 333968972 -> 324807922 (-2.74%) cycles in affected programs: 129809402 -> 120648352 (-7.06%) helped: 571 HURT: 131 total spills in shared programs: 57947 -> 29130 (-49.73%) spills in affected programs: 53364 -> 24547 (-54.00%) helped: 351 HURT: 0 total fills in shared programs: 51310 -> 25468 (-50.36%) fills in affected programs: 44882 -> 19040 (-57.58%) helped: 351 HURT: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-23 11:11:51 +11:00
Alok Hota	6053499f2e	swr/rast: bypass size limit for non-sampled textures This fixes a bug where SWR will fail to render in cases with large buffer allocations, e.g. very large meshes whose vertex buffers exceed 2GB CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-22 23:35:11 +00:00
Marek Olšák	b326a15eda	tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics This might have decreased performance for radeonsi/tgsi, because most shaders claimed they used bindless. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-02-22 18:00:54 -05:00
Jordan Justen	cf652205cf	iris: Add gitlab-ci build testing Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-22 14:08:21 -08:00
Rob Clark	fd360c82f0	freedreno/a6xx: cube image fix Note that emit_intrinsic_load_image() already swaps a .3d flag with an .a flag. I tried doing things the other way around (going back to .3d) but that didn't work. And treating cube images as 2d array is also what blob does, so let's just go with that. Fixes dEQP-GLES31.functional.image_load_store.cube.load_store.* Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-22 14:05:32 -05:00
Rob Clark	f90c3b4485	freedreno/a6xx: fix border-color offset Fixes nearly all of dEQP-GLES31.functional.texture.border_clamp.* when run after a test that binds textures used in vertex shader. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-22 14:05:32 -05:00
Rob Clark	bdedb8277a	freedreno/ir3: don't hardcode wrmask Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.const_literal.vertex.samplercubeshadow and few other similar tests that do multiple texture fetches into individual components of a packet output. Mostly works around the issue mentioned in ra_block_find_definers(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-22 14:05:32 -05:00
Rob Clark	5d4fa194b8	freedreno: fix race condition rsc->write_batch can be cleared behind our back, so we need to acquire the lock before deref'ing. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-22 14:05:32 -05:00
Kenneth Graunke	3090c6b9e9	vulkan: Fix 32-bit build for the new overlay layer vulkan_core.h defines non-dispatchable handles as (struct object *) on 64-bit systems, but uint64_t on 32-bit systems. The former can be implicitly cast to void *, but the latter requires an explicit cast. While here, %lu is the wrong format specifier for uint64_t on 32-bit systems, so use PRIu64, fixing a warning. Reported-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-22 08:56:54 -08:00
Juan A. Suarez Romero	4f917e6a61	anv: advertise 8 subpixel precision bits On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is used to select between 8 bit subpixel precision (value 0) or 4 bit subpixel precision (value 1). As this value is not set, means it is taking the value 0, so 8 bit are used. On the other side, in the Vulkan CTS tests, if the reference rasterizer, which uses 8 bit precision, as it is used to check what should be the expected value for the tests, is changed to use 4 bit as ANV was advertising so far, some of the tests will fail. So it seems ANV is actually using 8 bits. v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason) v3: use _8Bit definition as value (Jason) v4: (by Jason) anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect This field was added on gen8 even though there's an identically defined one in 3DSTATE_SF. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Kenneth Graunke <kenneth@whitecape.org> CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 17:53:55 +01:00
Juan A. Suarez Romero	3b423eeb2d	genxml: add missing field values for 3DSTATE_SF Fill out "Vertex Sub Pixel Precision Select" possible values. CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 17:53:45 +01:00
Bas Nieuwenhuizen	f324784104	radv: Allow interpolation on non-float types. In particular structs containing floats and 16-bit floating point types. Fixes: `62024fa775` "radv: enable VK_KHR_16bit_storage extension / 16bit storage features" Fixes: `da29594636` "spirv: Only split blocks" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109735 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-22 17:06:55 +01:00
Bas Nieuwenhuizen	a1fdd4a4a7	radv: Fix float16 interpolation set up. float16 types can have non-flat interpolation so set up the HW correctly for that. Fixes: `62024fa775` "radv: enable VK_KHR_16bit_storage extension / 16bit storage features" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-22 17:06:55 +01:00
Ilia Mirkin	ae2cb72804	nv50: disable compute It causes more trouble than it's worth. Now vl tries to create compute shaders without all the proper checking. Since there's really no (current) way to use compute on nv50, just mark it disabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109742 Fixes: `f6ac0b5d71` ("gallium/auxiliary/vl: Add compute shader to support video compositor render") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-02-22 09:42:41 -05:00
Lionel Landwerlin	1d626fc028	intel: fix urb size for CFL GT1 The same 192KB amount as SKL/KBL GT1 applies. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Fixes: `de7ed0ba55` ("i965/CFL: Add PCI Ids for Coffee Lake.")	2019-02-22 11:53:49 +00:00
Samuel Iglesias Gonsálvez	bd2c5a8203	isl: the display engine requires 64B alignment for linear surfaces v2: Add PRM quote (Lionel) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-22 11:45:45 +00:00
Gert Wollny	2ee197d6e8	virgl: Enable mixed color FBO attachments only when the host supports it Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2019-02-22 10:44:08 +01:00
Mauro Rossi	338dacc341	android: intel/isl: remove redundant build rules Fixes the following build error: including ./external/mesa/Android.mk ... build/core/base_rules.mk:183: * external/mesa/src/intel: MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel. make: * [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1 ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c and that source file includes the isl_tiled_memcpy.c source Fixes: `96bb328` ("iris: add Android build") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-22 07:56:11 +02:00
Kenneth Graunke	b21de090d6	Revert "iris: Enable auxiliary buffer support" This reverts commit `cd0ced49e7`. It breaks glxgears rendering.	2019-02-21 15:50:46 -08:00
Kenneth Graunke	e2cb0c5e0e	iris: Enable -msse2 and -mstackrealign This is needed for gen_clflush.h intrinsics to work on 32-bit builds. i965 and anv both set these, and iris needs to as well. Tested-by: Mark Janes <mark.a.janes@intel.com>	2019-02-21 14:51:15 -08:00
Francisco Jerez	7272fe9c08	intel/fs: Rely on undocumented unrestricted regioning for 32x16-bit integer multiply. Even though the hardware spec claims that any "integer DWord multiply" operation is affected by the regioning restrictions of CHV/BXT/GLK, this is inconsistent with the behavior of the simulator and with empirical evidence -- Return false from has_dst_aligned_region_restriction() for such instructions as a micro-optimization. Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	e03be78252	intel/fs: Implement extended strides greater than 4 for IR source regions. Strides up to 32B can be implemented for the source regions of most instructions by leveraging either the vertical or the horizontal stride of the hardware Align1 region. The main motivation for this is that currently the lower_integer_multiplication() pass will happily double the stride of one of the 32-bit sources, which can blow up if the stride of the original source was already the maximum value allowed by the hardware. An alternative would be to use the regioning legalization pass in order to lower such strides into the composition of multiple legal strides, but that would be somewhat less efficient. This showed up as a regression from my commit `cbea91eb57` in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a pre-existing problem that had affected conformance on other platforms without native support for integer multiplication. CHV/BXT were getting around it because the code I removed in that commit had the "fortunate" side effect of emitting narrower regions that didn't hit the hardware stride limit after lowering. Beyond fixing the regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on ICL (that's why this patch is marked for inclusion in mesa-stable even though the original regressing patch was not). According to Jason, a nearly equivalent change had been committed previously as `e8c9e65185` and then (mistakenly?) reverted as `a31d038208`. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 Reported-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	7f9f6263c1	intel/fs: Cap dst-aligned region stride to maximum representable hstride value. This is required in combination with the following commit, because otherwise if a source region with an extended 8+ stride is present in the instruction (which we're about to declare legal) we'll end up emitting code that attempts to write to such a region, even though strides greater than four are still illegal for the destination. Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	e2f475ddff	intel/fs: Lower integer multiply correctly when destination stride equals 4. Because the "low" temporary needs to be accessed with word type and twice the original stride, attempting to preserve the alignment of the original destination can potentially lead to instructions with illegal destination stride greater than four. Because the CHV/BXT alignment restrictions are now being enforced by the regioning lowering pass run after lower_integer_multiplication(), there is no real need to preserve the original strides anymore. Note that this bug can be reproduced on stable branches, but back-porting would be non-trivial, because the fix relies on the regioning lowering pass recently introduced. Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Francisco Jerez	c3c27762f7	intel/fs: Exclude control sources from execution type and region alignment calculations. Currently the execution type calculation will return a bogus value in cases like: mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u Which will be considered to have a 32-bit integer execution type even though the actual indirect move operation will be carried out with 16-bit precision. Similarly there's no need to apply the CHV/BXT double-precision region alignment restrictions to such control sources, since they aren't directly involved in the double-precision arithmetic operations emitted by these virtual instructions. Applying the CHV/BXT restrictions to control sources was expected to be harmless if mildly inefficient, but unfortunately it exposed problems at codegen level for virtual instructions (namely the SHUFFLE instruction used for the Vulkan 1.1 subgroup feature) that weren't prepared to accept control sources with an arbitrary strided region. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 Reported-by: Mark Janes <mark.a.janes@intel.com> Fixes: `efa4e4bc5f` "intel/fs: Introduce regioning lowering pass." Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 14:07:25 -08:00
Timothy Arceri	d9e08e753b	nir: clone instruction set rather than removing individual entries This reduces the time spent in nir_opt_cse() by almost half. The massif tool from valgrind reported no change in peak memory use with the large dolphin uber shaders I used for testing. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 08:36:36 +11:00
Jordan Justen	cd0ac3a6af	genxml: Remove extra space in gen4/45/5 field name Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 13:17:10 -08:00
Jordan Justen	a9b0b72a78	genxml/gen_bits_header.py: Use regex to strip non-alphanumeric chars Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 13:15:59 -08:00
Kenneth Graunke	cd0ced49e7	iris: Enable auxiliary buffer support This currently regresses KHR-GL4x.compute_shader.resource-texture, but that's a pre-existing bug (https://bugs.freedesktop.org/109113) which should be fixed up once we have fast clear support.	2019-02-21 10:26:12 -08:00
Rafael Antognolli	db81445837	iris: Flag ALL_DIRTY_BINDINGS on aux state change. If we change the aux state for a given resource, we need to re-emit the binding table pointers for any stage that has such resource bound. Since we don't track that, flag IRIS_ALL_DIRTY_BINDINGS and emit all of them.	2019-02-21 10:26:12 -08:00
Rafael Antognolli	95589652a1	iris: Skip resolve if there's no context. If iris_resource_get_handle() gets called without a context, we can't resolve the resource. Hopefully it shouldn't be compressed anyway, so let's just add an assert to ensure it's correct.	2019-02-21 10:26:12 -08:00
Rafael Antognolli	36138bb7fc	iris/clear: Pass on render_condition_enabled.	2019-02-21 10:26:12 -08:00
Rafael Antognolli	8190165d13	iris: Avoid leaking if we fail to allocate the aux buffer. Otherwise we could leak the aux state map or the aux BO.	2019-02-21 10:26:12 -08:00
Kenneth Graunke	7da53d7188	iris: Only resolve compute resources for compute shaders	2019-02-21 10:26:12 -08:00
Kenneth Graunke	95a36bd55c	iris: Fix aux usage in render resolve code	2019-02-21 10:26:12 -08:00
Rafael Antognolli	4f191feb0c	iris: Pin HiZ buffers when rendering.	2019-02-21 10:26:12 -08:00
Rafael Antognolli	dfd54f9954	iris: Flush before hiz_exec.	2019-02-21 10:26:12 -08:00
Kenneth Graunke	f3f7d45a63	iris: Allow disabling aux via INTEL_DEBUG options	2019-02-21 10:26:12 -08:00
Kenneth Graunke	4634b754f4	iris: do flush for buffers still	2019-02-21 10:26:12 -08:00
Kenneth Graunke	15822f33ad	iris: make surface states for CCS_D too CCS_E can fall back to CCS_D with incompatible format views. CCS_D is pretty useless without fast clears and we may as well use NONE, but we're surely going to hook those up at some point, so may as well just go ahead and do it now...	2019-02-21 10:26:12 -08:00
Rafael Antognolli	689b590069	iris: Skip msaa16 on gen < 9. Also needed to add gen information to KEY_INIT.	2019-02-21 10:26:12 -08:00
Kenneth Graunke	fd2038b22a	iris: Set program key fields for MCS	2019-02-21 10:26:12 -08:00
Kenneth Graunke	92c310fd3f	iris: don't use hiz for MSAA buffers	2019-02-21 10:26:12 -08:00
Kenneth Graunke	2cddc953cd	iris: some initial HiZ bits	2019-02-21 10:26:12 -08:00
Kenneth Graunke	9b1126c990	iris: disable aux for external things	2019-02-21 10:26:12 -08:00
Kenneth Graunke	45f4dab62b	iris: Resolves for compute	2019-02-21 10:26:12 -08:00
Kenneth Graunke	ecc897b8ad	iris: consider framebuffer parameter for aux usages	2019-02-21 10:26:12 -08:00
Kenneth Graunke	b77d2dc71b	iris: Make blit code use actual aux usages	2019-02-21 10:26:12 -08:00
Kenneth Graunke	bfc76d3525	iris: store modifier info in res	2019-02-21 10:26:12 -08:00
Kenneth Graunke	56f1fe3eac	iris: pin the buffers	2019-02-21 10:26:12 -08:00
Kenneth Graunke	f8aa9aa353	iris: resolve before transfer maps	2019-02-21 10:26:12 -08:00
Kenneth Graunke	c53a67d469	iris: be sure to skip buffers in resolve code Buffers don't have ISL surfaces, and this can get us into trouble.	2019-02-21 10:26:12 -08:00
Kenneth Graunke	5eb75345b8	iris: try to fix copyimage vs copybuffers	2019-02-21 10:26:12 -08:00
Kenneth Graunke	d8f3bc1c4c	iris: actually use the multiple surf states for aux modes	2019-02-21 10:26:12 -08:00
Kenneth Graunke	3c979b0e6d	iris: add some draw resolve hooks	2019-02-21 10:26:12 -08:00
Kenneth Graunke	53c484ba8a	iris: blorp using resolve hooks	2019-02-21 10:26:12 -08:00
Kenneth Graunke	77a1070d36	iris: Initial import of resolve code	2019-02-21 10:26:12 -08:00
Kenneth Graunke	f879349398	iris: create aux surface if needed	2019-02-21 10:26:12 -08:00
Kenneth Graunke	3efd5299af	iris: Fill out SURFACE_STATE entries for each possible aux usage	2019-02-21 10:26:12 -08:00
Kenneth Graunke	3cfc6a207b	iris: Fill out res->aux.possible_usages	2019-02-21 10:26:12 -08:00
Kenneth Graunke	a7bc4d6074	iris: Add iris_resource fields for aux surfaces But without fast clears or HiZ per-level tracking just yet.	2019-02-21 10:26:12 -08:00
Jordan Justen	d0996d5fab	iris: Emit default L3 config for the render pipeline Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:12 -08:00
Kenneth Graunke	51ddc40084	iris: Always emit at least one BLEND_STATE	2019-02-21 10:26:12 -08:00
Kenneth Graunke	d6dd57d43c	iris: Add missing depth cache flushes	2019-02-21 10:26:12 -08:00
Kenneth Graunke	1b5c342f33	iris: Simplify iris_get_depth_stencil_resources We can safely assume that the given resource is depth, depth/stencil, or stencil already. The stencil-only case is easily detectable with a single format check, and all other cases are handled identically. This saves some CPU overhead.	2019-02-21 10:26:12 -08:00
Kenneth Graunke	07ec1f0b25	iris: Make an IRIS_MAX_MIPLEVELS define	2019-02-21 10:26:12 -08:00
Rafael Antognolli	455c959689	iris: Store internal_format when getting resource from handle.	2019-02-21 10:26:12 -08:00
Kenneth Graunke	973f01d55a	iris: Move create and bind driver hooks to the end of iris_program.c This just moves the code for dealing with pipe_shader_state / pipe_compute_state / iris_uncompiled_shader to the end of the file. Now that those do precompiles, they want to call the actual compile functions. Putting them at the end eliminates the need for a bunch of prototypes.	2019-02-21 10:26:12 -08:00
Timur Kristóf	cacf84ed5f	iris: implement clearing render target and depth stencil v2 (Kenneth Graunke): split color/depthstencil cases, fix iris_clear	2019-02-21 10:26:12 -08:00
Kenneth Graunke	8ab82bd1fd	iris: Drop XXX about checking for swizzling Caio noted that this is not necessary on Gen8+: "Before Gen8, there was a historical configuration control field to swizzle address bit[6] for in X/Y tiling modes. This was set in three different places: TILECTL[1:0], ARB_MODE[5:4], and DISP_ARB_CTL[14:13]. For Gen8 and subsequent generations, the swizzle fields are all reserved, and the CPU's memory controller performs all address swizzling modifications." Since we don't support earlier hardware, we can skip it entirely.	2019-02-21 10:26:12 -08:00
Kenneth Graunke	bf23e79629	iris: Set HasWriteableRT correctly A bit of irritating state cross dependency here, but nothing too hard	2019-02-21 10:26:12 -08:00
Kenneth Graunke	d612cd1bf8	iris: Set 3DSTATE_WM::ForceThreadDispatchEnable The Vulkan driver only sets this if color writes are disabled, which is more conservative - but would require us to inspect blend state. (If color writes are enabled, we don't need to force anything, because the internal signal is already correct. But it shouldn't hurt to do so.)	2019-02-21 10:26:12 -08:00
Kenneth Graunke	27d751cdd8	iris: Drop XXX about alpha testing I was misreading i965 - the 3DSTATE_WM::PixelShaderKillsPixel bit from Gen < 8 needed all of this, but the 3DSTATE_PS_EXTRA bit only needs prog_data->uses_kill.	2019-02-21 10:26:12 -08:00
Andre Heider	bffb65d28e	iris: improve PIPE_CAP_VIDEO_MEMORY bogus value -1 is a little too bogus for most games ;) Signed-off-by: Andre Heider <a.heider@gmail.com>	2019-02-21 10:26:12 -08:00
Andre Heider	f89a578818	iris: fix build with gallium nine Signed-off-by: Andre Heider <a.heider@gmail.com>	2019-02-21 10:26:12 -08:00
Kenneth Graunke	be49fb051d	iris: Stop chopping off the first nine characters of the renderer string	2019-02-21 10:26:12 -08:00
Kenneth Graunke	15341778ba	iris: rework num textures to util_lastbit	2019-02-21 10:26:12 -08:00
Kenneth Graunke	974229df46	iris: Add PIPE_CAP_MAX_VARYINGS	2019-02-21 10:26:11 -08:00
Kenneth Graunke	1cd001aa63	iris: Make an iris_batch_reference_signal_syncpt helper function. Suggested by Chris Wilson. More obvious what's going on.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	9376799bd6	iris: Use READ_ONCE and WRITE_ONCE for snapshots_landed Suggested by Chris Wilson, if only to make it obvious to the human readers that these are volatile reads. It may also be necessary for the compiler in a few cases.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	18e31a9b31	iris: Fix accidental busy-looping in query waits When switching from bo_wait to sync-points, I missed that we turned an if (not landed) bo_wait into a while (not landed) check_syncpt(), which has a timeout of 0. This meant, rather than sleeping until the batch is complete, we'd busy-loop, continually asking the kernel "is the batch done yet???". This is not what we want at all - if we wanted a busy loop, we'd just loop on !snapshots_landed. We want to sleep. Add an effectively infinite timeout so that we sleep.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	3b1ac8244e	iris: Add a timeout_nsec parameter, rename check_syncpt to wait_syncpt I want to be able to wait with a non-zero timeout from elsewhere.	2019-02-21 10:26:11 -08:00
Sagar Ghuge	c24a574e6c	iris: Don't allocate a BO per query object Instead of allocating 4K BO per query object, we can create a large blob of memory and split it into pieces as required. Having one BO for multiple query objects, we don't want to wait on all of them, instead when we write last snapshot, we create a sync point, and check syncpoints while waiting on particular object. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-02-21 10:26:11 -08:00
Kenneth Graunke	a1ebac3750	iris: Implement ALT mode for ARB_{vertex,fragment}_shader Fixes gl-1.0-spot-light	2019-02-21 10:26:11 -08:00
Kenneth Graunke	732c3a90a4	iris: Fix bug in bound vertex buffer tracking res might be NULL, at which point this is an unbind.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	4bfd12bbf7	iris: minor tidying	2019-02-21 10:26:11 -08:00
Kenneth Graunke	b1bacbf038	iris: Unreference some more things on state module teardown	2019-02-21 10:26:11 -08:00
Kenneth Graunke	e092ed9213	iris: Drop dead state_size hash table I inherited this from i965. It would be nice to track the state size so INTEL_DEBUG=color,bat decoding can print the right number of e.g. binding table entries or blend states, but...without a single point of entry for state, it's a little tricky to get right. Punt for now, and drop the dead code in the meantime.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	6e41f1b459	iris: Drop comment about ISP_DIS i965 re-emits 3DSTATE_CONSTANT_* on every batch, so there's no point in restoring the constants from the context. Iris actually re-pins the constant buffers properly across the batch, and avoids re-emitting the constant packets unless it's necessary. So, we don't want ISP_DIS.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	edd3ce5a63	iris: Enable PIPE_CAP_COMPACT_ARRAYS	2019-02-21 10:26:11 -08:00
Kenneth Graunke	1db394f46b	iris: Remap stream output indexes back to VARYING_SLOT_*. Previously I had a hack in st/mesa to make it stop remapping VARYING_SLOT_* into the naively compacted slots, which aren't what we want. But that wasn't very feasible, as we'd have to update all drivers, or add capability bits, and it gets messy fast. It turns out that I can map back to VARYING_SLOT_* in about 5 LOC, so let's just do that. It removes the need for hacks, and is easy. This also fixes KHR-GL46.enhanced_layouts.xfb_capture_struct, which apparently with my hack was still getting the wrong slot info.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	5d3d757178	iris: Zero the compute predicate when changing the render condition 1. Set a render condition. We emit it immediately on the render engine, and stash q->bo as ice->state.compute_predicate in case the compute engine needs it. 2. Clear the render condition. We were incorrectly leaving a stale compute_predicate kicking around... 3. Dispatch compute. We would then read the stale compute predicate, and try to load it into MI_PREDICATE_DATA. But q->bo may have been freed altogether, causing us to try and use garbage memory as a BO, adding it to the validation list, failing asserts, and tripping EINVALs in execbuf. Huge thanks to Mark Janes for narrowing this sporadic GL CTS failure down to a list of 48 tests I could easily run to reproduce it. Huge thanks to the Valgrind authors for the memcheck tool that immediately pinpointed the problem.	2019-02-21 10:26:11 -08:00
Caio Marcelo de Oliveira Filho	4fd1f70e62	iris: always include an extra constbuf0 if using UBOs In st_nir_lower_uniforms_to_ubo() all UBO accesses in the shader have their index incremented to open room for uniforms in constbuf0. So if we use UBOs, we always need to include the extra binding entry in the table. To avoid doing these checks both when compiling the shader and when assigning binding tables, store the num_cbufs in iris_compiled_shader. Fixes a bunch of tests from Piglit and CTS that use UBOs but don't use uniforms or system values. Note that some tests fitting these criteria were passing because the UBOs were moved to be push constants (avoiding the problem). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 10:26:11 -08:00
Kenneth Graunke	4801af2f26	iris: Do binder address allocations per-context, not globally. iris_bufmgr allocates addresses across the entire screen, since buffers may be shared between multiple contexts. There used to be a single special address, IRIS_BINDER_ADDRESS, that was per-context - and all contexts used the same address. When I moved to the multi-binder system, I made a separate memory zone for them. I wanted there to be 2-3 binders per context, so we could cycle them to avoid the stalls inherent in pinning two buffers to the same address in back-to-back batches. But I figured I'd allow 100 binders just to be wildly excessive/cautious. What I didn't realize was that we need 2-3 binders per context, and what I did was allocate 100 binders per screen. Web browsers, for example, might have 1-2 contexts per tab, leading to hundreds of contexts, and thus binders. To fix this, we stop allocating VMA for binders in bufmgr, and let the binder handle it itself. Binders are per-context, and they can assign context-local addresses for the buffers by simply doing a ringbuffer style approach. We only hold on to one binder BO at a time, so we won't ever have a conflicting address. This fixes dEQP-EGL.functional.multicontext.non_shared_clear. Huge thanks to Tapani Pälli for debugging this whole mess and figuring out what was going wrong. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-21 10:26:11 -08:00
Kenneth Graunke	0f33204f05	iris: Fix memzone_for_address for the surface and binder zones We use > for IRIS_MEMZONE_DYNAMIC because IRIS_BORDER_COLOR_POOL_ADDRESS lives at the very start of that zone. However, IRIS_MEMZONE_SURFACE and IRIS_MEMZONE_BINDER are normal zones. They used to be a single zone (surface) with a single binder BO at the beginning, similar to the border color pool. But when I moved us to multiple binders, I made them have a real zone (if a small one). So both zones should use >=. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-21 10:26:11 -08:00
Kenneth Graunke	3bcb1a7fcd	iris: Don't whack SO dirty bits when finishing a BLORP op Re-emitting 3DSTATE_SO_BUFFERS can be hazardous, as it could zero offsets. Plus, it's just not necessary - BLORP doesn't change these.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	b9697dd820	iris: Fix SO issue with INTEL_DEBUG=reemit, set fewer bits INTEL_DEBUG=reemit was breaking streamout tests, by re-emitting 3DSTATE_SO_BUFFER commands that tell the HW to zero the SO write offsets. We would need to alter them to use 0xFFFFFFFF for the offset. Also, have each upload function only flag bits relevant to its own pipeline.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	61798e3c88	iris: CS stall on VF cache invalidate workarounds See commit `31e4c9ce40` in i965.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	c81941f1e7	iris: Pay attention to blit masks For combined depth/stencil formats, we may want to only blit one half. If PIPE_BLIT_Z is set, blit depth; if PIPE_BLIT_S is set, blit stencil.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	7837fec740	iris: Assert about blits with color masking st/mesa never asks for this today, but in theory someone might, and we don't support it.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	0f677b0d87	iris: Don't enable smooth points when point sprites are enabled dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_*.primitives.points	2019-02-21 10:26:11 -08:00
Kenneth Graunke	3b336a1513	iris: Allow sample mask of 0 I think this was an attempt to work around various sample mask bugs I had early on. It's not correct. A sample mask of 0 is legal and means to disable all samples. Fixes dEQP-GLES31.functional.texture.multisample..sample_mask*	2019-02-21 10:26:11 -08:00
Kenneth Graunke	e17333ea1e	iris: fail to create screen for older unsupported HW loader shouldn't try, but let's be paranoid	2019-02-21 10:26:11 -08:00
Kenneth Graunke	1f91f688e8	iris: Switch to the new PIPELINE_STATISTICS_QUERY_SINGLE capability I had a hack in place earlier to pass the query type as q->index for the regular statistics query, but we ended up adjusting the interface and adding a new query type. Use that instead, fixing pipeline statistics queries since the rebase.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	a23c06cabc	iris: Use new PIPE_STAT_QUERY enums rather than hardcoded numbers.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	5aef30b886	iris: Fix Broadwell WaDividePSInvocationCountBy4 We were dividing by 4 in calculate_result_on_gpu(), and also in iris_get_query_result(). We should stop doing the latter, and instead divide by 4 in calculate_result_on_cpu() as well. Otherwise, if snapshots were available, and you hit the calculate_result_on_cpu() path, but requested it be written to a QBO, you'd fail to get a divide.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	7f318bf2ac	iris: Delete genx->bound_vertex_buffers This is actually stored in ice->state, as it isn't gen-specific	2019-02-21 10:26:11 -08:00
Kenneth Graunke	02991e2878	iris: Drop a dead comment	2019-02-21 10:26:11 -08:00
Kenneth Graunke	572fad1e84	iris: Don't check other batches for our batch BO This is an awkward corner case. We create batches in order, each of which creates and pins a BO. The other batches may not be set up yet, so it may not be safe to ask whether they reference a BO. Just avoid this for now. We could avoid it for other context-local BOs too, but we currently don't have a flag for that (and I'm not certain whether it's worth it).	2019-02-21 10:26:11 -08:00
Kenneth Graunke	8eda6f2288	iris: Handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE somewhat Various places in the transfer code need to know whether they must read the existing resource's values. Rather than checking both flags everywhere, just make PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE also flag PIPE_TRANSFER_DISCARD_RANGE - if we can discard everything, we can discard a subrange, too. Obviously, we can do better for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE, but eventually u_threaded_context should handle swapping out buffers for new idle buffers, anyway. In the meantime, this is at least better.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	bacc722d13	iris: Flush the render cache in flush_and_dirty_for_history BLORP uses the render engine to write to buffers, and we need to flush that data out to the actual surface (finishing the write). Then, the rest of this function invalidates any caches that might have stale data which needs to be refetched.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	7a9e87c224	iris: Implement multi-slice copy_region I don't know if this is required - surprisingly, I haven't seen it matter - but I'd like to use it for multi-slice transfer maps. We may as well do the right thing.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	307f3f9924	iris: Leave a comment about why Broadwell images are broken There are a variety of ways to fix this, many of which are simple, but I could use some advice on which ones other people prefer, and so we'll punt until after the holidays.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	7ed1383c0a	iris: Fix surface states for Gen8 lowered-to-untyped images We have to use SURFTYPE_BUFFER and ISL_FORMAT_RAW for these.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	477e7d575b	iris: Fill out brw_image_params for storage images on Broadwell	2019-02-21 10:26:11 -08:00
Kenneth Graunke	7e35333c73	iris: Don't make duplicate system values We were relying on CSE/GVN/etc to coalesce all intrinsics that load the same value, but that's a bad idea. We might have a couple intrinsics that reload the same value. If so, we only want to set up the uniform on the first one we see.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	bc3bb28645	iris: Don't enable push constants just because there are system values System values are built-in uniforms. We set them up as UBO values, and might pull or push them. UBO push analysis will take care of that. We only want to enable push constants if there's an actual range being pushed. Otherwise, we might get into a scenario where 3DSTATE_PS enables push constants but 3DSTATE_CONSTANT_PS isn't pushing anything. This fixes GPU hangs in Broadwell image load store tests which have unused image param system values but no other uniforms. (We shouldn't be making those anyway, but that's a separate fix...)	2019-02-21 10:26:11 -08:00
Kenneth Graunke	2ca0d913ea	iris: Fix framebuffer layer count cso_fb->layers is only valid for no-attachment framebuffers. Use the helper function to get the real value, then stash it so we don't have to call the helper function on the old value for comparison, or at draw time for Force Zero RTA Index setting. This fixes Force Zero RTA Index being set even when attempting layered rendering.	2019-02-21 10:26:11 -08:00
Dave Airlie	df60241ff7	iris: handle qbo fragment shader invocation workaround	2019-02-21 10:26:11 -08:00
Dave Airlie	5ae2e5aa94	iris: add fs invocations query workaround for broadwell	2019-02-21 10:26:11 -08:00
Dave Airlie	8806b29e16	iris: setup gen8 caps	2019-02-21 10:26:11 -08:00
Dave Airlie	1bbf095473	iris: limit gen8 to 8 samples	2019-02-21 10:26:11 -08:00
Dave Airlie	823609b1a3	iris/WIP: add broadwell support This adds all the state changes, MOCS changes,	2019-02-21 10:26:11 -08:00
Kenneth Graunke	5be72d9a20	iris: Delete bogus comment about cube array counting. Both 'z' and 'depth' are counted in slices, according to the Gallium docs (context.rst). In our temporary memory, we allocate `box.depth` slices, so we need to rebase the starting slice (box.z) down to 0, and back again when writing on unmap. There's nothing strange about cubes here.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	73709be0c3	iris: Fix compute scratch pinning Thanks to Eero Tamminen for helping catch this.	2019-02-21 10:26:11 -08:00
Kenneth Graunke	3ab3aa23c2	iris: Add a more long term TODO about timebase scaling	2019-02-21 10:26:11 -08:00
Kenneth Graunke	7ddc1f8ded	iris: Only resolve inputs for actual shader stages We don't need to consider compute at render time, and don't need to consider disabled stages. 4% on drawoverhead.	2019-02-21 10:26:11 -08:00
Rhys Kidd	6c17e7d95f	iris: Fix assertion in iris_resource_from_handle() tiling usage Assertion error: iris_resource_from_handle: Assertion `res->bo->tiling_mode == isl_tiling_to_i915_tiling(res->surf.tiling)' failed. This patch fixes 16 piglit tests on KBL: glx/glx-multithread-texture glx/glx-query-drawable-glx_fbconfig_id-glxpbuffer glx/glx-query-drawable-glx_fbconfig_id-glxpixmap glx/glx-query-drawable-glx_preserved_contents glx/glx-query-drawable-glxpbuffer-glx_height glx/glx-query-drawable-glxpbuffer-glx_width glx/glx-query-drawable-glxpixmap-glx_height glx/glx-query-drawable-glxpixmap-glx_width glx/glx-swap-pixmap glx/glx-swap-pixmap-bad glx/glx-tfp glx/glx-visuals-depth -pixmap glx/glx-visuals-stencil -pixmap spec/egl 1.4/eglcreatepbuffersurface and then glclear spec/egl 1.4/largest possible eglcreatepbuffersurface and then glclear spec/egl_nok_texture_from_pixmap/basic Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>	2019-02-21 10:26:11 -08:00
Kenneth Graunke	73d525f188	iris: Fix scratch space allocation on Icelake. Gen9-10 have fewer than 4 subslices per slice, so they need this to be rounded up. Gen11 isn't documented as needing this hack, and it can also have more than 4 subslices, so the hack actually can break things. Fixes tests/spec/arb_enhanced_layouts/execution/component-layout/ sso-vs-gs-fs-array-interleave	2019-02-21 10:26:11 -08:00
Kenneth Graunke	154e3e45bb	iris: better MOCS	2019-02-21 10:26:11 -08:00
Dave Airlie	aaaf611130	iris: fix gpu calcs for timestamp queries	2019-02-21 10:26:11 -08:00
Kenneth Graunke	3c45d03049	iris: only mark depth/stencil as writable if writes are actually enabled	2019-02-21 10:26:11 -08:00
Kenneth Graunke	3a938a4b23	iris: more dead comments	2019-02-21 10:26:11 -08:00
Kenneth Graunke	e169cb09c3	iris: pin and re-pin the scratch BO	2019-02-21 10:26:11 -08:00
Kenneth Graunke	dd0d47a5d2	iris: delete finished comments	2019-02-21 10:26:11 -08:00
Kenneth Graunke	32ee2e4c27	iris: always pin the binder...in the compute context, too. not sure why this hasn't tripped things up	2019-02-21 10:26:11 -08:00
Kenneth Graunke	fbfe07c4f3	iris: Track blend enables, save outbound for resolve code	2019-02-21 10:26:11 -08:00
Kenneth Graunke	5481887ca8	iris: whitespace fixes	2019-02-21 10:26:11 -08:00
Kenneth Graunke	b2fa90706e	iris: Make an alloc_surface_state helper This does the gtt_offset addition for us	2019-02-21 10:26:11 -08:00
Kenneth Graunke	b358c4b92b	iris: Use a surface state fill helper This will check aux_usage eventually	2019-02-21 10:26:11 -08:00
Kenneth Graunke	b92ca4d0f6	iris: don't print the pointer in INTEL_DEBUG=submit lots of noise in diff, hope was it would be useful for gdb, but the GEM handle is good enough	2019-02-21 10:26:11 -08:00
Kenneth Graunke	ad969a00c0	iris: Fix the prototype for iris_bo_alloc_tiled This now matches the actual function in iris_bufmgr.c, as well as the equivalent brw_bufmgr.c function...	2019-02-21 10:26:11 -08:00
Kenneth Graunke	598a78849e	iris: Fix for PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET This fixes ext_transform_feedback-builtin-varyings gl_Position after the combination of my transform feedback reworks and my vertex buffer reworks (?)	2019-02-21 10:26:11 -08:00
Kenneth Graunke	392fba5f31	iris: drop unnecessary genx->streamout field	2019-02-21 10:26:11 -08:00
Kenneth Graunke	5307ff6a5f	iris: Implement DrawTransformFeedback() We get the count by dividing the offset by the stride.	2019-02-21 10:26:11 -08:00
Jason Ekstrand	2e103fff63	iris: Copy anv's MI_MATH helpers for multiplication and division (import done by Ken but with author set to Jason because it's his code that's being imported, so he deserves the credit)	2019-02-21 10:26:11 -08:00
Kenneth Graunke	52baba80f3	iris: only get space for one offset in stream output targets Target corresponds to a buffer, buffer only records one offset, not multiple.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	31357bae4b	iris: Move iris_stream_output_target def to iris_context.h now that it doesn't have genxml	2019-02-21 10:26:10 -08:00
Kenneth Graunke	cf4931e586	iris: Don't bother packing 3DSTATE_SO_BUFFER at create time We have to do half the packet late anyway, we may as well just do it all at set time. This also lets us move the struct def out of genxml	2019-02-21 10:26:10 -08:00
Kenneth Graunke	754d678b0a	iris: Add _MI_ALU helpers that don't paste This lets you pass arguments as function parameters	2019-02-21 10:26:10 -08:00
Kenneth Graunke	5094062bbe	iris: Reorder LRR parameters to have dst first. LRI and LRM both put dst first, be consistent.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	2f5d85661f	iris: rewrite set_vertex_buffer and VB handling I was using the Gallium API wrong. set_* functions with start_slot and count parameters are supposed to update a subrange of the items. I had been trashing all bound vertex buffers and starting over. This should hopefully also make it easier to slot in additional VERTEX_BUFFER_STATEs at draw time, say, for shader draw parameters.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	286b8b8f99	iris: handle PatchVerticesIn as a system value.	2019-02-21 10:26:10 -08:00
Tapani Pälli	96bb328e9b	iris: add Android build Note that at least following additional libs/components require changes since they refer to BOARD_GPU_DRIVERS variable which is used to select the driver: - mixins - minigbm - libdrm - drm_gralloc v2: (feedback by Gustaw Smolarczyk) Fix trailing \ in a few cases Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-21 10:26:10 -08:00
Kenneth Graunke	97e82e80f9	iris: override alpha to one src1 blend factors No idea why this used to pass and doesn't after updating...seems like we should have been handling it all along...	2019-02-21 10:26:10 -08:00
Kenneth Graunke	90b2745148	iris: Always do rasterizer discard in clipper but continue doing it in SOL if possible because it's faster Fixes ./bin/ext_transform_feedback-discard-drawarrays - simpler too	2019-02-21 10:26:10 -08:00
Kenneth Graunke	5f511798d0	iris: Fix primitive generated query active flag	2019-02-21 10:26:10 -08:00
Kenneth Graunke	99cab4d381	iris: Enable guardband clipping	2019-02-21 10:26:10 -08:00
Kenneth Graunke	f062dcdfbb	iris: Clamp viewport extents to the framebuffer dimensions Fixes arb_framebuffer_no_attachments-query's resize subtest.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	fb2df1b5d5	iris: Fix clear dimensions Fixes depthstencil-render-miplevels 1024 s=z24_s8	2019-02-21 10:26:10 -08:00
Kenneth Graunke	2e79e46d23	iris: Drop continues in resolve Now that we u_bit_scan we know it exists	2019-02-21 10:26:10 -08:00
Kenneth Graunke	5fde1fa988	iris: Replace num_textures etc with a bitmask we can scan More accurate bounds, plus can skip dead ones	2019-02-21 10:26:10 -08:00
Kenneth Graunke	7ad7d0beea	iris: Fix set_sampler_views with start > 0	2019-02-21 10:26:10 -08:00
Kenneth Graunke	1c6fea8e7b	iris: fix set_sampler_views to not unbind, be better about bounds	2019-02-21 10:26:10 -08:00
Kenneth Graunke	598ce8e88e	iris: fix overhead regression from flushing for storage images st calls us with count = 32 but a NULL pointer...we only really care about the highest non-NULL image...	2019-02-21 10:26:10 -08:00
Kenneth Graunke	4749f6cc4f	iris: Fix NOS mechanism Set bits, not values	2019-02-21 10:26:10 -08:00
Kenneth Graunke	a24734a2d7	iris: re-pin inherited streamout buffers	2019-02-21 10:26:10 -08:00
Kenneth Graunke	19803d0aa7	iris: reemit SBE when sprite coord origin changes fixes arb_point_sprite-checkerboard	2019-02-21 10:26:10 -08:00
Kenneth Graunke	480c62bc7e	iris: omask can kill	2019-02-21 10:26:10 -08:00
Kenneth Graunke	bd031eb2e8	iris: reject all clipping when we can't use streamout render disabled	2019-02-21 10:26:10 -08:00
Kenneth Graunke	72cf2185c8	iris: make clipper statistics dynamic	2019-02-21 10:26:10 -08:00
Kenneth Graunke	1114f0c1ce	iris: CS stall for stream out -> VB i965 doesn't do this, but I suspect it just stalls a lot and doesn't hit this. Fixes ext_transform_feedback-position render among others.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	c03fbb41aa	iris: fix dma buf import strides	2019-02-21 10:26:10 -08:00
Kenneth Graunke	90274bd48f	iris: fix alpha channel for RGB BC1 formats	2019-02-21 10:26:10 -08:00
Jason Ekstrand	47d4ea1a16	iris: Allocate buffer resources separately (cleaned up by Ken - make sure a bunch of things were more obviously not using res->surf, do allow checking res->surf.tiling == LINEAR, drop format cpp checks that aren't needed, drop memzone handling for images, assume buffers / non-buffers in a few places...)	2019-02-21 10:26:10 -08:00
Kenneth Graunke	585c95f8cc	iris: Don't bother considering if the underlying surface is a cube Dave fixed it to consider whether the sampler view is a cube. With that, there's no point (possibly harm) in looking if the original resource was a cube...if it's an array view, we don't want to treat it as a cube anymore...	2019-02-21 10:26:10 -08:00
Kenneth Graunke	773adeb9e9	iris: move some non-buffer case code in a bit	2019-02-21 10:26:10 -08:00
Kenneth Graunke	2c0f001295	iris: Stop leaking iris_uncompiled_shaders like mad Now shader-db actually executes. We still need a plan for culling dead iris_compiled_shaders...	2019-02-21 10:26:10 -08:00
Kenneth Graunke	68d531d7d7	iris: Destroy the bufmgr Plugs a 12360 byte leak	2019-02-21 10:26:10 -08:00
Kenneth Graunke	7c29c3d01e	iris: Fix IRIS_MEMZONE_COUNT to exclude the border color pool This is supposed to exclude single address zones. We were getting too many VMA allocators but failing to set them up, which worked out because we also forgot to destroy them...	2019-02-21 10:26:10 -08:00
Kenneth Graunke	6cb211121b	iris: Unref unbound_tex resource Plugs a 12536 byte leak	2019-02-21 10:26:10 -08:00
Kenneth Graunke	f73fdb4001	iris: Destroy the border color pool This plugs a 12224 byte leak	2019-02-21 10:26:10 -08:00
Kenneth Graunke	3d55e9a2aa	iris: Destroy transfer helper on screen teardown Plugs a 16 byte leak	2019-02-21 10:26:10 -08:00
Kenneth Graunke	bdc1269eb2	iris: Fix failed to compile TCS message	2019-02-21 10:26:10 -08:00
Kenneth Graunke	fbf3124771	iris: Rework tiling/modifiers handling We were being very picky about things being Y tiled. But, not everything can be - for example, > 16382 surfaces on SKL GT1-3 have to fall back to linear. Instead, give ISL options and let it pick.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	761a5fb36a	iris: fix conditional compute, don't stomp predicate for pipelined queries	2019-02-21 10:26:10 -08:00
Kenneth Graunke	40b12c103c	iris: check query first this lets us avoid the predicate bit in more cases, which is nice	2019-02-21 10:26:10 -08:00
Kenneth Graunke	0c3ea03e4b	iris: for BLORP, only use the predicate enable bit when USE_BIT	2019-02-21 10:26:10 -08:00
Dave Airlie	7bbf3ff4a9	iris: add conditional render support	2019-02-21 10:26:10 -08:00
Kenneth Graunke	dbe198d6ba	iris: drop key_size_for_cache dead since my program cache API rework. we could still use it for one function, but it's so trivial to pass the size, that it's probably not worth the extra code	2019-02-21 10:26:10 -08:00
Dave Airlie	e4115eaca0	iris: add load register reg32/64 These will be needed for broadwell and conditional render	2019-02-21 10:26:10 -08:00
Dave Airlie	311a1b3198	iris: execute compute related query on compute batch. This only happens for the compute invocations query.	2019-02-21 10:26:10 -08:00
Dave Airlie	00645ea01c	iris: fix cube texture view	2019-02-21 10:26:10 -08:00
Kenneth Graunke	39d1056d10	iris: fix some SO overflow query bugs and tidy the code a bit	2019-02-21 10:26:10 -08:00
Dave Airlie	527e5bcdc7	iris: add initial transform feedback overflow query paths (V3) v2: fix cpu overflow calc v3: use a struct	2019-02-21 10:26:10 -08:00
Kenneth Graunke	0ded23a552	iris: actually flush for storage images	2019-02-21 10:26:10 -08:00
Kenneth Graunke	69e97670bc	iris: add an extra BT assert from Chris Wilson	2019-02-21 10:26:10 -08:00
Kenneth Graunke	4312784674	iris: add assertions about binding table starts	2019-02-21 10:26:10 -08:00
Kenneth Graunke	240615695d	iris: drop pull constant binding table entry nothing uses this	2019-02-21 10:26:10 -08:00
Kenneth Graunke	10d04cdaa4	iris: Use program's num textures not the state tracker's bound the state tracker might bind more textures than the program is using.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	855ff47d36	iris: Enable precompiles	2019-02-21 10:26:10 -08:00
Kenneth Graunke	ed4ffb9715	iris: rework program cache interface This exposes iris_upload_shader() without having to bind it, which will be useful for precompiles. It also lets us examine the old programs and flag dirty bits at a higher level, rather than cramming all that knowledge into the cache layer.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	701a6b6006	iris: Use wrappers for create_xs_state rather than a switch statement	2019-02-21 10:26:10 -08:00
Kenneth Graunke	e628095b9a	iris: fix comment location	2019-02-21 10:26:10 -08:00
Kenneth Graunke	e5df8913e1	iris: export iris_upload_shader	2019-02-21 10:26:10 -08:00
Kenneth Graunke	d525b3dfad	iris: fix prototype warning	2019-02-21 10:26:10 -08:00
Kenneth Graunke	84a8c63527	iris: Re-pin even if nothing is dirty	2019-02-21 10:26:10 -08:00
Kenneth Graunke	415ede346d	iris: Flush for history at various moments When we blit, transfer, or copy_resource to a buffer, we need to flush to ensure any stale data for that buffer is invalidated in the caches. bind_history will inform us which caches need to be flushed. Also, for any push constant buffers, we need to flag those dirty so that we re-emit 3DSTATE_CONSTANT_*, causing the data to be re-pushed.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	c8579e708e	iris: add iris_flush_and_dirty_for_history	2019-02-21 10:26:10 -08:00
Kenneth Graunke	d169747a3e	iris: Track a binding history for buffer resources This will let us know what caches to flush / state to dirty when altering the contents of a buffer.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	f49f506b13	iris: drop long dead XXX comment	2019-02-21 10:26:10 -08:00
Kenneth Graunke	5dbd6df9f7	iris: Do the 48-bit vertex buffer address invalidation workaround	2019-02-21 10:26:10 -08:00
Kenneth Graunke	1b1ea23766	iris: Fix VIEWPORT/LAYER in stream output info Fixes glsl-1.50-transform-feedback-builtins and ext_transform_feedback-builtin-varyings gl_PointSize	2019-02-21 10:26:10 -08:00
Kenneth Graunke	c5b22441f1	iris: Fix buffer -> buffer copy_region Size can be too large for a surf, blorp_buffer_copy chops things up into segments we can actually handle Fixes map_buffer_range_test and copy_buffer_coherency	2019-02-21 10:26:10 -08:00
Kenneth Graunke	beb2d5e065	iris: Lie about indirects fixes interpolateAt tests	2019-02-21 10:26:10 -08:00
Kenneth Graunke	b9ccb00e2c	iris: Enable ctx->Const.UseSTD430AsDefaultPacking hooray for obscurely named pipe caps with bizarre descriptions!	2019-02-21 10:26:10 -08:00
Kenneth Graunke	39cb10613c	iris: update comment	2019-02-21 10:26:10 -08:00
Kenneth Graunke	f9612e7682	iris: RT flush for memorybarrier with texture bit PIXEL_BUFFER_BARRIER_BIT turns into PIPE_BARRIER_TEXTURE and it ought to trigger an RT flush, according to brw_memory_barrier	2019-02-21 10:26:10 -08:00
Kenneth Graunke	2c23721397	iris: PIPE_CONTROL workarounds for GPGPU mode	2019-02-21 10:26:10 -08:00
Kenneth Graunke	f1a7392be1	iris: Put batches in an array We keep re-making this array all over the place	2019-02-21 10:26:10 -08:00
Kenneth Graunke	c2a77efa71	iris: put render batch first in fence code this shouldn't matter, but it will make the next refactor easier	2019-02-21 10:26:10 -08:00
Kenneth Graunke	d918c09975	iris: flush the compute batch too if border pool is redone	2019-02-21 10:26:10 -08:00
Kenneth Graunke	017b556609	iris: leave a TODO	2019-02-21 10:26:10 -08:00
Chris Wilson	f459c56be6	iris: Add fence support using drm_syncobj	2019-02-21 10:26:10 -08:00
Kenneth Graunke	db199d9d07	iris: Add wait fences to properly sync between render/compute When flushing a batch due to a data dependency, we need to not only kick off the other batch's work, but stall our execution until it completes. Just wait on last_syncpt after flushing it.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	d69bc4ac12	iris: Hang on to the last batch's sync-point, so we can wait on it	2019-02-21 10:26:10 -08:00
Chris Wilson	fae74234d9	iris: Tag each submitted batch with a syncobj (adjusted by Ken to make the signalling sync object immediately on batch reset, rather than batch finish time. this will work better with deferred flushes...)	2019-02-21 10:26:10 -08:00
Kenneth Graunke	3e332af611	iris: Drop vestiges of throttling code	2019-02-21 10:26:10 -08:00
Chris Wilson	54347c078e	iris: Merge two walks of the exec_bos list	2019-02-21 10:26:10 -08:00
Kenneth Graunke	3455f57575	iris: replace vestiges of fence fds with newer exec_fence API patch by me and Chris Wilson	2019-02-21 10:26:10 -08:00
Kenneth Graunke	11da219be9	iris: Avoid synchronizing due to the workaround BO	2019-02-21 10:26:10 -08:00
Kenneth Graunke	30d7bebc8a	iris: Avoid cross-batch synchronization on read/reads This avoids flushing batches just because e.g. both are reading the same dynamic state streaming buffer, or shader assembly buffer.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	b21e916a62	iris: Combine iris_use_pinned_bo and add_exec_bo	2019-02-21 10:26:10 -08:00
Kenneth Graunke	fb4c898842	iris: Use iris_use_pinned_bo rather than add_exec_bo directly less special this way	2019-02-21 10:26:10 -08:00
Chris Wilson	e5528151a7	iris: Fix assigning the output handle for exporting for KMS Fixes gbm_bo_get_handle() used for KMS in glamor.	2019-02-21 10:26:10 -08:00
Chris Wilson	01e729f883	iris: Tidy exporting the flink handle	2019-02-21 10:26:10 -08:00
Kenneth Graunke	1b69b14c2a	iris: Fix SLM Now that Jason has set up the L3 we can do this. Also, my assert was useless because we hadn't set up the field in the first place. Oops.	2019-02-21 10:26:10 -08:00
Jason Ekstrand	f9c5e277ac	iris: Don't set constant read lengths at upload time They're set in derived_data as part of store_cs_state	2019-02-21 10:26:10 -08:00
Jason Ekstrand	a90a0e22cb	iris: Configure the L3$ on the compute context	2019-02-21 10:26:10 -08:00
Kenneth Graunke	25a41b1aef	iris: properly pin stencil buffers	2019-02-21 10:26:10 -08:00
Kenneth Graunke	8545e39808	iris: Fix TCS/TES slot unification TCS outputs, TES inputs...not TCS inputs Fixes some barrier tests	2019-02-21 10:26:10 -08:00
Kenneth Graunke	da5590496e	iris: more todo notes	2019-02-21 10:26:10 -08:00
Kenneth Graunke	9878ea842f	iris: scissored and mirrored blits	2019-02-21 10:26:10 -08:00
Kenneth Graunke	25f194d5ac	iris: more TODO	2019-02-21 10:26:10 -08:00
Kenneth Graunke	5207a5f5d5	iris: Fix independent alpha blending. independent_blend_enable means per-RT blending, not RGB != A	2019-02-21 10:26:10 -08:00
Kenneth Graunke	c06f6d12a5	iris: "Fix" transfer maps of buffers x should be in bytes, not cpp units This generally worked out because PIPE_BUFFER is supposedly required to be R8_UINT or R8_UNORM. I hear some state trackers pass PIPE_FORMAT_NONE instead, however, which would make this break. Just do the right thing directly, to be defensive and clear.	2019-02-21 10:26:10 -08:00
Kenneth Graunke	b2c04aa3a0	iris: Fix SourceAlphaBlendFactor	2019-02-21 10:26:10 -08:00
Kenneth Graunke	89833eddab	iris: leave another TODO	2019-02-21 10:26:10 -08:00
Kenneth Graunke	983e2ae7d2	iris: only clip lower if there's something to clip against	2019-02-21 10:26:10 -08:00
Kenneth Graunke	e11c497fc6	iris: fix sysval only binding tables	2019-02-21 10:26:10 -08:00
Kenneth Graunke	2ddbc1025e	iris: don't forget to upload CS consts	2019-02-21 10:26:10 -08:00
Kenneth Graunke	f1f84a1ae7	iris: drop param stuffs	2019-02-21 10:26:10 -08:00
Kenneth Graunke	1b5d35319e	iris: don't trip on param asserts I'd rather not rewrite i965's compute system value handling right now :(	2019-02-21 10:26:10 -08:00
Kenneth Graunke	f4829a2fe1	iris: don't support pull constants. I don't think it matters, we won't have any params anyway, but let's be sure it doesn't try	2019-02-21 10:26:10 -08:00
Kenneth Graunke	911f9e8f3f	iris: regather info so we get CLIP_DIST slots, not CLIP_VERTEX	2019-02-21 10:26:09 -08:00
Kenneth Graunke	6d19fe376d	iris: enable push constants if we have sysvals but no uniforms	2019-02-21 10:26:09 -08:00
Kenneth Graunke	1ef68d77c0	iris: drop iris_setup_push_uniform_range it doesn't do anything, we have no params. I guess I thought there would be some, but they all get dead code eliminated even if we try to make them exist in the first place.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	7eeb124c02	iris: fix more uniform setup	2019-02-21 10:26:09 -08:00
Kenneth Graunke	50743eb748	iris: fix num clip plane consts	2019-02-21 10:26:09 -08:00
Kenneth Graunke	a98634a28f	iris: actually upload clip planes.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	c60ce3f4fd	iris: bypass params and do it ourselves the backend keeps dead code eliminating them all, so we can't do that, plus we don't want to because params[] is lame	2019-02-21 10:26:09 -08:00
Kenneth Graunke	78fc760bab	iris: dodge backend UCP lowering	2019-02-21 10:26:09 -08:00
Kenneth Graunke	deb6d588a6	iris: fix system value remapping	2019-02-21 10:26:09 -08:00
Kenneth Graunke	2b0a2915dc	iris: hook up key stuff for clip plane lowering	2019-02-21 10:26:09 -08:00
Kenneth Graunke	2876dd1a37	iris: lower user clip planes	2019-02-21 10:26:09 -08:00
Kenneth Graunke	80c856cbee	iris: only bother with params if there are any...	2019-02-21 10:26:09 -08:00
Kenneth Graunke	2186d83185	iris: fill out params array with built-ins, like clip planes	2019-02-21 10:26:09 -08:00
Kenneth Graunke	d3e8ff143d	iris: add param domain defines	2019-02-21 10:26:09 -08:00
Kenneth Graunke	ecb28b2802	iris: drop unnecessary param[] setup from iris_setup_uniforms the backend just considers these dead anyway	2019-02-21 10:26:09 -08:00
Kenneth Graunke	ed08f022f0	iris: Defer cbuf0 upload to draw time	2019-02-21 10:26:09 -08:00
Kenneth Graunke	e98cf9c24b	iris: Clone the NIR The backend compiler used to do this for us, but after a rebase, it's now the driver's responsibility. This lets us alter it for say, clip vertex lowering, at the global level rather than the per-variant level.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	587e438128	iris: Print the batch name when decoding	2019-02-21 10:26:09 -08:00
Kenneth Graunke	2727a942a4	iris: partial set_query_active_state used to avoid OQ during clears for example fixes occlusion_query_meta_no_fragments	2019-02-21 10:26:09 -08:00
Kenneth Graunke	64af1d9248	iris: Fix multiple RTs with non-independent blending rt[i] isn't filled out in this case, so we have to use rt[0]	2019-02-21 10:26:09 -08:00
Kenneth Graunke	58507c02ce	iris: Fix TextureBarrier I don't know how I came up with the old one, this is now what i965 does Also we now do compute batches too	2019-02-21 10:26:09 -08:00
Kenneth Graunke	e5d84bbd36	iris: Fix MSAA smooth points Fixes bin/ext_framebuffer_multisample-point-smooth 2 -auto -fbo	2019-02-21 10:26:09 -08:00
Kenneth Graunke	4d219b0eb3	iris: implement scratch space! we borrow the approach from anv rather than i965, as it works better with pre-baked state that needs to contain scratch BO addresses fixes a bunch of varying packing tests	2019-02-21 10:26:09 -08:00
Kenneth Graunke	9511b89ef9	iris: tidy more warnings	2019-02-21 10:26:09 -08:00
Kenneth Graunke	846316b258	iris: Enable msaa_map transfer helpers This does the downsampling for us. It'll use BLORP anyway because it uses blit(), and that uses BLORP.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	9ec927497e	iris: Actually create/destroy HW contexts The intention is that render and compute use their own contexts, and each is PIPELINE_SELECT'd to the right pipeline. But we hadn't actually made them, so we got the fd-default context. Thanks to Chris Wilson for catching this!	2019-02-21 10:26:09 -08:00
Kenneth Graunke	cb5f47f585	iris: Don't leak the compute batch	2019-02-21 10:26:09 -08:00
Kenneth Graunke	fbe5d75f11	iris: cross batch flushing	2019-02-21 10:26:09 -08:00
Kenneth Graunke	c3cc525c7a	iris: Cross-link iris_batches so they can potentially flush each other This makes e.g. the render batch aware of the compute batch, so it can ask questions like "is this BO referenced by some other batch?" and do something about that.	2019-02-21 10:26:09 -08:00
Dave Airlie	ed016b2a0b	iris: fix crash in sparse vertex array this fixes crash in array-stride piglit.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	bcac11c8f1	iris: Use at least 1x1 size for null FB surface state. Otherwise we get 0 - 1 = 0xffffffff and fail to pack SURFACE_STATE. Fixes some object namespace pollution gltexsubimage2d tests	2019-02-21 10:26:09 -08:00
Kenneth Graunke	9c8fdf8133	iris: Drop B5G5R5X1 support This is oddly renderable but not supported for sampling, which is the opposite of other X formats. Just skip it and fall back to BGRA.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	4b31f506f8	iris: Enable A8/A16_UNORM in an inefficient manner These currently just use the 'A' hardware formats, rather than the faster 'R' formats. glBitmap handling needs these, it seems. :(	2019-02-21 10:26:09 -08:00
Kenneth Graunke	80497af192	iris: Enable ARB_shader_stencil_export	2019-02-21 10:26:09 -08:00
Kenneth Graunke	3e6aaa1ba5	iris: Disable a PIPE_CONTROL workaround on Icelake	2019-02-21 10:26:09 -08:00
Kenneth Graunke	84a419432d	iris: Flag constants dirty on program changes 3DSTATE_CONSTANT_* looks at prog_data->ubo_ranges. We were getting saved by iris_set_constant_buffers() usually happening when changing programs (as they usually change uniforms too), but with the clear shader that doesn't use uniforms, we weren't getting one and were leaving push constants enabled, screwing things up. Also clean up a bit of a mess left by the hacks - we were missing bindings in the VS/FS/CS case, among other issues...	2019-02-21 10:26:09 -08:00
Kenneth Graunke	317ba8796f	iris: allow binding a null vertex buffer PBO upload apparently does this...	2019-02-21 10:26:09 -08:00
Kenneth Graunke	aef1ba5ce4	iris: fix overhead regression from "don't stomp each other's dirty bits" The change from dirty = 0ull to dirty &= ~NOT_MY_BITS broke the "nothing to do? skip it!" optimization. thanks to Chris for noticing this!	2019-02-21 10:26:09 -08:00
Kenneth Graunke	525d89cafc	iris: delete dead code	2019-02-21 10:26:09 -08:00
Kenneth Graunke	8a98e90415	iris: Fix refcounting of grid surface	2019-02-21 10:26:09 -08:00
Jason Ekstrand	8e8868d5ad	iris/compute: Zero out the last grid size on indirect dispatches	2019-02-21 10:26:09 -08:00
Jason Ekstrand	c16e711ff2	iris/compute: Don't increment the grid size offset It may be in the dynamic state buffer but the fact that we have a resource takes care of that. We don't need to add in the address of the dynamic state buffer again.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	a3e813c5af	iris: SO_DECL_LIST fix	2019-02-21 10:26:09 -08:00
Kenneth Graunke	927c4a21bd	iris: Fall back to 1x1x1 null surface if no framebuffer supplied If the state tracker never gave us the framebuffer dimensions via a set_framebuffer_state() call, just fall back to the unbound texture null surface, which is 1x1x1. Otherwise we'd use a NULL resource (no pun intended).	2019-02-21 10:26:09 -08:00
Kenneth Graunke	5d1a9db720	iris: Fix off by one in scissoring, empty scissors, default scissors	2019-02-21 10:26:09 -08:00
Kenneth Graunke	938d63b2e8	iris: Move snapshots_landed to the front. Transform feedback overflow queries need to write additional data, and it would be nice to have this field remain at a consistent offset.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	ba2a4207f9	iris: Clamp UBO and SSBO access to the actual BO size, for safety	2019-02-21 10:26:09 -08:00
Kenneth Graunke	a9b32f2bbf	iris: Fix texture buffer / image buffer sizes. Also fix image buffers with offsets.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	d1f8947792	iris: fix SF_CLIP_VIEWPORT array indexing with multiple VPs fixes bunches of viewport stuffs	2019-02-21 10:26:09 -08:00
Kenneth Graunke	5bd49a47b6	iris: flag CC_VIEWPORT when changing num viewports this also has a loop over num_viewports	2019-02-21 10:26:09 -08:00
Kenneth Graunke	d98967d936	iris: fix UBOs with bindings that have an offset	2019-02-21 10:26:09 -08:00
Kenneth Graunke	3f70956a4e	iris: try and avoid pointless compute submissions if apps don't use compute shaders, we don't even want to kick off the compute initialization batch	2019-02-21 10:26:09 -08:00
Kenneth Graunke	97125e9bb3	iris: fix SBA flushing by refactoring code	2019-02-21 10:26:09 -08:00
Kenneth Graunke	8fa99481e7	iris: do PIPELINE_SELECT for render engine, add flushes, GLK hacks	2019-02-21 10:26:09 -08:00
Kenneth Graunke	b2d223b6bf	iris: hack to avoid memorybarriers out the wazoo we don't want to emit piles of pipe controls to a compute batch if it isn't necessary... prevents double-batch-wraps in cs-op-selection-bool-bvec4-bvec4 (but it's still kinda a big ol' hack...)	2019-02-21 10:26:09 -08:00
Kenneth Graunke	b3a40c27a2	iris: don't let render/compute contexts stomp each other's dirty bits only clear what you process	2019-02-21 10:26:09 -08:00
Kenneth Graunke	f8796079da	iris: better dirty checking	2019-02-21 10:26:09 -08:00
Kenneth Graunke	06a993dac2	iris: rewrite grid surface handling now we only upload a new grid when it's actually changed, which saves us from having to emit a new binding table every time it changes. this also moves a bunch of non-gen-specific stuff out of iris_state.c	2019-02-21 10:26:09 -08:00
Kenneth Graunke	155e1a63d5	iris: XXX for compute state tracking :/ Maybe we should just move dirty to batch, it would help with the reset stuff too	2019-02-21 10:26:09 -08:00
Kenneth Graunke	643030f4fb	iris: fix whitespace	2019-02-21 10:26:09 -08:00
Kenneth Graunke	b0dc11993e	iris: bail if SLM is needed	2019-02-21 10:26:09 -08:00
Kenneth Graunke	973b937cac	iris: leave XXX about unnecessary binding table uploads	2019-02-21 10:26:09 -08:00
Kenneth Graunke	7fb8c20d7b	iris: drop unnecessary #ifdefs	2019-02-21 10:26:09 -08:00
Kenneth Graunke	549db5b90e	iris: drop XXX that Jordan handled	2019-02-21 10:26:09 -08:00
Jordan Justen	942bdb2906	iris/compute: Support indirect compute dispatch Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	b35c8f2182	iris/compute: Push subgroup-id Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	229450a2a6	iris/compute: Flush compute batch on memory-barriers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	fb4637797e	iris/compute: Provide binding table entry for gl_NumWorkGroups Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	fcd0364857	iris/compute: Wait on compute batch when mapping Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	ea416d0b5d	iris/program: Don't try to push ubo ranges for compute We only can push constants for compute shaders from one range. Gallium glsl-to-nir (src/mesa/state_tracker/st_glsl_to_nir.cpp) lowers all uniform accesses to a ubo. Unfortunately we also load the subgroup-id as a uniform in the compiler. Since we use the 1 push range for this subgroup-id, we then lose the ability to actually push the ubo with all the normal user uniform values. In other words, there is lots of room for performance improvement, but at least retrieving the uniforms as pull-constants is functional for now. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	c7cfa4000f	iris/compute: Get group counts from grid->grid Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	fd9ccd8b5d	iris/compute: Flush compute batches Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	9b5cda95aa	iris/compute: Add MEDIA_STATE_FLUSH following WALKER Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	6ebd04ac8f	iris: Add iris_restore_compute_saved_bos Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	622aaa290f	iris: Add IRIS_DIRTY_CONSTANTS_CS Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Jordan Justen	25f1625edf	iris/compute: Set mask bits on PIPELINE_SELECT Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Kenneth Graunke	9fc672428d	iris: little bits of compute basics	2019-02-21 10:26:09 -08:00
Kenneth Graunke	860ce6af3f	iris: drop XXX's about swizzling pretty sure this is unnecessary on modern HW	2019-02-21 10:26:09 -08:00
Kenneth Graunke	12de56f53d	iris: drop dead format //'s these just aren't supported	2019-02-21 10:26:09 -08:00
Kenneth Graunke	f6c68066a6	iris: yes	2019-02-21 10:26:09 -08:00
Kenneth Graunke	752abeb690	iris: initial compute caps RET macro borrowed from freedreno	2019-02-21 10:26:09 -08:00
Kenneth Graunke	4da28c2c22	iris: Enable fb fetch needed for ES 3.2	2019-02-21 10:26:09 -08:00
Kenneth Graunke	be905bd461	iris: advertise GL_ARB_shader_texture_image_samples	2019-02-21 10:26:09 -08:00
Jordan Justen	6441e906e8	iris: Set num_uniforms in bytes Ref: brw_nir_lower_uniforms, type_size_scalar_bytes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-02-21 10:26:09 -08:00
Kenneth Graunke	c29fd34259	iris: move images next to textures in binding table	2019-02-21 10:26:09 -08:00
Kenneth Graunke	0d9c5b4e7e	iris: null for non-existent cbufs prevents BTs from being shifted down incorrectly	2019-02-21 10:26:09 -08:00
Kenneth Graunke	98e8f80e7d	iris: actually set image access	2019-02-21 10:26:09 -08:00
Jason Ekstrand	d9aee25a46	iris: Don't lower image formats for write-only images	2019-02-21 10:26:09 -08:00
Kenneth Graunke	a06f0fe517	iris: set image access correctly	2019-02-21 10:26:09 -08:00
Kenneth Graunke	5d1dadfc38	iris: bother with BTIs	2019-02-21 10:26:09 -08:00
Kenneth Graunke	f5b887da6c	iris: implement set_shader_images hook	2019-02-21 10:26:09 -08:00
Kenneth Graunke	26a54ae4b2	iris: lower storage image derefs	2019-02-21 10:26:09 -08:00
Kenneth Graunke	e97a24da89	iris: set the binding table size we weren't doing mark_surface_used on images (i965 does it while uploading the unnecessary image uniforms), so our binding tables were too small...	2019-02-21 10:26:09 -08:00
Kenneth Graunke	28b41992c8	iris: X32_S8X24 :/ This can happen when faking Z32_S8X24 and setting StencilSampling = true I guess we'll just turn it into S8_UINT... Fixes KHR-GL45.texture_swizzle.functional	2019-02-21 10:26:09 -08:00
Kenneth Graunke	6e7957a22d	iris: enable I/L formats	2019-02-21 10:26:09 -08:00
Kenneth Graunke	bfbebbaa36	iris: Use R/RG instead of I/L/A when sampling	2019-02-21 10:26:09 -08:00
Kenneth Graunke	94569a6458	iris: rework format translation apis	2019-02-21 10:26:09 -08:00
Kenneth Graunke	b9eeed3e8f	iris: Allow PIPE_CONTROL with Stall at Scoreboard and RT flush It's nonsensical, but not illegal, and mandatory on Icelake	2019-02-21 10:26:09 -08:00
Kenneth Graunke	65d1cda995	iris: add gen11 to genX_call	2019-02-21 10:26:09 -08:00
Kenneth Graunke	0fdcb20803	iris: inline stage_from_pipe to avoid unused warnings	2019-02-21 10:26:09 -08:00
Kenneth Graunke	6fbb6ba290	iris: pipe to scs -> iris_pipe.h	2019-02-21 10:26:09 -08:00
Kenneth Graunke	87351b8dfe	iris: force persample interp cap	2019-02-21 10:26:09 -08:00
Kenneth Graunke	90b9efc1f9	iris: stencil texturing	2019-02-21 10:26:09 -08:00
Kenneth Graunke	9b229d266d	iris: fix Z32_S8 depth sampling We were accidentally using the ISL_FORMAT_R32_FLOAT_X8X24_TYPELESS format, which is NOT what we use. We just store R32_FLOAT depth. fixes Piglit's texwrap GL_ARB_depth_buffer_float	2019-02-21 10:26:09 -08:00
Kenneth Graunke	822f91508e	iris: don't mark contains_draw = false when chaining batches chaining to a new batch reuses create_batch(), but we don't need to do the work of pinning BOs we inherit from a previous batch...when that is actually part of the same execbuf invocation. instead, just flag it when setting primary_batch_size = 0, in iris_batch_reset	2019-02-21 10:26:09 -08:00
Kenneth Graunke	294ce58a30	iris: vma_free bo->size, not bo_size this is more obviously correct. I think the two end up being the same in practice, since this is in the alloc_from_cache case, and presumably bo from the bucket has bo->size == bucket->size, and bo_size also is bucket->size... still. better to do the obvious thing. brw_bufmgr already does it this way.	2019-02-21 10:26:09 -08:00
Kenneth Graunke	2f24000662	iris: drop a bunch of pipe_sampler_state stuff we don't need	2019-02-21 10:26:09 -08:00
Kenneth Graunke	c6016d3761	iris: just mark snapshots_landed from the CPU otherwise, get results may check q->map->snapshots_landed...before our commands to initialize it to false have actually executed...so it'd get some random garbage from the BO...	2019-02-21 10:26:09 -08:00
Kenneth Graunke	3c0ef22edb	iris: Enable ARB_shader_vote The easiest get out the vote campaign ever	2019-02-21 10:26:08 -08:00
Kenneth Graunke	0395eba20f	iris: magic number 36 -> #define	2019-02-21 10:26:08 -08:00
Kenneth Graunke	57f8a623c5	iris: better query file comment	2019-02-21 10:26:08 -08:00
Kenneth Graunke	d3a5d87219	iris: early return properly	2019-02-21 10:26:08 -08:00
Kenneth Graunke	07ff8c752f	iris: 36-bit overflow fixes	2019-02-21 10:26:08 -08:00
Kenneth Graunke	dff174c103	iris: Need to \| 1 when asking for timestamps	2019-02-21 10:26:08 -08:00
Kenneth Graunke	1d91eba7dc	iris: glGet timestamps, more correct timestamps	2019-02-21 10:26:08 -08:00
Kenneth Graunke	36fbcfb06c	iris: ...and SO prims emitted queries looks like we have queries some fails still due to races between snapshots_written and start/end not being garbage...not sure what that's about	2019-02-21 10:26:08 -08:00
Kenneth Graunke	ec82be57e8	iris: timestamps	2019-02-21 10:26:08 -08:00
Kenneth Graunke	23572cdd07	iris: drop explicit pinning writes will already rw_bo or ro_bo that	2019-02-21 10:26:08 -08:00
Kenneth Graunke	d8875fe406	iris: primitives generated query support	2019-02-21 10:26:08 -08:00
Kenneth Graunke	ffae6e3105	iris: pipeline stats	2019-02-21 10:26:08 -08:00
Kenneth Graunke	7840d0e091	iris: play chicken with timer queries for now they have been crashy in the past and I don't want to risk tanking my laptop right before my XDC talk	2019-02-21 10:26:08 -08:00
Kenneth Graunke	0b095c665d	iris: gpr0 to bool I think OQ is basically working now.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	f5a8908bd1	iris: fix random failures via CS stall...but why?	2019-02-21 10:26:08 -08:00
Kenneth Graunke	ad14795805	iris: flush batch when asking for result via QBO	2019-02-21 10:26:08 -08:00
Kenneth Graunke	cf261caad9	iris: results write	2019-02-21 10:26:08 -08:00
Kenneth Graunke	d4e4517569	iris: gen10+ workarounds and break fix	2019-02-21 10:26:08 -08:00
Kenneth Graunke	dca5632de1	iris: initial query code	2019-02-21 10:26:08 -08:00
Kenneth Graunke	dd478913d5	iris: LRM/SRM/SDI hooks	2019-02-21 10:26:08 -08:00
Kenneth Graunke	af9fe0d472	iris: rw_bo for pipe controls this is used for WRITE IMMEDIATE... but maybe we don't want to for the workaround BO?	2019-02-21 10:26:08 -08:00
Kenneth Graunke	30c370ed4b	iris: use 0 for TCS passthrough program string ID the passthrough shader doesn't need a real program string ID - that's basically used for ARB programs indicating total program source code changes, or other pre-baked uniform changes, etc...none of which a passthrough shader has...so we don't need a unique identifier to distinguish them. We want to use a consistent value so we find existing passthrough shaders in the cache.	2019-02-21 10:26:08 -08:00
Caio Marcelo de Oliveira Filho	54e23442e2	iris: Add support for TCS passthrough If no TCS is provided, create a "passthrough" TCS that will take the default values set in the API as constants and pass to the TES, along with any other inputs it expects. The code to create the NIR shader is the same as in i965. Tested with ./piglit run -t 'tess' quick_shader r and fixed a dozen crashes from that list.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	5395658c61	iris: inherit the index buffer properly	2019-02-21 10:26:08 -08:00
Kenneth Graunke	a858b69880	iris: delete bogus comment Caio asked what was wrong. There is nothing wrong. :)	2019-02-21 10:26:08 -08:00
Kenneth Graunke	f2f506fa43	iris: properly re-pin stencil buffers	2019-02-21 10:26:08 -08:00
Kenneth Graunke	aaced066e8	iris: fix context restore of 3DSTATE_CONSTANT ranges if clean we want to DO the pinning...not SKIP the pinning. thanks to Jordan Justen for catching this!	2019-02-21 10:26:08 -08:00
Kenneth Graunke	58a6c99ebe	iris: silence const warning not sure why this is labeled const, I'm pretty sure we are taking the reference and owning this, so there's no particular reason we can't change it. it certainly seems to be working for non-compute. and, freedreno's ir3_shader.c seems to do this as well. still...gross :/	2019-02-21 10:26:08 -08:00
Kenneth Graunke	897f8d9232	iris: refactor program CSO stuff	2019-02-21 10:26:08 -08:00
Caio Marcelo de Oliveira Filho	fb4a3e2736	iris: Fix uses of gl_TessLevel* The backend compiler expects the gl_TessLevel* variables to be mapped as inputs instead of system values. Use the new PIPE_CAP to get this behavior from GLSL compiler. Tested with: tests/spec/arb_tessellation_shader/execution/vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test	2019-02-21 10:26:08 -08:00
Kenneth Graunke	2b956a093a	iris: totally untested icelake support	2019-02-21 10:26:08 -08:00
Kenneth Graunke	921790b080	iris: initialize "don't suck" bits, as Ben likes to call them	2019-02-21 10:26:08 -08:00
Kenneth Graunke	73a4cef220	iris: refactor LRIs in context setup we're going to have more of them, so reduce the boilerplate	2019-02-21 10:26:08 -08:00
Kenneth Graunke	2d1db44e8e	iris: enable ARB_enhanced_layouts	2019-02-21 10:26:08 -08:00
Kenneth Graunke	c0422d623c	iris: re-pin binding table contents if we didn't re-emit them fixes glsl-vs-loop and other regressions from multibinder.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	2963276a58	iris: move binder pinning outside the dirty == 0 check This might be a new batch with back to back non-dirty calls, if so we need to inherit the old binder...	2019-02-21 10:26:08 -08:00
Chris Wilson	1a61a211f0	iris: fix memzone_for_address since multibinder changes	2019-02-21 10:26:08 -08:00
Kenneth Graunke	f6924e2379	iris: update comments for multibinder	2019-02-21 10:26:08 -08:00
Kenneth Graunke	5cb0527c4f	iris: fix SO offset writes for multiple streams	2019-02-21 10:26:08 -08:00
Kenneth Graunke	eff081cdd9	iris: Support multiple binder BOs, update Surface State Base Address	2019-02-21 10:26:08 -08:00
Kenneth Graunke	148e315d96	iris: fix null FB and unbound tex surface state addresses	2019-02-21 10:26:08 -08:00
Kenneth Graunke	f838400a59	iris: set EXEC_OBJECT_CAPTURE on all driver internal buffers	2019-02-21 10:26:08 -08:00
Kenneth Graunke	938afd484a	iris: fix constant buffer 0 to be absolute thanks to Jason for catching this. Fixes some va64 tests. Surprisingly not much else, as apparently getting to UBO range 4 is uncommon!	2019-02-21 10:26:08 -08:00
Kenneth Graunke	5a2257bb2f	iris: don't unconditionally emit 3DSTATE_VF / 3DSTATE_VF_TOPOLOGY this was just laziness on my part	2019-02-21 10:26:08 -08:00
Kenneth Graunke	4c27cb031c	iris: skip over whole function if dirty == 0 kinda pointless in non-pathological cases, but does boost our score in the drawarrays case by 50%...	2019-02-21 10:26:08 -08:00
Kenneth Graunke	888efcd192	iris: Allow inlining of require/get_command_space eliminates so many callqs for ptr++	2019-02-21 10:26:08 -08:00
Kenneth Graunke	2ebce6f8c8	iris: use Eric's new caps helper this does change a couple caps...PRIMITIVE_RESTART_FOR_PATCHES...	2019-02-21 10:26:08 -08:00
Kenneth Graunke	3e7a41f228	iris: new caps	2019-02-21 10:26:08 -08:00
Kenneth Graunke	52eb8d5593	iris: fix blend state memcpy thanks to Jason for noticing grumpy valgrind	2019-02-21 10:26:08 -08:00
Kenneth Graunke	9ce92fa036	iris: Skip primitive ID overrides if the shader wrote a custom value Fixes glsl-1.50/execution/geometry/primitive-id-out	2019-02-21 10:26:08 -08:00
Kenneth Graunke	47d3019c4a	iris: fix crash when binding optional shader for the first time	2019-02-21 10:26:08 -08:00
Kenneth Graunke	6331b754df	iris: handle level/layer in direct maps needed now that we do 1D linear	2019-02-21 10:26:08 -08:00
Kenneth Graunke	9f7654139b	iris: use linear for 1D textures This gets us the gen9 compact linear storage	2019-02-21 10:26:08 -08:00
Kenneth Graunke	b2a5e1ebb3	iris: big old hack for tex-miplevel-selection copied from ilo. I don't understand this at all..	2019-02-21 10:26:08 -08:00
Kenneth Graunke	e4d22b16c8	iris: fix sampler state setting	2019-02-21 10:26:08 -08:00
Kenneth Graunke	b3bb33c4c1	iris: try to hack around binder issue	2019-02-21 10:26:08 -08:00
Kenneth Graunke	d2516358f9	iris: fix line-aa-width we should probably move the roundf to st_atom_raster	2019-02-21 10:26:08 -08:00
Kenneth Graunke	701b47a197	iris: implement get_sample_position Fixes arb_sample_shading/builtin-gl-sample-position	2019-02-21 10:26:08 -08:00
Kenneth Graunke	7ed4b80233	iris: z_res -> s_res fixes crashes introduced a few commits ago	2019-02-21 10:26:08 -08:00
Kenneth Graunke	d1cb4b330a	iris: reenable R32G32B32 texture buffers This dropped us from GL 4.2 to GL 3.3 by mistake. Thanks to Dave for catching this!	2019-02-21 10:26:08 -08:00
Chris Wilson	367f6bbd01	iris: Record reusability of bo on construction We know that if the bufmgr->reuse is set to false or if the bo is too large for a bucket, the same will be true when we come to free the bo.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	abe7dbfa4a	iris: Reduce binder alignment from 64 to 32 3DSTATE_BINDING_TABLE_POINTER_XS's alignment requirement is only 32B. Makes us waste less precious binder space.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	04e8c5bb43	iris: precompute hashes for cache tracking saves a touch of cpu overhead in the new resolve tracking	2019-02-21 10:26:08 -08:00
Chris Wilson	d209cc5170	iris: AMD_pinned_memory (rebased by Ken, mainly set res->internal_format)	2019-02-21 10:26:08 -08:00
Kenneth Graunke	93c1921ce2	iris: proper cache tracking this is copied from the i965 aux resolve stuff...minus the aux resolves	2019-02-21 10:26:08 -08:00
Kenneth Graunke	5e30b1083b	iris: Move cache tracking to iris_resolve.c	2019-02-21 10:26:08 -08:00
Kenneth Graunke	42dccb1233	iris: use consistent copyright formatting some of them had typos, didn't say 'authors or copyright holders', or other mistakes. This is now https://opensource.org/licenses/MIT text, formatted consistently.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	1d33982e9b	iris: track depth/stencil writes enabled	2019-02-21 10:26:08 -08:00
Kenneth Graunke	3fecb1c44d	iris: Move iris_sampler_view declaration to iris_resource.h We'll need this for resolve tracking. There's also no genxml stuff here	2019-02-21 10:26:08 -08:00
Kenneth Graunke	b75b52530a	iris: Move things to iris_shader_state We didn't originally have this struct, so we had lots of ad-hoc arrays. Now that we have it, it makes sense to group things there.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	410a555bfb	iris: move iris_shader_state from ice->shaders.state to ice->state.shaders it's more state related...	2019-02-21 10:26:08 -08:00
Kenneth Graunke	33701d5341	iris: Drop bogus sampler state saving We do this in an earlier loop. This was just reading things out of the array, and saving them back over the same array...but in the wrong slots	2019-02-21 10:26:08 -08:00
Kenneth Graunke	aba2cee711	iris: rename pipe to base	2019-02-21 10:26:08 -08:00
Kenneth Graunke	7705f62cb6	iris: don't emit SBE all the time	2019-02-21 10:26:08 -08:00
Kenneth Graunke	630d602900	iris: port non-bucket alignment bugfix Sergii's `24839663a4`	2019-02-21 10:26:08 -08:00
Kenneth Graunke	ad6ba5a712	iris: drop pwrite nobody uses it	2019-02-21 10:26:08 -08:00
Kenneth Graunke	aad70ad8a1	iris: drop dead assignments Eric's commit `9a6a631762`	2019-02-21 10:26:08 -08:00
Kenneth Graunke	2bd7d6fa71	iris: last VUE map NOS, handle > 16 FS inputs not sure if the UNCOMPILED_FS flagging is still needed, should reevaluate those hacks at some point	2019-02-21 10:26:08 -08:00
Kenneth Graunke	ee8cb7e0ee	iris: implement ARB_clear_texture	2019-02-21 10:26:08 -08:00
Kenneth Graunke	84b30a2900	iris: call maybe_flush for each blorp operation otherwise with high layer counts we may exceed two batches worth of commands... (!)	2019-02-21 10:26:08 -08:00
Kenneth Graunke	0e059e4829	iris: assert depth is 1 in resource_copy_region given the dstz parameter I don't think it does multiple slices..	2019-02-21 10:26:08 -08:00
Kenneth Graunke	03933a2d1b	iris: blorp blit multiple slices fixes getteximage-depth	2019-02-21 10:26:08 -08:00
Kenneth Graunke	84832ab7d4	iris: Fix tiled memcpy for cubes...and for array slices tiled_memcpy_map was not offsetting map->ptr based on the slice, while unmap was. also, we were doing offsetting wrong for cubes.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	bce7398646	iris: disallow RGB32 formats too	2019-02-21 10:26:08 -08:00
Kenneth Graunke	ea19d359cc	iris: Convert RGBX to RGBA for rendering. Fixes a bunch of RGB bugs.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	906becec70	iris: we can do multisample Z resolves	2019-02-21 10:26:08 -08:00
Kenneth Graunke	1f156f004b	iris: deal with Marek's new MSAA caps storage sample count is equal to sample count for us, for now, so 0 the pipe cap and ignore the new parameter	2019-02-21 10:26:08 -08:00
Kenneth Graunke	532cf23d25	iris: say no to more formats copied from brw_surface_formats.c	2019-02-21 10:26:08 -08:00
Kenneth Graunke	d5146ba670	iris: actually do stencil blits	2019-02-21 10:26:08 -08:00
Kenneth Graunke	ad76389f88	iris: refcounting, who needs it? that's right, we do!	2019-02-21 10:26:08 -08:00
Kenneth Graunke	be60e3247c	iris: drop stencil handling now that u_transfer_helper does it	2019-02-21 10:26:08 -08:00
Kenneth Graunke	b932938d01	iris: use u_transfer_helper for depth stencil packing/unpacking	2019-02-21 10:26:08 -08:00
Kenneth Graunke	853230b5e6	iris: WTF transfers stencil unfortunately is stored in the Weird Tile Format (WTF or Tile-W) which needs special CPU detiling code.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	d93a20e258	iris: allow S8 as a stencil format	2019-02-21 10:26:08 -08:00
Kenneth Graunke	7972599eab	iris: actually emit stencil packets	2019-02-21 10:26:08 -08:00
Kenneth Graunke	753646dd6b	iris: clear stencil	2019-02-21 10:26:08 -08:00
Kenneth Graunke	9ec2d3640e	iris: depth or stencil fixes	2019-02-21 10:26:08 -08:00
Kenneth Graunke	763f9095ea	iris: fill out more caps	2019-02-21 10:26:08 -08:00
Kenneth Graunke	2d578e71d5	iris: get angry about execbuf failures want this to be easy to detect for now	2019-02-21 10:26:08 -08:00
Kenneth Graunke	a378ee3607	iris: simplify batch len qword alignment Split from a patch by Chris Wilson so I can test it independently	2019-02-21 10:26:08 -08:00
Kenneth Graunke	621cb43f41	iris: rename ring to engine makes more sense these days. split from a patch by Chris Wilson	2019-02-21 10:26:08 -08:00
Kenneth Graunke	1a9651f29a	iris: remember to set bo->userptr	2019-02-21 10:26:08 -08:00
Chris Wilson	796ad6fe97	iris: Wrap userptr for creating bo	2019-02-21 10:26:08 -08:00
Kenneth Graunke	5911fb8801	iris: sync bugfixes from brw_bufmgr I wrote softpin support here first, then debugged and landed it in brw; some of those fixes need to get brought back.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	dfe1ee4f6f	iris: comment everything 1. Write the code 2. Add comments 3. PROFIT (or just avoid cost of explaining or relearning things...)	2019-02-21 10:26:08 -08:00
Kenneth Graunke	387a414f2c	iris: add minor comments	2019-02-21 10:26:08 -08:00
Dave Airlie	9d39e69219	iris: fix some hangs around null framebuffers This fixes some cases in fbo-none* and framebuffer_no_attachments. I'm not sure this is correct otherwise, the tests don't all pass yet No idea if this is in any way the correct answer	2019-02-21 10:26:08 -08:00
Chris Wilson	02b82fe80a	iris: Set resource modifier on handle Required for gbm_bo_create_with_modifiers	2019-02-21 10:26:08 -08:00
Kenneth Graunke	682aeff8d0	iris: we don't support textureGatherOffsets, need it lowered	2019-02-21 10:26:08 -08:00
Kenneth Graunke	03dc99475d	iris: cube arrays are cubes too	2019-02-21 10:26:08 -08:00
Kenneth Graunke	80c7096672	iris: fix sample mask 0xffffffff does not mean 1, it means enable as many as there actually are. we don't get set_sample_mask() calls until some masking is actually applied...i.e. it doesn't get updated based on # of samples in the FBO changing.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	e990558152	iris: drop pipe_shader_state looking at the freedreno code, this is totally unnecessary! we can just store the NIR and be happy, and not have any vestiges of TGSI. plus we can reuse this structure for compute shaders, without needing a pipe_compute_state base.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	834b97c34b	iris: fix GS output component limit this is total, so should be 1024, not 128	2019-02-21 10:26:08 -08:00
Kenneth Graunke	c9f9a6f61b	iris: Avoid croaking when trying to create FBO surfaces with bad formats create_surface happens before st_validate_attachment, which actually does the "hey, this is a render target now, is that OK?" check Fixes asserts in ./bin/arb_texture_view-rendering-formats, allowing the rest of the tests to run.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	8da91ebb68	iris: enable texture gather	2019-02-21 10:26:08 -08:00
Kenneth Graunke	f3dd70182d	iris: BIG OL' HACK for UBO updates We need to re-push data when UBO changes. This will need to be replaced with a usage history based flushing system later.	2019-02-21 10:26:08 -08:00
Kenneth Graunke	a7311ef068	iris: update a todo comment	2019-02-21 10:26:07 -08:00
Kenneth Graunke	8e7b0deee2	iris: Don't reserve new binding table section unless things are dirty	2019-02-21 10:26:07 -08:00
Kenneth Graunke	870f2e8434	iris: implement texture/memory barriers	2019-02-21 10:26:07 -08:00
Kenneth Graunke	82ee971497	iris: drop unused bo parameter	2019-02-21 10:26:07 -08:00
Kenneth Graunke	f0159d5ca3	iris: update bindings when changing programs the binding table layout depends on program info. not known to fix anything yet.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	b0e9c5797b	iris: fix for disabling ssbos	2019-02-21 10:26:07 -08:00
Kenneth Graunke	b7b061c4e2	iris: fix SSBO indexing st/nir offsets SSBO indexes by MaxABOs. This is not what we want, as it bloats the binding tables. We'll need to adjust it to use info->num_abos as the offset and buffer base instead. For now, just use the inefficient format to get us rolling. We can add a PIPE_CAP later.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	376c7253f8	iris: enable SSBOs	2019-02-21 10:26:07 -08:00
Kenneth Graunke	75709d982b	iris: fix TBO alignment to match 965	2019-02-21 10:26:07 -08:00
Kenneth Graunke	77b9219818	iris: unbind compiled shaders if none are present avoids the case where you have a stale compiled shader bound, but no uncompiled shader bound, which is not just boats, but an entire marina	2019-02-21 10:26:07 -08:00
Kenneth Graunke	fd5ed7b46b	iris: shorten loop num_ubos doesn't include Tim's magic UBO for regular uniforms, so +1	2019-02-21 10:26:07 -08:00
Kenneth Graunke	bf795b0244	iris: emit binding table for atomic counters and SSBOs	2019-02-21 10:26:07 -08:00
Kenneth Graunke	2d5f545464	iris: implement set_shader_buffers for SSBOs/ABOs. We just stream out SURFACE_STATE for now...since it's a set_* API...and the buffer offset may change...not sure where else we'd do it.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	541cb60e7e	iris: export get_shader_info	2019-02-21 10:26:07 -08:00
Kenneth Graunke	f0558ca22c	iris: fix msaa flipping filters	2019-02-21 10:26:07 -08:00
Kenneth Graunke	2c73d7e3f1	iris: expose more things that we already support	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5b8dd5f303	iris: fix blorp filters we have to switch to blorp enums after the rebase, but also we were probably doing it wrong for MSAA before this.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	3aa1fcc65a	iris: hack around samples confusion	2019-02-21 10:26:07 -08:00
Kenneth Graunke	2c15f38a29	iris: point sprite enables	2019-02-21 10:26:07 -08:00
Kenneth Graunke	c60a4de1f5	iris: reemit blend state for alpha test function changes fixes bin/fbo-alphatest-formats GL_EXT_texture_snorm	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a4036635b1	iris: fix Z24 This was backwards. thanks to Jason Ekstrand for realizing that I was seeing the wrong bits.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a12a370d7b	iris: fix EmitNoIndirect we were using pipe stages, which are ordered dumbly for historical reasons. we want gl_shader_stage here. this got us the wrong options	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5bd861de8b	iris: assert about passthrough shaders to make this easier to detect otherwise it just silently fails and looks like some obscure problem	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5e19885d5a	iris: fill out MAX_PATCH_VERTICES	2019-02-21 10:26:07 -08:00
Kenneth Graunke	3e9e3121e5	iris: fix SGVS when there are no valid vertex elements tessellation nop.shader_test now passes	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5520a54bc5	iris: vertex ID, instance ID	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a9083bdb71	iris: don't emit SO_BUFFERS and SO_DECL_LIST unless streamout is enabled Otherwise on the first draw, if XFB isn't enabled, we get a pile of MI_NOOPS where SO_BUFFERS should be	2019-02-21 10:26:07 -08:00
Kenneth Graunke	ebb960c6d3	iris: compile a TCS...don't bother with passthrough yet	2019-02-21 10:26:07 -08:00
Kenneth Graunke	9aa8be3d8e	iris: TES program key inputs	2019-02-21 10:26:07 -08:00
Kenneth Graunke	fcee21da6b	iris: fix texture buffer stride	2019-02-21 10:26:07 -08:00
Kenneth Graunke	3c41d4cf3f	iris: fix sampler views of TBOs we can't read levels/layers, they're invalid for PIPE_BUFFER	2019-02-21 10:26:07 -08:00
Kenneth Graunke	6e7e49cc4f	iris: fix crash	2019-02-21 10:26:07 -08:00
Kenneth Graunke	841fc3e3ca	iris: record FS NOS	2019-02-21 10:26:07 -08:00
Kenneth Graunke	d223b316ad	iris: NOS mechanics	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a6d480f892	iris: bind state helper function	2019-02-21 10:26:07 -08:00
Kenneth Graunke	48b826cdaf	iris: s/hwcso/state/g	2019-02-21 10:26:07 -08:00
Kenneth Graunke	aeb6fc8782	iris: bits of multisample program key	2019-02-21 10:26:07 -08:00
Kenneth Graunke	e6b1cc2106	iris: save query type	2019-02-21 10:26:07 -08:00
Kenneth Graunke	44ba48eba7	iris: draw indirect support?	2019-02-21 10:26:07 -08:00
Kenneth Graunke	b030671298	iris: fix CC_VIEWPORT I was confusing depth bounds test with depth clamping	2019-02-21 10:26:07 -08:00
Kenneth Graunke	fdbc205552	iris: multislice transfer maps	2019-02-21 10:26:07 -08:00
Kenneth Graunke	44248d16d2	iris: disable 6x MSAA support	2019-02-21 10:26:07 -08:00
Kenneth Graunke	bc1b4db3b3	iris: fix sample mask for MSAA-off	2019-02-21 10:26:07 -08:00
Kenneth Graunke	7b8c0f058e	iris: actually pin the buffers	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5635abadef	iris: fix SO_DECL_LIST	2019-02-21 10:26:07 -08:00
Kenneth Graunke	dc3b927e97	iris: bother setting program_string_id... not sure how useful this really is... ./bin/ext_transform_feedback-tessellation triangles flat_first is hitting a case where we rebind the same VS program, but with different streamout info...which isn't in the key...but is in the cache...so we don't rebuild it...	2019-02-21 10:26:07 -08:00
Kenneth Graunke	9c1cefff52	iris: set even if no outputs	2019-02-21 10:26:07 -08:00
Kenneth Graunke	cef0b8b13b	iris: streamout	2019-02-21 10:26:07 -08:00
Kenneth Graunke	059c096eff	iris: SO buffers	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5c00f5fdca	iris: Implement 3DSTATE_SO_DECL_LIST	2019-02-21 10:26:07 -08:00
Kenneth Graunke	6794f1ffb9	iris: rearrange iris_resource.h	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a3f77eceb4	iris: slab allocate transfers apparently we need this for u_threaded_context	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5165308169	iris: don't crash on shader perf logs	2019-02-21 10:26:07 -08:00
Kenneth Graunke	f20fc950a7	iris: fix depth bounds clamp enables fixes depthrange-clear among others	2019-02-21 10:26:07 -08:00
Kenneth Graunke	eb274a31bc	iris: fix clip flagging on fb changes	2019-02-21 10:26:07 -08:00
Kenneth Graunke	0232fbc2c4	iris: comment out l/a/i/la in hopes of r/rg fallbacks	2019-02-21 10:26:07 -08:00
Kenneth Graunke	cf34dd7a61	iris: actually handle array layers in blits	2019-02-21 10:26:07 -08:00
Kenneth Graunke	33a17d566f	iris: keep DISCARD_RANGE this isn't really an iris_bo_map flag, but the various resource mappers want to check for it to avoid making temp copies.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	c0ab9c9890	iris: actually set cube bit properly	2019-02-21 10:26:07 -08:00
Kenneth Graunke	d849501f4c	iris: rename map->stride	2019-02-21 10:26:07 -08:00
Kenneth Graunke	36301bbe40	iris: fix zoffset asserts with 2DArray/Cube	2019-02-21 10:26:07 -08:00
Kenneth Graunke	7f39f4843f	iris: SBE change stash not used yet, but want to flag it so I don't forget	2019-02-21 10:26:07 -08:00
Kenneth Graunke	8a080223e6	iris: just malloc one iris_genx_state instead of a bunch of oddball pieces Things that are gen-specific can go in iris_genx_state. Things that are gen-agnostic can go directly in ice->state.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a7e0edffb6	iris: dead pointer	2019-02-21 10:26:07 -08:00
Kenneth Graunke	ccec5bab5b	iris: implement border color, fix other sampler nonsense	2019-02-21 10:26:07 -08:00
Kenneth Graunke	8a16249285	iris: border color memory zone :( They took away our pointer bits, so now we need a pile of special code to handle this instead of just using u_upload_mgr. :(	2019-02-21 10:26:07 -08:00
Kenneth Graunke	1c19e3b21f	iris: don't include binder in surface VMA range	2019-02-21 10:26:07 -08:00
Kenneth Graunke	1cea195a95	iris: state ref tuple	2019-02-21 10:26:07 -08:00
Kenneth Graunke	c0e80a8d0a	iris: null surface for unbound textures avoids crashes...may not be really right	2019-02-21 10:26:07 -08:00
Kenneth Graunke	d358a4a040	iris: depth clears	2019-02-21 10:26:07 -08:00
Kenneth Graunke	470fb01a7a	iris: fix GS dispatch mode	2019-02-21 10:26:07 -08:00
Kenneth Graunke	01483c7933	iris: fix 3DSTATE_VERTEX_ELEMENTS / VF_INSTANCING for 0 elements	2019-02-21 10:26:07 -08:00
Kenneth Graunke	4c9067ae1d	iris: don't emit garbage 3DSTATE_VERTEX_BUFFERS when there aren't any	2019-02-21 10:26:07 -08:00
Kenneth Graunke	adf0c20461	iris: geometry shader support	2019-02-21 10:26:07 -08:00
Kenneth Graunke	de08ac9b0f	iris: TES uniform fixes not that we have a TES, but...	2019-02-21 10:26:07 -08:00
Kenneth Graunke	d207f97840	iris: larger polygon offset	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5188e54e97	iris: fix provoking vertex ordering had this backwards	2019-02-21 10:26:07 -08:00
Kenneth Graunke	cbbd6a61c4	iris: maybe-flush before blorp operations otherwise if we have a lot of back-to-back blorp operations we can potentially overflow even the chained batch	2019-02-21 10:26:07 -08:00
Kenneth Graunke	e0f3971280	iris: lightmodel flat	2019-02-21 10:26:07 -08:00
Kenneth Graunke	4d04111bfb	iris: implement copy image	2019-02-21 10:26:07 -08:00
Kenneth Graunke	40fd2fd603	iris: fall back to u_generate_mipmap It just does blits between layers, which is all we'd do anyway, and it already should use BLORP because of iris_blit(). Plus it handles 3D, which our code in i965 doesn't.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	6cf04c6ded	iris: clear fix	2019-02-21 10:26:07 -08:00
Kenneth Graunke	d416b81779	iris: shader dirty bits	2019-02-21 10:26:07 -08:00
Kenneth Graunke	b7cd3a083a	iris: rework DEBUG_REEMIT don't want to have to special case this everywhere	2019-02-21 10:26:07 -08:00
Kenneth Graunke	72416a2d0d	iris: clears	2019-02-21 10:26:07 -08:00
Kenneth Graunke	eef0d33cee	iris: better boxing on maps	2019-02-21 10:26:07 -08:00
Kenneth Graunke	419fac2fc6	iris: fix fragcoord ytransform the TGSI in the name is a misnomer, it actually controls wpos_ytransform lowering in NIR these days.	2019-02-21 10:26:07 -08:00
Kenneth Graunke	e67951227d	iris: Disable unsupported mirror clamp modes	2019-02-21 10:26:07 -08:00
Kenneth Graunke	234cf647a4	iris: tidy comments about mirroring modes	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a3a998f19a	iris: iris - fix QWord aligned endings after batch chaining rework I need to save the primary batch size after expanding it to include MI_BATCH_BUFFER_END and the QWord padding NOP	2019-02-21 10:26:07 -08:00
Kenneth Graunke	aacbcbbf47	iris: colorize batchbuffer failures to make them stand out	2019-02-21 10:26:07 -08:00
Kenneth Graunke	8e2b71b190	iris: bad inherited comments	2019-02-21 10:26:07 -08:00
Kenneth Graunke	8c54433275	iris: Handle batch submission failure "better" We used to not reset the batch, and just keep appending to it, so you'd get the same invalid contents over and over. I'd also really like to know about this, so aborting seems wise for now, if not for the long term	2019-02-21 10:26:07 -08:00
Kenneth Graunke	d0b55ca782	iris: don't always flush	2019-02-21 10:26:07 -08:00
Kenneth Graunke	9226ebfa85	iris: print second batch size separately	2019-02-21 10:26:07 -08:00
Kenneth Graunke	f12b079c0e	iris: actually init num_viewports fixes regressions	2019-02-21 10:26:07 -08:00
Kenneth Graunke	81f899c148	iris: scissor count fixes	2019-02-21 10:26:07 -08:00
Kenneth Graunke	92d6a70853	iris: fix VP iteration	2019-02-21 10:26:07 -08:00
Kenneth Graunke	4a94628513	iris: fix num viewports to be based on programs	2019-02-21 10:26:07 -08:00
Kenneth Graunke	b17215800c	iris: fix viewport counts and settings seeing set_viewport_state 0 1 set_viewport_state 1 15 which gives us a total of 16 viewports, updated incrementally so keep old values around and update them...	2019-02-21 10:26:07 -08:00
Kenneth Graunke	636cf8971e	iris: max VP index	2019-02-21 10:26:07 -08:00
Kenneth Graunke	7cdc6b1173	iris: emit 3DSTATE_SBE_SWIZ	2019-02-21 10:26:07 -08:00
Kenneth Graunke	26db2ea782	iris: avoid crashing on unbound constant resources instead, read from the workaround BO	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a7770501a7	iris: fix caps so tests run again	2019-02-21 10:26:07 -08:00
Kenneth Graunke	a6aeca9727	iris: fix major refcounting bug with resources DONTBLOCK -> NULL was happening after taking a reference, causing those to live forever This resolves the OOM problems	2019-02-21 10:26:07 -08:00
Kenneth Graunke	49f9c88801	iris: support signed vertex buffer offsets	2019-02-21 10:26:07 -08:00
Kenneth Graunke	0a43c9defa	iris: print refcounts in INTEL_DEBUG=submit	2019-02-21 10:26:07 -08:00
Kenneth Graunke	7d1e6f1fa1	iris: redo VB CSO a bit	2019-02-21 10:26:07 -08:00
Kenneth Graunke	432790bacd	iris: print binder utilization in INTEL_DEBUG=submit	2019-02-21 10:26:07 -08:00
Kenneth Graunke	f8179dc760	iris: clean up some warnings so I can see through the noise	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5f3a7ee701	iris: use pipe resources not direct BOs	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5619c15ecc	iris: indentation	2019-02-21 10:26:07 -08:00
Kenneth Graunke	27d45eb2f2	iris: don't leak keyboxes when searching for an existing program	2019-02-21 10:26:07 -08:00
Kenneth Graunke	7d504f3d52	iris: don't leak sampler state table resources	2019-02-21 10:26:07 -08:00
Kenneth Graunke	8e186cef2c	iris: rzalloc iris_compiled_shader so memcmp works even if padding creeps in	2019-02-21 10:26:07 -08:00
Kenneth Graunke	5f722bf7c4	iris: remove 4 bytes of padding in iris_compiled_shader	2019-02-21 10:26:07 -08:00
Kenneth Graunke	0db86016f7	iris: pc fixes	2019-02-21 10:26:07 -08:00
Kenneth Graunke	f9f8ea7070	iris: more leak fixes	2019-02-21 10:26:07 -08:00
Kenneth Graunke	c763ecaa65	iris: plug leaks	2019-02-21 10:26:07 -08:00
Kenneth Graunke	477ea6c39a	iris: clear dirty	2019-02-21 10:26:07 -08:00
Kenneth Graunke	23987df412	iris: some dirty fixes two scissor bits, constants not being flagged, ZeroRTA, clip not being flagged	2019-02-21 10:26:07 -08:00
Kenneth Graunke	ccf37c7da9	iris: bindings dirty tracking	2019-02-21 10:26:07 -08:00
Kenneth Graunke	bbc6d15b59	iris: flag DIRTY_WM properly	2019-02-21 10:26:06 -08:00
Kenneth Graunke	3f863cf680	iris: fix the validation list on new batches	2019-02-21 10:26:06 -08:00
Kenneth Graunke	80dee31846	iris: save pointers to streamed state resources will be used for cross-batch validation list fixing	2019-02-21 10:26:06 -08:00
Kenneth Graunke	daceb04bc0	iris: put back the always flush - fixes some things :(	2019-02-21 10:26:06 -08:00
Kenneth Graunke	149408a360	iris: untested SAMPLER_STATE pin BO fix	2019-02-21 10:26:06 -08:00
Kenneth Graunke	de782e5b39	iris: delete some pointless STATIC_ASSERTS these were useful when I was patching relocs	2019-02-21 10:26:06 -08:00
Kenneth Graunke	3eebea88dc	iris: untested index buffer upload	2019-02-21 10:26:06 -08:00
Kenneth Graunke	9247546181	iris: state cleaning	2019-02-21 10:26:06 -08:00
Kenneth Graunke	7c40cdc12f	iris: comment about reemitting and flushing	2019-02-21 10:26:06 -08:00
Kenneth Graunke	d46c5b7c6c	iris: allow mapped buffers during execution (faster)	2019-02-21 10:26:06 -08:00
Kenneth Graunke	92de0f5aa6	iris: disable __gen_validate_value in release mode	2019-02-21 10:26:06 -08:00
Kenneth Graunke	08d1f13818	iris: drop assert for now	2019-02-21 10:26:06 -08:00
Kenneth Graunke	a9e357caac	iris: fix release builds	2019-02-21 10:26:06 -08:00
Kenneth Graunke	73f3c2cad0	iris: better VFI	2019-02-21 10:26:06 -08:00
Chris Wilson	2cbd42cddd	iris: IndexFormat = size/2 brw uses: IndexFormat = index_size >> 1 anv uses: IndexFormat = index_type[index_size]	2019-02-21 10:26:06 -08:00
Kenneth Graunke	5dcf62bb43	iris: use u_transfer helpers for now	2019-02-21 10:26:06 -08:00
Kenneth Graunke	48dc8bd4b0	iris: fix pull bufs that aren't the first user upload	2019-02-21 10:26:06 -08:00
Kenneth Graunke	eed7f7253e	iris: fill out pull constant buffers	2019-02-21 10:26:06 -08:00
Kenneth Graunke	90046b43cc	iris: make surface states for cbufs	2019-02-21 10:26:06 -08:00
Kenneth Graunke	4e007dbb30	iris: have more than one const_offset	2019-02-21 10:26:06 -08:00
Kenneth Graunke	9ea05ccf1f	iris: completely rewrite binder now we get a new one per batch, and flush if it fills up	2019-02-21 10:26:06 -08:00
Kenneth Graunke	26cc609927	iris: better ubo handling	2019-02-21 10:26:06 -08:00
Chris Wilson	a504b98e72	iris: fix import from dri2/3	2019-02-21 10:26:06 -08:00
Kenneth Graunke	badefe50a0	iris: fix constant packet length to match i965	2019-02-21 10:26:06 -08:00
Kenneth Graunke	201a4d923c	iris: maybe slightly less boats uniforms	2019-02-21 10:26:06 -08:00
Kenneth Graunke	a6dd9caf0d	iris: flush always	2019-02-21 10:26:06 -08:00
Kenneth Graunke	04d1a3a7de	iris: transfers	2019-02-21 10:26:06 -08:00
Kenneth Graunke	7437c28c0d	iris: util_copy_framebuffer_state (ported from Rob's v3d patches)	2019-02-21 10:26:06 -08:00
Kenneth Graunke	f6017da83f	iris: fix VF INSTANCING length	2019-02-21 10:26:06 -08:00
Kenneth Graunke	7fb7704b2e	iris: more depth stuffs... still missing stencil	2019-02-21 10:26:06 -08:00
Kenneth Graunke	02890c75b5	iris: fix 3DSTATE_VERTEX_ELEMENTS length	2019-02-21 10:26:06 -08:00
Kenneth Graunke	601ee4c189	iris: fix whitespace	2019-02-21 10:26:06 -08:00
Kenneth Graunke	4d24874236	iris: Lower the max number of decoded VBO lines saint foo, vbo lines!	2019-02-21 10:26:06 -08:00
Kenneth Graunke	48ddd7212d	iris: fix decoding and undo testing code	2019-02-21 10:26:06 -08:00
Kenneth Graunke	f31eea1f00	iris: fix batch chaining... don't chain a batch just for the end	2019-02-21 10:26:06 -08:00
Kenneth Graunke	5b914a6d58	iris: caps	2019-02-21 10:26:06 -08:00
Kenneth Graunke	604a1a1614	iris: chaining not growing	2019-02-21 10:26:06 -08:00
Kenneth Graunke	053fb51125	iris: just turn batch reset_and_clear_caches into reset	2019-02-21 10:26:06 -08:00
Kenneth Graunke	ca735c5e0c	iris: delete growing code and just die for now we need proper batch chaining. without relocations, we can't grow, since we've only allocated so much VMA for the batch, and the mechanism only works if we can pin it at the old address	2019-02-21 10:26:06 -08:00
Kenneth Graunke	7167c6d508	iris: blorp bug fixes I wrote this earlier, but it got lost somehow...	2019-02-21 10:26:06 -08:00
Kenneth Graunke	3650f8dfa1	iris: properly reject formats, fixes RGB32 rendering with texture float	2019-02-21 10:26:06 -08:00
Kenneth Graunke	4510098b9c	iris: proper # of uniforms or at least closer...we were using bytes, we want 256-bit units...	2019-02-21 10:26:06 -08:00
Kenneth Graunke	6091dc470f	iris: proper length for VE packet?	2019-02-21 10:26:06 -08:00
Kenneth Graunke	64a3f7423a	iris: uniforms for VS	2019-02-21 10:26:06 -08:00
Kenneth Graunke	d4a64e0a64	iris: bump GL version to 4.2	2019-02-21 10:26:06 -08:00
Kenneth Graunke	44993d451c	iris: some depth stuff :(	2019-02-21 10:26:06 -08:00
Kenneth Graunke	eb12cc70f0	iris: assert surf init	2019-02-21 10:26:06 -08:00
Kenneth Graunke	a4a426008b	iris: no more drawing rectangle in blorp there's some bug here as Jason's patches for only emitting 3DS_DR once got reverted by Mark later on, apparently they regressed MSAA tests. need to sort that out.	2019-02-21 10:26:06 -08:00
Kenneth Graunke	0e3870b9de	iris: blorp URB	2019-02-21 10:26:06 -08:00
Kenneth Graunke	01fe6df0ed	iris: make blorp pin the binder	2019-02-21 10:26:06 -08:00
Kenneth Graunke	063fc7bbb0	iris: linear staging buffers - fast CPU access...	2019-02-21 10:26:06 -08:00
Kenneth Graunke	84abf77c67	iris: hacky flushing for now	2019-02-21 10:26:06 -08:00
Kenneth Graunke	75a1639262	iris: drop the 48b printout, we never use anything else	2019-02-21 10:26:06 -08:00
Kenneth Graunke	86d7fd71f4	iris: add INTEL_DEBUG=reemit	2019-02-21 10:26:06 -08:00
Kenneth Graunke	b8a11ad256	iris: fix blorp prog data crashes	2019-02-21 10:26:06 -08:00
Kenneth Graunke	e2ba98ba39	iris: more blorp	2019-02-21 10:26:06 -08:00
Kenneth Graunke	1bba60a4bf	iris: fix sampler view crashes	2019-02-21 10:26:06 -08:00
Kenneth Graunke	e22da1e7b1	iris: drop bogus binder free I was malloc'ing it but then I changed my mind and embedded it directly	2019-02-21 10:26:06 -08:00
Kenneth Graunke	698d45b725	iris: more blitting code to make readpixels work	2019-02-21 10:26:06 -08:00
Kenneth Graunke	c9d9e44720	iris: bits of blorp code	2019-02-21 10:26:06 -08:00
Kenneth Graunke	79466c1313	iris: move bo_offset_from_sba for wider use	2019-02-21 10:26:06 -08:00
Kenneth Graunke	60d708bb80	iris: copy over i965's cache tracking needed to split out vtbl so I can pipe control without ice	2019-02-21 10:26:06 -08:00
Kenneth Graunke	dbd4770397	iris: pull in newer comments	2019-02-21 10:26:06 -08:00
Kenneth Graunke	841b3b9003	iris: Defines for base addresses rather than numbers everywhere	2019-02-21 10:26:06 -08:00
Kenneth Graunke	c75a1254a4	iris: Move get_command_space to iris_batch.c for reuse in blorp. it's a better interface anyway.	2019-02-21 10:26:06 -08:00
Kenneth Graunke	39e795d473	iris: fix texturing!	2019-02-21 10:26:06 -08:00
Kenneth Graunke	4929f020c3	iris: better SBE	2019-02-21 10:26:06 -08:00
Kenneth Graunke	8bf167c9e9	iris: vma - fix assert	2019-02-21 10:26:06 -08:00
Kenneth Graunke	10e4f1e68c	iris: vma fixes - don't free binder address	2019-02-21 10:26:06 -08:00
Kenneth Graunke	5a101e6434	iris: bo reuse	2019-02-21 10:26:06 -08:00
Kenneth Graunke	21acc00490	iris: crazy pipe control code imported from ~kwg/mesa pcx-2, gen < 8 code dropped	2019-02-21 10:26:06 -08:00
Kenneth Graunke	87aa880795	iris: fixes	2019-02-21 10:26:06 -08:00
Kenneth Graunke	3fbf7294b1	iris: fixes from i965	2019-02-21 10:26:06 -08:00
Kenneth Graunke	999ed6e213	iris: port bug fix from i965	2019-02-21 10:26:05 -08:00
Kenneth Graunke	19d11a6df3	iris: fix index	2019-02-21 10:26:05 -08:00
Kenneth Graunke	010e845af7	iris: increase allocator alignment	2019-02-21 10:26:05 -08:00
Kenneth Graunke	35afa8c8f3	iris: better BT asserts Probably nothing is working because texture upload isn't implemented	2019-02-21 10:26:05 -08:00
Kenneth Graunke	0148bd6839	iris: decoder fixes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	5d2673ba7e	iris: set sampler views	2019-02-21 10:26:05 -08:00
Kenneth Graunke	34164ce622	iris: isv freeing fixes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	012154c20f	iris: TES stash TODO: key setup	2019-02-21 10:26:05 -08:00
Kenneth Graunke	d890aee15d	iris: SBA once at context creation, not per batch hooray!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	e0eac28bd4	iris: fix a scissor bug	2019-02-21 10:26:05 -08:00
Kenneth Graunke	0707ff3f2f	iris: assemble SAMPLER_STATE table at bind time It's useless to allocate SAMPLER_STATEs in GPU memory on creation like we do for SURFACE_STATES, because they need to be organized into a contiguous block of memory. But we can do that at bind time, rather than draw time.	2019-02-21 10:26:05 -08:00
Kenneth Graunke	199c080926	iris: same treatment for sampler views	2019-02-21 10:26:05 -08:00
Kenneth Graunke	f51204a160	iris: allocate SURFACE_STATEs up front and stop streaming them	2019-02-21 10:26:05 -08:00
Kenneth Graunke	bf90d8a125	iris: delete more trash	2019-02-21 10:26:05 -08:00
Kenneth Graunke	1398c99aff	iris: canonicalize addresses. Back to working! Woo!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	b69a85bc4d	iris: validation dumping improvements backported from i965. don't bother with (pinned) because everything is.	2019-02-21 10:26:05 -08:00
Kenneth Graunke	24bcf1054b	iris: update vb BO handling now that we have softpin	2019-02-21 10:26:05 -08:00
Kenneth Graunke	9ac81f1890	iris: decoder fixes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	9955e8334b	iris: binder fixes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	65073c2217	iris: hook up batch decoder	2019-02-21 10:26:05 -08:00
Kenneth Graunke	6cbd1d1692	iris: binders	2019-02-21 10:26:05 -08:00
Kenneth Graunke	209692c716	iris: include p_defines.h in iris_bufmgr.h for PIPE_TRANSFER_WRITE and friends	2019-02-21 10:26:05 -08:00
Kenneth Graunke	1af84d345a	iris: set EXEC_OBJECT_WRITE	2019-02-21 10:26:05 -08:00
Kenneth Graunke	651be7cf3d	iris: rewrite to use memzones and not relocs	2019-02-21 10:26:05 -08:00
Kenneth Graunke	68229caa38	iris: more uploaders	2019-02-21 10:26:05 -08:00
Kenneth Graunke	3861d24e23	iris: Also set SUPPORTS_48B? Not sure if necessary.	2019-02-21 10:26:05 -08:00
Kenneth Graunke	e95ad5994a	iris: dump gtt offset in dump_validation_list	2019-02-21 10:26:05 -08:00
Kenneth Graunke	d78be0188e	iris: fix icache memzone	2019-02-21 10:26:05 -08:00
Kenneth Graunke	e4aa8338c3	iris: Soft-pin the universe Breaks everything, woo!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	3693307670	iris: some thinking about binding tables	2019-02-21 10:26:05 -08:00
Kenneth Graunke	f6be3d4f3a	iris: bufmgr updates. Drop BO_ALLOC_BUSY (best not to hand people a loaded gun...) Drop vestiges of alignment	2019-02-21 10:26:05 -08:00
Kenneth Graunke	902a122404	iris: stop adding 9 to our varyings	2019-02-21 10:26:05 -08:00
Kenneth Graunke	a235da3e68	iris: set strides on transfers	2019-02-21 10:26:05 -08:00
Kenneth Graunke	6891f70d87	iris: enable a few more formats	2019-02-21 10:26:05 -08:00
Kenneth Graunke	7130c43d96	iris: decode batches if they fail to submit	2019-02-21 10:26:05 -08:00
Kenneth Graunke	23367688e9	iris: NOOP pad batches correctly	2019-02-21 10:26:05 -08:00
Kenneth Graunke	f3150e9ecd	iris: warn if execbuf fails	2019-02-21 10:26:05 -08:00
Kenneth Graunke	a50a3a8edf	iris: uniform bits...badly	2019-02-21 10:26:05 -08:00
Kenneth Graunke	213b70a222	iris: sample mask...not 0. We now have a first triangle!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	1a6bb266cf	iris: write DISABLES are not write ENABLES...whoops	2019-02-21 10:26:05 -08:00
Kenneth Graunke	50a2596f46	iris: fix extents	2019-02-21 10:26:05 -08:00
Kenneth Graunke	ffcd84f55a	iris: catastrophic state pointer mistake	2019-02-21 10:26:05 -08:00
Kenneth Graunke	1739dc0d5e	iris: more SF CL VPs	2019-02-21 10:26:05 -08:00
Kenneth Graunke	ade381fb9c	iris: fix dmabuf retval comparisons 0 means success	2019-02-21 10:26:05 -08:00
Kenneth Graunke	ed42ae2f9b	iris: more sketchy SBE	2019-02-21 10:26:05 -08:00
Kenneth Graunke	9be4b3baaf	iris: compctrl oh, also run things	2019-02-21 10:26:05 -08:00
Kenneth Graunke	db15993cfd	iris: actually pin the instruction cache buffers	2019-02-21 10:26:05 -08:00
Kenneth Graunke	bda9a77b47	iris: smaller blend state	2019-02-21 10:26:05 -08:00
Kenneth Graunke	f9d834d588	iris: don't do samplers for disabled stages	2019-02-21 10:26:05 -08:00
Kenneth Graunke	e21bddeb4f	iris: render targets!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	8503578e82	iris: fix silly unused batch with addr macro	2019-02-21 10:26:05 -08:00
Kenneth Graunke	352ec1f378	iris: warning fixes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	54ba8a60d5	iris: basic SBE code	2019-02-21 10:26:05 -08:00
Kenneth Graunke	5af16f5e20	iris: alpha testing in PSB	2019-02-21 10:26:05 -08:00
Kenneth Graunke	c96132d5fd	iris: blend state	2019-02-21 10:26:05 -08:00
Kenneth Graunke	bb3c0be7a8	iris: dummy constants	2019-02-21 10:26:05 -08:00
Kenneth Graunke	538decc0de	iris: URB configs.	2019-02-21 10:26:05 -08:00
Kenneth Graunke	b1115799e6	iris: actually set KSP offsets	2019-02-21 10:26:05 -08:00
Kenneth Graunke	6f1c07d7dd	iris: actually softpin at an address	2019-02-21 10:26:05 -08:00
Kenneth Graunke	acdff2f9a6	iris: actually destroy the cache	2019-02-21 10:26:05 -08:00
Kenneth Graunke	9437e135ed	iris: rewrite program cache to use u_upload_mgr	2019-02-21 10:26:05 -08:00
Kenneth Graunke	67ca2be992	iris: no NEW_SBA	2019-02-21 10:26:05 -08:00
Kenneth Graunke	e7a729ba34	iris: shuffle comments	2019-02-21 10:26:05 -08:00
Kenneth Graunke	6ecc93f764	iris: bits of WM key	2019-02-21 10:26:05 -08:00
Kenneth Graunke	bba13b1501	iris: move key pop to state module shader key population needs to read state	2019-02-21 10:26:05 -08:00
Kenneth Graunke	5864c9414a	iris: fix SBA	2019-02-21 10:26:05 -08:00
Kenneth Graunke	5ae278da18	iris: use vtbl to avoid multiple symbols, fix state base address	2019-02-21 10:26:05 -08:00
Kenneth Graunke	876417f9e8	iris: softpin some things	2019-02-21 10:26:05 -08:00
Kenneth Graunke	c493fee73f	iris: drop const from prog data parameters we ralloc steal things, which makes it a little bogus	2019-02-21 10:26:05 -08:00
Kenneth Graunke	cf7ba838ad	iris: more comes from bits filled in tomorrow, fix the build system to avoid symbol clashes somehow... we're getting gen9 functions because they happen to be listed before 10 in the link list.	2019-02-21 10:26:05 -08:00
Kenneth Graunke	8dffc9b195	iris: index buffer BO	2019-02-21 10:26:05 -08:00
Kenneth Graunke	8665dfd602	iris: WM. I could have added a dirty bit for this, but it doesn't seem worth it	2019-02-21 10:26:05 -08:00
Kenneth Graunke	bae5414594	iris: initial gpu state	2019-02-21 10:26:05 -08:00
Kenneth Graunke	0477591355	iris: reorganize commands to match brw	2019-02-21 10:26:05 -08:00
Kenneth Graunke	3e684d0eb7	iris: don't forget about TE	2019-02-21 10:26:05 -08:00
Kenneth Graunke	d71d2028ef	iris: convert IRIS_DIRTY_* to #defines enums are SIGNED. so IRIS_DIRTY_VS << 4 gets sign extended, making it not equal to IRIS_DIRTY_FS. Surprising!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	cfd5fcc256	iris: emit shader packets	2019-02-21 10:26:05 -08:00
Kenneth Graunke	1cf21cc813	iris: actually save derived state	2019-02-21 10:26:05 -08:00
Kenneth Graunke	57c1b71418	iris: promote iris_program_cache_item to iris_compiled_shader	2019-02-21 10:26:05 -08:00
Kenneth Graunke	581459a9fe	iris: some shader bits	2019-02-21 10:26:05 -08:00
Kenneth Graunke	df401aaa11	iris: scissor slots	2019-02-21 10:26:05 -08:00
Kenneth Graunke	dc4453d886	iris: bind_state -> compute state	2019-02-21 10:26:05 -08:00
Kenneth Graunke	2f100c6e31	iris: 3DPRIMITIVE fields	2019-02-21 10:26:05 -08:00
Kenneth Graunke	b3646e2b48	iris: fix VF instancing length so we don't get garbage in batch	2019-02-21 10:26:05 -08:00
Kenneth Graunke	317263ab11	iris: vertex packet fixes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	129fae5a90	iris: fix VBs	2019-02-21 10:26:05 -08:00
Kenneth Graunke	fc5ddc64f9	iris: fix assert	2019-02-21 10:26:05 -08:00
Kenneth Graunke	e91289908a	iris: fix indentation	2019-02-21 10:26:05 -08:00
Kenneth Graunke	41b32a4eda	iris: hack to stop crashing on samplers for now	2019-02-21 10:26:05 -08:00
Kenneth Graunke	dcfb06375a	iris: initialize dirty bits to ~0ull	2019-02-21 10:26:05 -08:00
Kenneth Graunke	0a513d63a1	iris: actually advance forward when emitting commands	2019-02-21 10:26:05 -08:00
Kenneth Graunke	24cc627612	iris: actually flush the commands	2019-02-21 10:26:05 -08:00
Kenneth Graunke	082911409e	iris: actually APPEND commands, not stomp over the top and never incr	2019-02-21 10:26:05 -08:00
Kenneth Graunke	b332ff489c	iris: VB fixes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	50b1e01996	iris: DEBUG=bat Deleted in the interest of making the branch compile at each step	2019-02-21 10:26:05 -08:00
Kenneth Graunke	6e01bc0637	iris: VB addresses	2019-02-21 10:26:05 -08:00
Kenneth Graunke	b574b56325	iris: reference VB BOs	2019-02-21 10:26:05 -08:00
Kenneth Graunke	4dc683f64b	iris: so, sba then.	2019-02-21 10:26:05 -08:00
Kenneth Graunke	d900a235b1	iris: try and have an iris address	2019-02-21 10:26:05 -08:00
Kenneth Graunke	f31ae76216	iris: flag SBA updates when instruction BO changes	2019-02-21 10:26:05 -08:00
Kenneth Graunke	7d90cc8da4	iris: bit of SBA code genxml MOCS is stupid, addresses are hard news at 11	2019-02-21 10:26:05 -08:00
Kenneth Graunke	ff5c886fb3	iris: move MAX defines to iris_batch.h for SBA	2019-02-21 10:26:05 -08:00
Kenneth Graunke	7bfc8f7d7d	iris: kill iris_new_batch reset and new are too similar, and this had exactly one caller	2019-02-21 10:26:05 -08:00
Kenneth Graunke	b701096ab9	iris: make iris_batch target a particular ring	2019-02-21 10:26:05 -08:00
Kenneth Graunke	64f043570d	iris: lower io	2019-02-21 10:26:05 -08:00
Kenneth Graunke	695bd55d1a	iris: do the FS...asserts because we don't lower uniforms yet	2019-02-21 10:26:05 -08:00
Kenneth Graunke	6aa15cadf3	iris: import program cache code	2019-02-21 10:26:05 -08:00
Kenneth Graunke	4525dda75f	iris: reworks, FS compile pieces	2019-02-21 10:26:05 -08:00
Kenneth Graunke	628a71c2e3	iris: parse INTEL_DEBUG	2019-02-21 10:26:05 -08:00
Kenneth Graunke	d62b0b9ee8	iris: draw->restart_index is uninitialized if PR is not enabled	2019-02-21 10:26:05 -08:00
Kenneth Graunke	5fad62cef1	iris: fix bogus index buffer reference	2019-02-21 10:26:05 -08:00
Kenneth Graunke	95fe254cf2	iris: fix prim type	2019-02-21 10:26:05 -08:00
Kenneth Graunke	793276cd8b	iris: msaa sample count packing problems 0 -> ffffffffffffffffffffffffffff	2019-02-21 10:26:05 -08:00
Kenneth Graunke	0252fb36e9	iris: actually save VBs	2019-02-21 10:26:05 -08:00
Kenneth Graunke	ed6ee3e270	iris: fix/rework line stipple	2019-02-21 10:26:05 -08:00
Kenneth Graunke	231935efa2	iris: init the batch!	2019-02-21 10:26:05 -08:00
Kenneth Graunke	9ca58ca517	iris: delete iris_pipe.c, shuffle code around	2019-02-21 10:26:05 -08:00
Kenneth Graunke	455e2d6dce	iris: disable execbuf for now	2019-02-21 10:26:05 -08:00
Kenneth Graunke	86e0c08b14	iris: make an ice->render_batch field we may want a second one for transfers	2019-02-21 10:26:05 -08:00
Kenneth Graunke	ffd7f13b4d	iris: drop unused field	2019-02-21 10:26:05 -08:00
Kenneth Graunke	8097dc9dd9	iris: shader debug log	2019-02-21 10:26:05 -08:00
Kenneth Graunke	6c7a276470	iris: maps	2019-02-21 10:26:05 -08:00
Kenneth Graunke	49896861ce	iris: linear resources	2019-02-21 10:26:05 -08:00
Kenneth Graunke	c820f5a4bd	iris: some program code	2019-02-21 10:26:04 -08:00
Kenneth Graunke	d48dc416fa	iris: basic push constant alloc	2019-02-21 10:26:04 -08:00
Kenneth Graunke	21c016b496	iris: emit 3DSTATE_SAMPLER_STATE_POINTERS	2019-02-21 10:26:04 -08:00
Kenneth Graunke	7b80f4587d	iris: sampler states	2019-02-21 10:26:04 -08:00
Kenneth Graunke	60208d12b4	iris: COLOR_CALC_STATE	2019-02-21 10:26:04 -08:00
Kenneth Graunke	9367c44639	iris: fix crash - CSO binding can be NULL (when destroying context)	2019-02-21 10:26:04 -08:00
Kenneth Graunke	efea4d96d9	iris: some draw info, vbs, sample mask	2019-02-21 10:26:04 -08:00
Kenneth Graunke	d6ad9f4732	iris: a bit of depth still need to allocate separate stencil	2019-02-21 10:26:04 -08:00
Kenneth Graunke	7abe5aefd3	iris: fix SF_CL length	2019-02-21 10:26:04 -08:00
Kenneth Graunke	c1c6c3a18a	iris: don't segfault on !old_cso	2019-02-21 10:26:04 -08:00
Kenneth Graunke	3eadb1b3a1	iris: framebuffers	2019-02-21 10:26:04 -08:00
Kenneth Graunke	e7c9bddda7	iris: stipples and vertex elements	2019-02-21 10:26:04 -08:00
Kenneth Graunke	d0aab78dc3	iris: sampler views	2019-02-21 10:26:04 -08:00
Kenneth Graunke	831d630b8b	iris: Surfaces!	2019-02-21 10:26:04 -08:00
Kenneth Graunke	4ec5f8be3e	iris: SF_CLIP_VIEWPORT	2019-02-21 10:26:04 -08:00
Kenneth Graunke	970836c34e	iris: scissors	2019-02-21 10:26:04 -08:00
Kenneth Graunke	7c875deaf0	iris: RASTER + SF + some CLIP, fix DIRTY vs. NEW	2019-02-21 10:26:04 -08:00
Kenneth Graunke	02f583b0a0	iris: initial gpu state, merges	2019-02-21 10:26:04 -08:00
Kenneth Graunke	a13d417ac1	iris: merge pack this lets us merge dynamic and pre-baked state, also like anv	2019-02-21 10:26:04 -08:00
Kenneth Graunke	aee39df710	iris: packing with valgrind. borrowed macros from anv!	2019-02-21 10:26:04 -08:00
Kenneth Graunke	d3d6ef37f6	iris: initial render state upload	2019-02-21 10:26:04 -08:00
Kenneth Graunke	26fb5a8ae2	iris: port over batchbuffer updates	2019-02-21 10:26:04 -08:00
Kenneth Graunke	14ca30507f	iris: viewport state, sort of	2019-02-21 10:26:04 -08:00
Kenneth Graunke	2dce0e94a3	iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs. This commit introduces a new Gallium driver for Intel Gen8+ GPUs, named 'iris_dri.so' after the hardware. Developed by: - Kenneth Graunke (overall driver) - Dave Airlie (shaders, conditional render, overflow query, Gen8 port) - Chris Wilson (fencing, pinned memory, ...) - Jordan Justen (compute shaders) - Jason Ekstrand (image load store) - Caio Marcelo de Oliveira Filho (tessellation control passthrough) - Rafael Antognolli (auxiliary buffer fixes) - The rest of the i965 contributors and the Mesa community	2019-02-21 10:26:04 -08:00
James Zhu	eac822eac1	gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646 Problem 1,4: they are caused by incomplete blend compute shader implementation. So revert rgba back to fragment shader. Fixes: `9364d66cb7` (Add video compositor compute shader render) Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Tested-by: Bruno Milreu <bmilreu@gmail.com>	2019-02-21 13:11:53 -05:00
Lionel Landwerlin	20c370c6b1	vulkan: add an overlay layer Just a starting point to display frame timings & drawcalls/submissions per frame. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> +1-by: Mike Lothian <mike@fireburn.co.uk> +1-by: Tapani Pälli <tapani.palli@intel.com> +1-by: Eric Engestrom <eric.engestrom@intel.com> +1-by: Yurii Kolesnykov <root@yurikoles.com> +1-by: myfreeweb <greg@unrelenting.technology> +1-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 18:06:05 +00:00
Lionel Landwerlin	89f03d1872	imgui: make sure our copy of imgui doesn't clash with others in the same process Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> +1-by: Mike Lothian <mike@fireburn.co.uk> +1-by: Tapani Pälli <tapani.palli@intel.com> +1-by: Eric Engestrom <eric.engestrom@intel.com> +1-by: Yurii Kolesnykov <root@yurikoles.com> +1-by: myfreeweb <greg@unrelenting.technology> +1-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 18:06:05 +00:00
Lionel Landwerlin	3950e7c11e	imgui: bump copy Updated at : commit f977871854af941289f2a9090dcc90f7aa3449a8 Author: omar <omarcornut@gmail.com> Date: Fri Feb 15 13:10:22 2019 +0100 ImFont: Minor adjustment to the structure. Examples: Removed unused variable. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> +1-by: Mike Lothian <mike@fireburn.co.uk> +1-by: Tapani Pälli <tapani.palli@intel.com> +1-by: Eric Engestrom <eric.engestrom@intel.com> +1-by: Yurii Kolesnykov <root@yurikoles.com> +1-by: myfreeweb <greg@unrelenting.technology> +1-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 18:06:05 +00:00
Lionel Landwerlin	51047cd2e8	build: move imgui out of src/intel/tools to be reused Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> +1-by: Mike Lothian <mike@fireburn.co.uk> +1-by: Tapani Pälli <tapani.palli@intel.com> +1-by: Eric Engestrom <eric.engestrom@intel.com> +1-by: Yurii Kolesnykov <root@yurikoles.com> +1-by: myfreeweb <greg@unrelenting.technology> +1-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 18:06:05 +00:00
Jason Ekstrand	f98fd9d15a	nir/lower_clip_cull: Fix an incorrect assert Copy+paste error. It was supposed to test cull and not clip. Fixes: `4e69fba534` "nir: Rewrite lower_clip_cull_distance_arrays..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-21 12:05:12 -06:00
Jason Ekstrand	f9b2f10a41	nir: Fix a compile warning	2019-02-21 09:44:42 -06:00
Rob Clark	908d5ee9eb	freedreno/a6xx: enable tiled images Turns out we can write to tiled images as well as read. This avoids having to linearize or do the tiling in the shader. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-21 09:06:06 -05:00
Alejandro Piñeiro	0629b2a462	nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fs On GLSL that info is set as a layout qualifier when redeclaring gl_FragCoord, so somehow tied to a specific variable. But in practice, they behave as a global of the shader. On ARB programs they are set using a global OPTION (defined at ARB_fragment_coord_conventions), and on SPIR-V using ExecutionModes, that are also not tied specifically to the builtin. This patch moves that info from nir variable and ir variable to nir shader and gl_program shader_info respectively, so the map is more similar to SPIR-V, and ARB programs, instead of more similar to GLSL. FWIW, shader_info.fs already had pixel_center_integer, so this change also removes some redundancy. Also, as struct gl_program also includes a shader_info, we removed gl_program::OriginUpperLeft and PixelCenterInteger, as it would be superfluous. This change was needed because recently spirv_to_nir changed the order in which execution modes and variables are handled, so the variables didn't get the correct values. Now the info is set on the shader itself, and we don't need to go back to the builtin variable to set it. Fixes: `e68871f6a` ("spirv: Handle constants and types before execution modes") v2: (Jason) * glsl_to_nir: get the info before glsl_to_nir, while all the rest of the info gathering is happening * prog_to_nir: gather the info on a general info-gathering pass, not on variable setup. v3: (Jason) * Squash with the patch that removes that info from ir variable * anv: assert that OriginUpperLeft is true. It should be already set by spirv_to_nir. * blorp: set origin_upper_left on its core "compile fragment shader", not just on some specific places (for this we added a helper on a previous patch). * prog_to_nir: no need to gather specifically these fragcoord modes as the full gl_program shader_info is copied. * spirv_to_nir: assert that we are a fragment shader when handling these execution modes. 
v4: (reported by failing gitlab pipeline #18750) * state_tracker: update too due to changes on ir.h/gl_program v5: * blorp: minor change after change on previous patch * radeonsi: update due to this change. v6: (Timothy Arceri) * prog_to_nir: remove extra whitespace * shader_info: don't use :1 on origin_upper_left * glsl: program.fs.origin_upper_left/pixel_center_integer can be moved out of the shader list loop	2019-02-21 11:47:59 +01:00
Alejandro Piñeiro	675eabb560	blorp: introduce helper method blorp_nir_init_shader This initializes the nir shader that will be used by blorp. Right now it doesn't do too much beyond calling nir_builder_init_simple_shader, and setting a name. More stuff will be added on following patches. v2: there is a case where a VERTEX_SHADER is used (Alejandro)	2019-02-21 11:47:51 +01:00
Alyssa Rosenzweig	705723e6be	panfrost: Verify and print brx condition in disasm The condition code in extended branches is repeated 8 times for unclear reasons; accordingly, the code would be disassembled as "unknown5555", "unknownAAAA", etc. This patch correctly masks off the lower two bits to find the true code to print, verifying that the code is repeated as believed to be necessary (providing some assurance for compiler quality and an assert trip in case we encounter a shader in the wild that breaks the convention). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:09:06 +00:00
Alyssa Rosenzweig	779e140b1a	panfrost: Dynamically set discard branch targets discard and discard_if are both implemented with the branching pipeline on Midgard; essentially, we branch to the end of the fragment shader in a special "discard" mode, setting the condition as necessary. Previously, we hardcoded the form of this instruction, which worked for very simple shaders but was incorrect for anything remotely interesting. This patch instead emits logical branches in the IR, which are flattened to real discard ops the same way other branches are, allowing targets to be computed correctly. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:08:59 +00:00
Alyssa Rosenzweig	5abb7b559e	panfrost/midgard: Emit extended branches Previously, we only emitted compact branches; however, the offset range of these branches is too small for many real world shaders. This patch implements support for emitting extended branches and switches to always using them for control flow. This incurs a code size and possibly performance penalty, but expands the range of working shaders and provides opportunity for further optimization. Support for emitting compact branches is retained but this code path is presently unused. In the future, we'll want to heuristically determine which type of branch should be emitted for optimal codegen. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:08:47 +00:00
Alyssa Rosenzweig	813bb34fd8	panfrost: Rectify doubleplusungood extended branch Midgard features "compact branches" and "extended branches", which correspond to short jumps and far jumps. The form of the extended branch was previously incorrect in the ISA headers; this patch corrects it and updates the disassembler (simultaneously, to preserve bisectability). Additionally, we fix a corner case in the disassembly of extended branches, and we now prefix extended branches with "brx", to visually differentiate them from compact branches prefixed with "br". Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:07:39 +00:00
Alyssa Rosenzweig	2c74709517	panfrost/midgard: Fix nested/chained if-else An if-else statement is compiled to a conditional branch (from the start to the second block) and an unconditional branch (from the end of the first block to the end of the else). We previously incorrectly computed the block index of the unconditional branch to be exactly one after that of the conditional branch, valid for a single if-else statement but nothing fancier. This patch correctly computes the unconditional branch target, fixing more complex if-else chains. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:06:26 +00:00
Alyssa Rosenzweig	5e55c11a1b	panfrost/midgard: Refactor tag lookahead code Each Midgard instruction is scheduled to a particular instruction type ("tag"). Presumably the hardware prefetches memory based on tag, so it is required to report out the first tag to the command stream and the next tag of a branch target. This procedure was implemented in two separate parts of the compiler (one time with a slight bug relating to empty blocks); this patch refactors to unite the two routines and solve the bug when branching to empty blocks. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:05:59 +00:00
Alyssa Rosenzweig	396eb1440a	panfrost: Implement pantrace (command stream dump) Historically, Panfrost debugging entailed the use of the LD_PRELOADable `panwrap` tool. This setup is a tad fragile; Panfrost can be traced directly without the intermediate layer. pantrace implements the equivalent functionality of panwrap in Panfrost proper, allowing dumps to work regardless of the kernel layer in use. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:03:21 +00:00
Alyssa Rosenzweig	f611782045	panfrost: Add pandecode (command stream debugger) The `panwrap` utility can be LD_PRELOAD'd into a GLES app, intercepting communication between the driver and the kernel. Modern panwrap versions do no processing of their own; instead, they create a trace directory. This directory contains the following files: - control.log: a line-by-line plain text file, denoting important syscalls (mmaps and job submits) along with their arguments - memory_.bin, shader_.bin: binary dumps of mapped memory Together, these files contain enough information to reconstruct the command stream and shaders of (at minimum) a single frame. The `pandecode` utility takes this directory structure as input, reconstructing the mapped memory and using the job submit command as an entrypoint. It then walks the descriptors as the hardware would, parsing and pretty-printing. Its final output is the pretty-printed command stream interleaved with the disassembled shaders, suitable for driver debugging. For instance, the behaviour of two driver versions (one working, one broken) can be compared by diff'ing their decoded logs. pandecode/decode.c was originally a part of `panwrap`; it is the oldest living code in the project. Its history is generally not worth preserving. panwrap itself will continue to live downstream for the foreseeable future, as it is specifically written for the vendor kernel. It is possible, however, to produce equivalent traces directly from Panfrost, bypassing the intermediate wrapping layer for well-behaved drivers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 07:01:48 +00:00
Alyssa Rosenzweig	fb3bbd0c1c	panfrost: Stub out separate stencil functions This is not yet functional, but it resolves a crash in various apps and provides a framework for further work. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-21 06:58:50 +00:00
Marek Olšák	edbd2c1ff5	radeonsi: use SDMA for uploading data through const_uploader v2: use tc.stream_uploader in si buffer_transfer_map if not called from the driver thread Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-02-20 21:04:29 -05:00
Marek Olšák	54f7545cd7	gallium/u_upload_mgr: allow use of FLUSH_EXPLICIT with persistent mappings for radeonsi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-02-20 21:04:29 -05:00
Marek Olšák	dc8a2c139d	gallium/u_threaded: always unmap const_uploader radeonsi will require this. It's a no-op for drivers supporting persistent mappings. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-02-20 21:04:29 -05:00
Marek Olšák	8ef6f68fa5	st/mesa: always unmap the uploader in st_atom_array.c This is a no-op for drivers supporting persistent mappings. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-02-20 21:04:29 -05:00
Jason Ekstrand	1a93fc382b	nir/xfb: Handle compact arrays in gather_xfb_info This makes us properly handle gl_ClipDistance and gl_CullDistance. Fixes: `19064b8c` "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-21 00:08:42 +00:00
Jason Ekstrand	558c314504	nir/xfb: Work in terms of components rather than slots We needed to better handle cases where a chunk of a variable starts at some non-zero location_frac and rolls over into the next slot but may not be more than 4 dwords. For example, if gl_CullDistance is an array of 3 things and has location_frac = 2, it will span across two vec4s but is not, itself, bigger than a vec4. If you ignore the clip/cull special case, it's not allowed to happen for anything else because the only things that can span more than one slot is dvec3 and dvec4 and they're both bigger than a vec4. The current code uses this attrib_slot thing where we count attribute slots and iterate over them. However, that doesn't work in the case above because gl_CullDistance will have an attrib_slot count of 1 even though it does span two slots. We could fix this by adjusting attrib_slot but we already have comp_mask and it's easier to just handle it that way. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-21 00:08:42 +00:00
Jason Ekstrand	4e69fba534	nir: Rewrite lower_clip_cull_distance_arrays to do a lot less lowering Instead of going to all the work of combining them into one array, just make two arrays and use location_frac to colocate them within CLIP0. Then the back-end can sort things out and stack them on top of each other. Thanks to `ef99f4c8`, we also don't need to set compact anymore. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-21 00:08:42 +00:00
Jason Ekstrand	8f0fe71cc5	nir/xfb: Properly align 64-bit values Fixes: `19064b8c` "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-21 00:08:42 +00:00
Jason Ekstrand	30b548fc62	compiler/types: Add a contains_64bit helper Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-21 00:08:42 +00:00
Rob Clark	323958908e	freedreno/a6xx: samplerBuffer fixes Use the 'UNK31' bit (which should probably be called 'BUFFER') for samplerBuffer case, which increases the size of supported buffer texture beyond 2^15 elements. Also need to fix the 2nd coord injected to handle the tex instructions that take integer coords. Fixes dEQP-GLES31.functional.texture.texture_buffer.render.as_fragment_texture.buffer_size_131071 and similar Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	50dd773a2d	freedreno/ir3/a6xx: use ldib for ssbo reads ... instead of isam. It seems like when using isam, plus atomics, we can have the problem of old data being in the texture cache. Plus this way we don't have to load a component at a time. Note that blob still seems to use isam in some cases. I suppose it might be preferable in the case of loading a single component, when atomics are not in the picture (or that the ssbo does not need to otherwise be coherent). Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	c543a2cf6f	freedreno/ir3: sync instr/disasm and add ldib encoding Resync disasm and instr header from envytools, and add ldib encoding. This replaces an opcode from a3xx which was never seen in practice, since that seemed easier than dealing with the same opc # meaning a different thing on a6xx. (Not really sure if 'sti' was actually a real thing, I think it was only seen in fuzzing.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	cadf6def0c	freedreno/ir3/a6xx: fix load_ssbo barrier type. Silly copy/pasta bug, since load_image is actually the same instruction but different barrier class. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	0df0fc28a5	freedreno/ir3: rename put_dst() This was overlooked when it moved to ir3_context.c and ceased to be static.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	7fe9e790e7	freedreno: fix crash w/ masked non-SSA dst Fixes dEQP-GLES3.functional.shaders.indexing.varying_array.vec3_dynamic_write_dynamic_loop_read regression. Fixes: `c1a27ba9ba` freedreno/ir3: HIGH reg w/a for a6xx Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	8c486083d0	freedreno/a6xx: 3d and cube image fixes Fixes dEQP-GLES31.functional.image_load_store.{3d,cube}.store.* and a bunch more Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	97479df8aa	freedreno/ir3: fix crash in compile fail case The variant will be NULL if RA failed. Which isn't ideal, but at least let's not segfault and bring down the rest of the dEQP run with us. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Rob Clark	f5ee8c54ed	freedreno/ir3: fix legalize for vecN inputs The wrmask is handled in regmask_get()/regmask_set(), but it wasn't being propagated from SSA src to dst. So for example, an SSBO read value that is passed in as src2.y component to atomic op, wasn't getting the (sy) flag set. Causing lots of fail. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-20 18:50:08 -05:00
Bas Nieuwenhuizen	688f5e456a	radv: Disable depth clamping even without EXT_depth_range_unrestricted. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-20 23:24:31 +00:00
Bas Nieuwenhuizen	9f7e0523ce	radv: Implement VK_EXT_depth_clip_enable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-20 23:24:31 +00:00
Timothy Arceri	03783253b1	nir: remove non-ssa support from nir_copy_prop() Even in a very basic shader this reduces the time spent in nir_copy_prop() by ~17%. No shader-db changes for radeonsi NIR or i965. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-21 10:18:24 +11:00
Bas Nieuwenhuizen	1ef2855692	radv: Handle clip+cull distances more generally as compact arrays. Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 . That MR keeps the clip and cull arrays split. So we have to handle - compact arrays with location_frac != 0 - VARYING_SLOT_CLIP_DIST1 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-20 22:49:52 +00:00
Eric Anholt	8cfc17bdda	kmsro: Add the rest of the current set of tinydrm drivers. While I haven't tested them all, given that they're all using the same allocation paths and modifiers in the kernel they should be fine to use in the same way. v2: Rebase on other kmsro changes. v3: Skip repeated '[with_gallium_kmsro,' in the meson build. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-20 21:49:41 +00:00
Andrii Simiklit	f4f4ec941e	i965: re-emit index buffer state on a reset option change. It seems we forget to update the index buffer (ib) state, so the IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged, which causes glEnable/glDisable calls for GL_PRIMITIVE_RESTART to be ignored in some cases. The index buffer (ib) state should be re-emitted after the reset option changes to avoid unexpected behavior. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Signed-off-by: Andrii Simiklit <asimiklit.work@gmail.com>	2019-02-20 20:27:56 +02:00
Kenneth Graunke	d6337b59f6	nir: Don't forget if-uses in new nir_opt_dead_cf liveness check Commit `08bfd710a2`. (nir/dead_cf: Stop relying on liveness analysis) introduced a new check that iterated through a SSA def's uses, to see if it's used. But it only checked normal uses, and not uses which are part of an 'if' condition. This led to it thinking more nodes were dead than possible. Fixes Piglit's variable-indexing/tcs-output-array-float-index-wr test (and related tests) with the out-of-tree Iris driver. Fixes: `08bfd710a2` nir/dead_cf: Stop relying on liveness analysis Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 09:44:06 -08:00
Kristian H. Kristensen	b9eed05e7f	freedreno/a6xx: Support MSAA resolve blits on blitter This gets stencil and depth resolves working properly. Fixes: dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8 dEQP-GLES3.functional.fbo.msaa.2_samples.depth24_stencil8 dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8 dEQP-GLES3.functional.fbo.msaa.4_samples.depth24_stencil8 dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_color dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_color Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-20 08:56:21 -08:00
Kristian H. Kristensen	686211f4c9	freedreno/a6xx: Copy stencil as R8_UINT Blitter does support it after all. Previous attempt to use R8_UINT failed because we overwrote the a6xx format in emit_blit_texture(), but some of the later setup still looked at the gallium format. If we overwrite it in the pipe_blit_info before we even call into emit_blit_texture() it works properly. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-20 08:56:21 -08:00
Kristian H. Kristensen	e827ea8c83	freedreno: Update headers Add support for multisampled sources for the blitter. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-20 08:56:21 -08:00
Eric Engestrom	a16c398668	anv: use anv_shader_bin_write_to_blob()'s return value Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 16:40:13 +00:00
Eric Engestrom	d3115f34a6	anv: drop unused imports Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Eric Engestrom	8cbfcab425	anv: make sure the extensions stay sorted Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Eric Engestrom	bc76ce1033	anv: sort vendors extensions after KHR and EXT Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Eric Engestrom	427aa9d154	anv: sort extensions alphabetically Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 14:28:55 +00:00
Tapani Pälli	886cee1f96	anv: refactor error handling in anv_shader_bin_write_to_blob() v2: blob manages error state internally, just return true if errors did not occur (Jason) CID: 1442546 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 15:39:19 +02:00
Carlos Garnacho	30a01cd923	wayland/egl: Ensure EGL surface is resized on DRI update_buffers() Fullscreening and unfullscreening a totem window while playing a video sometimes results in the video subsurface not changing size accordingly. This is also reproducible with epiphany. If a surface gets resized while we have an active back buffer for it, the resized dimensions will neither be applied immediately on the resize callback nor correctly synchronized on update_buffers(), as the (now stale) surface size and currently attached buffer size still match. There are actually 2 things to synchronize here: first, the surface query size might not be updated yet to the wl_egl_window's (i.e. resize_callback happened while there is a back buffer), and second, the wayland buffers would need dropping if the new surface size differs from the currently attached buffer. These are done in separate steps now. https://bugzilla.redhat.com/show_bug.cgi?id=1650929 https://bugs.freedesktop.org/show_bug.cgi?id=109594 Fixes: `a9fb331ea7` ("wayland/egl: update surface size on window resize") Signed-off-by: Carlos Garnacho <carlosg@gnome.org> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Bastien Nocera <hadess@hadess.net> Tested-by: Denys Kostin <denys.kostin@globallogic.com>	2019-02-20 12:04:33 +01:00
Lionel Landwerlin	f509213675	anv: implement VK_EXT_depth_clip_enable A new extension allowing the user to explicitly specify the clipping behavior. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-20 09:57:58 +00:00
Lionel Landwerlin	fa4e103c32	vulkan: Update the XML and headers to 1.1.101	2019-02-20 09:57:58 +00:00
Samuel Iglesias Gonsálvez	63a919a3ce	isl: remove the cache line size alignment requirement The cacheline size was a requirement for using the BLT engine, which we don't use anymore except for a few things on old HW, so we drop it. Fixes CTS's CL#3500 test: dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r8g8b8_unorm Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-20 08:28:31 +01:00
Bas Nieuwenhuizen	572854e706	radv: Clean up a bunch of compiler warnings. Random unused vars. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-20 03:21:09 +01:00
Bas Nieuwenhuizen	7631feaa00	radv: Sync ETC2 whitelisted devices. Fixes: `4bb6c49375` "radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9." Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-02-20 02:55:41 +01:00
Timothy Arceri	3d7611e9a6	st/nir: use NIR for asm programs This uses prog_to_nir to translate ARB assembly programs to NIR. Co-authored by Tim Arceri, Dave Airlie, and Ken Graunke: - [Tim Arceri]: original patch - [Dave Airlie]: fix crashes with parameter names - [Ken Graunke]: - Rebase on SCALAR_ISA cap, lower wpos_ytransform too. - Rebase on streamout fixes. - Lower system values for fragcoord support. - Don't try to use prog_to_nir for ATI_fragment_shader programs. - Create TGSI for fixed-function or ARB vertex shaders even if the driver prefers NIR, so we can create draw module shaders for feedback/select emulation, which rely on TGSI. Tested on: - iris (Intel Skylake/Kabylake): Piglit & GL CTS - Ken Graunke - radeonsi (AMD Vega 64): Piglit - Ken Graunke - vc4/v3d - Piglit - Eric Anholt - freedreno - dEQP - Kristian Høgsberg Fixes lit_degenerate_case on vc4 and v3d, and vp-address-01, vp-arl-constant-array-huge-offset-neg, and vp-arl-neg-array on v3d. No Piglit regressions on radeonsi; no dEQP regressions on freedreno. Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 15:56:26 -08:00
Kenneth Graunke	3b4929ec6e	st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders. Even if the driver wants to use NIR shaders, we may need to have TGSI tokens for creating draw module vertex shaders for the feedback/select render modes. So...if the st_vertex_program has any TGSI...copy it to the variant. Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 15:56:19 -08:00
Kenneth Graunke	ba7519ca36	radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL leaves it undefined). Performing fpow lowering in NIR would break this behavior, preventing us from using prog_to_nir. According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>, which presumably does a zero-wins multiply. Lowering in NIR results in a non-legacy multiply, where: pow(0, 0) = 2^(log2(0) * 0) = 2^(-INF * 0) = 2^(-NaN) = -NaN which isn't the desired result. This reverts: - commit `d6b7539206` (ac/nir: remove emission of nir_op_fpow) - commit `22430224fe` (radeonsi/nir: enable lowering of fpow) and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir after enabling prog_to_nir in st/mesa later in this series. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 15:56:19 -08:00
Timothy Arceri	9c4d5926aa	radeonsi/nir: set shader_buffers_declared properly Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-20 10:46:19 +11:00
Timothy Arceri	94a3df62d7	radeonsi/nir: set colors_read properly shader-db results for VEGA64: Totals from affected shaders: SGPRS: 1976 -> 1976 (0.00 %) VGPRS: 1240 -> 1144 (-7.74 %) Spilled SGPRs: 145 -> 145 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 34632 -> 34604 (-0.08 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 261 -> 285 (9.20 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-20 10:46:19 +11:00
Timothy Arceri	05cc1dd764	radeonsi/nir: set input_usage_mask properly shader-db results for VEGA64: Totals from affected shaders: SGPRS: 791528 -> 792616 (0.14 %) VGPRS: 421624 -> 410784 (-2.57 %) Spilled SGPRs: 1639 -> 1674 (2.14 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 16103516 -> 16063696 (-0.25 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 136307 -> 137830 (1.12 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-20 10:46:19 +11:00
Timur Kristóf	9429bcc4b0	radeonsi/nir: Use uniform location when calculating const_file_max. The nine state tracker can produce NIR uniform variables whose location is explicitly set. radeonsi did not take that into account when calculating const_file_max, resulting in rendering glitches. This patch fixes that. Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-20 10:37:47 +11:00
Mario Kleiner	afb15d14ca	drirc: Add sddm-greeter to adaptive_sync blacklist. This is the sddm login screen. Fixes: `a9c36dbf9c` ("drirc: Initial blacklist for adaptive sync") Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-02-19 18:03:05 -05:00
Marek Olšák	bff8da6c59	driconf: add Civ6Sub executable for Civilization 6 I'm getting Civ6Sub instead of Civ6. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 17:59:17 -05:00
Marek Olšák	ae21bdf47c	radeonsi: always enable NIR for Civilization 6 to fix corruption Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 17:59:17 -05:00
Marek Olšák	ccbfe44e5f	radeonsi: add driconf option radeonsi_enable_nir Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 17:59:17 -05:00
Kenneth Graunke	f9c835eb56	mesa: Align doubles to a 64-bit starting boundary, even if packing. In the new Intel Iris driver, I am using Tim's new packed uniform storage system. It works great, with one caveat: our scalar compiler backend assumes that uniform offsets will be aligned to the underlying data type. For example, doubles must be 64-bit aligned, floats 32-bit, half-floats 16-bit, and so on. It does not need any other padding. Currently, _mesa_add_parameter aligns everything to 32-bit offsets, creating doubles that have an unaligned offset. This patch alters that code to align doubles to 64-bit offsets. This may be slightly less optimal for drivers which can support full packing, and allow reads from unaligned offsets at no penalty. We could make this extra alignment optional. However, it only comes into play when intermixing double and single precision uniforms. Doubles are already not too common, and intermixed values (floats then doubles) is probably even less common. At most, we burn a single 32-bit slot to the alignment, which is not that expensive. So, it doesn't seem worthwhile to add the extra complexity. Eventually, we'll likely want to update this code to allow half-float values to be packed tighter than 32-bit offsets. At that point, we'll probably want to revisit what drivers ultimately want, and add options. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 13:26:58 -08:00
Kenneth Graunke	3c2c6bd1c7	compiler: Make is_64bit(GL_*) helper more broadly available I'd like to use this in the prog_parameter.c code, so I need to move it into C, make it non-static, and so on. This probably isn't the ideal place for it, but I couldn't think of a better one. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 13:26:58 -08:00
Eric Engestrom	daf8ada08d	gitlab-ci: automatically run the CI on pushes to `ci/` branches Last commit limited the CI to master and MRs, but to avoid having to manually trigger CI runs, let's add a 3rd, automatic way: by pushing to a branch named `ci/` (or `ci-*` or just `ci`) (which you can delete afterwards, the pipeline results will remain). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-19 16:57:32 +00:00
Eric Engestrom	861ade7042	gitlab-ci: limit the automatic CI to master and MRs Runs on random other branches (stables RCs, personal forks) can still be triggered manually via the web interface, or an app using the API. This should massively help with the current voracious state of our CI. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-19 16:57:28 +00:00
Eric Engestrom	f84f833981	tegra/autotools: add missing libdrm cflags Fixes: `f1374805a8` "drm-uapi: use local files, not system libdrm" Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109647 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-19 13:29:05 +00:00
Eric Engestrom	b787403a21	tegra/meson: add missing dep_libdrm Fixes: `f1374805a8` "drm-uapi: use local files, not system libdrm" Bug: https://bugs.freedesktop.org/show_bug.cgi?id=109645 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-19 13:29:00 +00:00
Rhys Perry	238730daef	ac/nir: implement half-float nir_op_ldexp Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:04:46 +00:00
Rhys Perry	6971e8d342	ac/nir: implement half-float nir_op_frsq v2: don't use ac_get_onef() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:04:41 +00:00
Rhys Perry	2038aec22a	ac/nir: implement half-float nir_op_frcp v2: don't use ac_get_onef() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:04:35 +00:00
Rhys Perry	4261edc067	ac/nir: make ac_build_fdiv support 16-bit floats v2: don't use ac_get_onef() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:04:29 +00:00
Rhys Perry	6790b3a8db	ac/nir: make ac_build_isign work on all bit sizes v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:04:20 +00:00
Rhys Perry	bbbfdef683	ac/nir: make ac_build_clamp work on all bit sizes v2: don't use ac_get_zerof() and ac_get_onef() v3: rename "intr" to "name" Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:03:58 +00:00
Rhys Perry	7e5004e30a	ac/nir: fix 64-bit nir_op_f2f16_rtz Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:03:44 +00:00
Rhys Perry	c4ea20c0a0	ac/nir: implement 8-bit nir_load_const_instr Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:03:33 +00:00
Rhys Perry	0ca550e01a	radv: ensure export arguments are always float So that the signature is correct and consistent, the inputs to an export intrinsic should always be 32-bit floats. This and the previous commit fix a large number of crashes from dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_* tests Fixes: `b722b29f10` ('radv: add support for 16bit input/output') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:03:22 +00:00
Rhys Perry	64065aa504	radv: bitcast 16-bit outputs to integers 16-bit outputs are stored as 16-bit floats in the outputs array, so they have to be bitcast. Fixes: `b722b29f10` ('radv: add support for 16bit input/output') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-19 11:03:18 +00:00
Eric Engestrom	23b485c920	gitlab-ci: use ccache to speed up builds Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-19 10:09:51 +00:00
Eric Anholt	dbe3af67a4	v3d: Move i2b and f2b support into emit_comparison. This lets us save a resolve to NIR true/false for ifs and discard_if. No change in shader-db.	2019-02-18 18:18:37 -08:00
Eric Anholt	0bba9c8489	v3d: Emit a simpler negate for the iabs implementation. One program affected in my shader-db. instructions in affected programs: 110 -> 108 (-1.82%)	2019-02-18 18:13:09 -08:00
Eric Anholt	1a775d43c9	v3d: Delay emitting ldvpm on V3D 4.x until it's actually used. For V3D 3.x, we emitted the ldvpms all at the top so that we didn't need to do VPM setup when the load_inputs are out of order. For V3D 4.x, we can reduce register pressure by delaying our loads until they're actually needed. This also avoids a bunch of silly MOVs in the pre-opt VIR dump. total instructions in shared programs: 6421415 -> 6419933 (-0.02%) total uniforms in shared programs: 2393139 -> 2393140 (<.01%) total threads in shared programs: 153864 -> 153906 (0.03%)	2019-02-18 18:09:07 -08:00
Eric Anholt	5a84d46896	v3d: Stop tracking num_inputs for VPM loads. It's unused in the VS (since we need vattr_sizes[] anyway), so move it to FS prog data.	2019-02-18 18:09:07 -08:00
Eric Anholt	581eba072d	v3d: Add a function to describe what the c->execute.file check means. This is what pointed out that we were misusing the check for last_thrsw in the previous commit.	2019-02-18 18:09:07 -08:00
Eric Anholt	441294962c	v3d: Fix the check for "is the last thrsw inside control flow" The execute.file check used to be good enough, until I stopped setting up the execute mask for uniform ifs. No known tests fixed, noticed while doing a refactor. Fixes: `0805060573` ("v3d: Handle dynamically uniform IF statements with uniform control flow.")	2019-02-18 18:09:07 -08:00
Eric Anholt	07d5b5a972	v3d: Fix f2b32 behavior. Now that we don't have the vir_PF() magic, it's obvious that we were doing the wrong thing for f2b32 by allowing -0.0 to produce true instead of false.	2019-02-18 18:09:07 -08:00
Eric Anholt	3022b4bd82	v3d: Kill off vir_PF(), which is hard to use right. You were allowed to pass in any old temp so that you could hopefully fold the PF up into the def of the temp. If we couldn't find one, it implicitly generated a MOV(nop, reg). However, that PF could have different behavior depending on whether the def being folded into was a float or int opcode, which the caller doesn't necessarily control. Due to the fragility of the function, just switch all callers over to vir_set_pf(). This also encourages the callers to use a _dest call for the inst they're putting the PF on, eliminating a bunch of temps in the pre-optimization VIR. shader-db says the change is in the noise: total instructions in shared programs: 6226247 -> 6227184 (0.02%) instructions in affected programs: 851068 -> 852005 (0.11%)	2019-02-18 18:09:06 -08:00
Eric Anholt	6186a8d44e	v3d: Do bool-to-cond for discard_if as well. Turns this minimal conditional discard (glsl-fs-discard-01.shader_test): 0x3de0b086c5fe9000 fcmp.pushn -, r1, r5; mov r2, 0 0x3dec3086bbfc001f nop ; mov.ifa r2, -1 0x3c047186bbe80000 nop ; mov.pushz -, r2 0x3dea3186ba837000 setmsf.ifna -, 0 ; nop into: 0x3c00b186c582a000 fcmp.pushn -, r2, r5; nop 0x3de83186ba837000 setmsf.ifa -, 0 ; nop total instructions in shared programs: 6229820 -> 6226247 (-0.06%)	2019-02-18 18:09:06 -08:00
Eric Anholt	718eef62cb	v3d: Refactor bcsel and if condition handling. Both were doing the same thing to try to get a condition to predicate on. Noticed when I wanted to do this for discard_if as well. No change in shader-db.	2019-02-18 18:09:06 -08:00
Eric Anholt	4586f9f902	v3d: Add a helper function for getting a nop register. Just a little refactor to explain what's going on with QFILE_NULL.	2019-02-18 18:09:06 -08:00
Eric Anholt	339155122b	v3d: Drop our hand-lowered nir_op_ffract. The NIR lowering works fine, though it causes some slight noise due to what looks like choices about propagating constants up multiply chains changing. total instructions in shared programs: 6229671 -> 6229820 (<.01%) total uniforms in shared programs: 2312171 -> 2312324 (<.01%)	2019-02-18 18:09:06 -08:00
Eric Anholt	16f5085490	v3d: Drop a perf note about merging unpack_half_*, which has been implemented. This is handled with copy-propagation now.	2019-02-18 18:09:06 -08:00
Eric Anholt	146e432b49	v3d: Fix incorrect flagging of ldtmu as writing r4 on v3d 4.x. Fixes some stalls in 3DMMES's main vertex shader. total instructions in shared programs: 6280751 -> 6211270 (-1.11%) instructions in affected programs: 2935050 -> 2865569 (-2.37%)	2019-02-18 18:09:06 -08:00
Eric Anholt	cd5e0b2729	v3d: Use the early_fragment_tests flag for the shader's disable-EZ field. Apparently we need disable-EZ flagged, not just "does Z writes". Fixes dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo on 7278, even though it passed in simulation. Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: `051a41d3d5` ("v3d: Add support for the early_fragment_tests flag.")	2019-02-18 18:09:06 -08:00
Eric Anholt	332b969c4e	v3d: Sync indirect draws on the last rendering. Fixes intermittent fails in dEQP-GLES31.functional.draw_indirect.compute_interop.separate.drawelements_compute_cmd_and_data_and_indices and others (particularly when run as part of a CTS run)	2019-02-18 18:09:06 -08:00
Eric Anholt	32f16b0b1e	v3d: Clear the GMP on initialization of the simulator. Otherwise, we might have pages accessible that shouldn't be and miss out on errors. This is unlikely for most tests since v3d_hw_get_mem() is big enough that it'll be a freshly zeroed mmap, but if screens are destroyed and recreated then we'd be reusing the old v3d_hw_get_mem() contents.	2019-02-18 18:09:06 -08:00
Emil Velikov	ba652394a3	docs: update calendar, add news item and link release notes for 18.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-18 18:38:14 +00:00
Emil Velikov	d7108dac73	docs: add sha256 checksums for 18.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `bfb5bdaa97`)	2019-02-18 18:36:23 +00:00
Emil Velikov	a1ccff4aaf	docs: add release notes for 18.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b26488dead`)	2019-02-18 18:36:21 +00:00
Ilia Mirkin	57441af8bf	i965: always enable EXT_float_blend From the table in isl_format.c, it appears that all generations support blending on 32-bit float surfaces. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-18 12:13:54 -05:00
Ilia Mirkin	9fec653093	st/mesa: enable GL_EXT_float_blend when possible If the driver supports PIPE_BIND_BLENDABLE on RGBA32F, flip EXT_float_blend on (which will affect ES3 contexts). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-18 12:13:54 -05:00
Ilia Mirkin	070a5e5d92	mesa: add explicit enable for EXT_float_blend, and error condition If EXT_float_blend is not supported, error out on blending of FP32 attachments in an ES2 context. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-18 12:13:54 -05:00
Samuel Pitoiset	47616810ed	radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled This version is better and safer. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-18 18:06:07 +01:00
Rob Clark	d6c43cceff	freedreno/ir3: handle quirky atomic dst for a6xx The new encoding returns a value via the 2nd src. The legalize pass needs to be aware of this to set the correct needs_sy flag, otherwise we can, in cases where the atomic dst is not used, overwrite the register that hardware will asynchronously load result into without (sy) flag, so it gets clobbered by the atomic result. This fixes a whole lot of rando ssbo+atomic fails, like dEQP-GLES31.functional.ssbo.layout.single_basic_type.packed.highp_vec4. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-18 12:01:36 -05:00
Rob Clark	28fc6733cd	freedreno/a6xx: fix helper_invocation (sampler mask/id) Since gl_HelperInvocation is lowered to: !((1 << sample_id) & sample_mask_in) Not setting these enable bits was causing it to be broken. (And probably a bunch of other stuff too.) Fixes dEQP-GLES31.functional.shaders.helper_invocation.* Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-18 10:37:54 -05:00
Samuel Pitoiset	32ab7a59bb	radv: remove unused variable in gather_push_constant_info() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-18 13:30:16 +01:00
Lionel Landwerlin	8c87d029bc	i965: scale factor changes should trigger recompile Found by inspection. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3da858a6b9` ("intel/compiler: add scale_factors to sampler_prog_key_data") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-18 12:18:13 +00:00
Samuel Pitoiset	0d8f096293	radv: write the alpha channel of MRT0 when alpha coverage is enabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-18 12:14:22 +01:00
Samuel Pitoiset	2cf5433b99	ac: use new LLVM 8 intrinsic when loading 16-bit values Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-18 12:14:20 +01:00
Samuel Pitoiset	f0223143a8	ac: add ac_build_llvm8_tbuffer_load() helper It uses the new LLVM intrinsics. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-18 12:14:17 +01:00
Tapani Pälli	9762a9f893	mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment This fixes invalid access to Attachment array which would occur if caller would exceed MaxColorAttachments. In practice this should not ever happen because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be valid and InvalidateFramebuffer will error out before but this should make coverity happy. v2: const, remove _EXT (Ian) CID: 1442559 Fixes: `0c42b5f3cb` "mesa: wire up InvalidateFramebuffer" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-18 07:51:55 +02:00
Alyssa Rosenzweig	2c6a7fbeb7	panfrost: Fix clipping region Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:13:50 +00:00
Alyssa Rosenzweig	fa1b36ddc2	panfrost: Preserve w sign in perspective division This fixes issues where polygons that should be culled (due to negative w, for instance) may not be. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:13:34 +00:00
Alyssa Rosenzweig	49985cebea	panfrost: Cleanup mali_viewport (clipping) code Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:13:03 +00:00
Alyssa Rosenzweig	a94463732a	panfrost: Swap order of tiled texture (de)alloc Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:10:33 +00:00
Alyssa Rosenzweig	4a4ed53c01	panfrost: Free imported BOs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:10:06 +00:00
Alyssa Rosenzweig	b5a01296f4	panfrost: Fix various leaks unmapping resources v2: Don't check for NULL before free() Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:09:41 +00:00
Kenneth Graunke	535251487b	nir: Don't reassociate add/mul chains containing only constants The idea here is to reassociate a * (b * c) into (a * c) * b, when b is a non-constant value, but a and c are constants, allowing them to be combined. But nothing was enforcing that 'b' must be non-constant, which meant that running opt_algebraic in a loop would never terminate if the IR contained non-folded constant expressions like 256 * 0.5 * 2. Normally, we call constant folding in such a loop too, but IMO it's better for nir_opt_algebraic to be robust and not rely on that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581 Fixes: `32e266a9a5` i965: Compile fp64 funcs only if we do not have 64-bit hardware support Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-16 23:36:14 -08:00
Chris Wilson	e9882b879b	i965: Assert the execobject handles match for this device Object handles are local to the device fd, so double check we are not mixing together objects from multiple screens on execbuf submission. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-16 23:35:29 -08:00
Rob Clark	99b90ecd35	freedreno/a6xx: cache flush harder Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	1af0c5d320	freedreno/a6xx: compute support Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	5118dcf8c3	freedreno/a6xx: image/ssbo state emit Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	2183d9cff7	freedreno/a6xx: border-color offset helper Soon we'll need this logic to deal w/ image/SSBO case, so split out a helper rather than duplicate the logic. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	c1a27ba9ba	freedreno/ir3: HIGH reg w/a for a6xx It seems like some instructions (noticed this w/ cat3) cannot read HIGH regs. cat1 (mov/cov) can, and possibly some/all of cat2. The blob seems to stick w/ an extra mov into low regs. So let's do the same. This fixes WGID on a6xx, which unsurprisingly is related to a lot of deqp compute fails. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	947848524d	freedreno/ir3: add a6xx+ SSBO/image support Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	b46d5b8a84	freedreno/ir3: add a6xx instruction encoding For the handful of instructions that use a new encoding. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	2e0ea3f09c	freedreno/ir3: add image/ssbo <-> ibo/tex mapping Images and SSBOs don't map directly to the hw. They end up being part texture and part something else. Starting with a6xx, the hack used for a5xx to smash the image tex state into hw texture state starting from MAX counting down won't work, because we start using tex state also for SSBO read. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	75f3a5245e	freedreno/ir3: fix ncomp for _store_image() src Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	feee3050d3	freedreno/ir3: split out a4xx+ instructions Note that image/ssbo support is currently only implemented for a5xx. But the instruction encoding is the same for a4xx. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	42af0640f6	freedreno/ir3: split out image helpers Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	aefdb9bed2	freedreno/a6xx: clean up some open-coded bits Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	b51de44dea	freedreno/a6xx: move stream-out emit to helper Split out of the main fd6_emit() code, since it was already getting to be a pretty giant function. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:26:14 -05:00
Rob Clark	c0d6be11d6	freedreno/ir3: fix varying packing vs. tex sharp edge We probably need to rethink how we detect which instruction first defines higher register classes. But for now, this at least fixes the symptom. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:26:14 -05:00
Samuel Pitoiset	52bdb043af	radv: fix invalid element type when filling vertex input default values The elements added into a vector should have the same type as the first one, otherwise this hits an assertion in LLVM. Fixes: `4b3549c084` ("radv: reduce the number of loaded channels for vertex input fetches") reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-16 15:33:18 +01:00
Eleni Maria Stea	7188e2ba15	i965: Removed the field etc_format from the struct intel_mipmap_tree After the previous changes to emulate the ETC/EAC formats using the secondary shadow miptree, the etc_format field of the intel_mipmap_tree struct became redundant and the remaining check that used it has been replaced. (Nanley Chery) Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-02-15 15:54:41 -08:00
Eleni Maria Stea	248f2e7888	i965: Enabled the OES_copy_image extension on Gen 7 GPUs OES_copy_image extension was disabled on Gen7 due to the lack of support for ETC2 images. Enabled it back. (Kenneth Graunke) v2: - Removed the blank lines in the comments above OES_copy_image and OES_texture_view extensions in intel_extensions.c (Nanley Chery) Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-02-15 15:54:41 -08:00
Eleni Maria Stea	db0c379c06	i965: Fixed the CopyImageSubData for ETC2 on Gen < 8 For CopyImageSubData to copy the data during the 1st draw call, we need to update the shadow tree right before the rendering. v2: - Added assertion that the miptree doesn't need update at the time we update the texture surface. (Nanley Chery) v3: - As we now update the tree before the rendering we don't need to copy the data during the unmap anymore. Removed the unnecessary update from the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery) v4: - Fixed unrelated empty line removal (Nanley Chery) - As now the intel_upate_etc_shadow of intel_mipmap_tree.c is only called inside its following function, we don't need to declare it at the top of the file anymore. (Nanley Chery) Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-02-15 15:54:41 -08:00
Eleni Maria Stea	d8eb7287fe	i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees. GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the compressed EAC/ETC2 images to non-compressed RGBA images. When GetCompressed* functions were called, the pixels were returned in this RGBA format and not the compressed format that was expected. Trying to fix this problem, we use a secondary shadow miptree to store the decompressed data for the rendering and the main miptree to store the compressed for the Get functions to work. Each time that the main miptree is written with compressed data, we decompress them to RGB and update the shadow. Then we use the shadow for rendering. v2: - Fixes in the commit message (Nanley Chery) - Reversed the changes in brw_get_texture_swizzle and swapped the b, g values at the time that we decompress the data in the function: intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) - Simplified the format checks in the miptree_create function of the intel_mipmap_tree.c and reserved the call of the intel_lower_compressed_format for the case that we are faking the ETC support (Nanley Chery) - Removed the check for the auxiliary usage for the shadow miptree at creation (miptree_create of intel_mipmap_tree.c) as we won't use auxiliary buffers with these types of trees (Nanley Chery) - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and removed the unecessary checks (Nanley Chery) - Fixed an unrelated indentation change (Nanley Chery) - Modified the function intel_miptree_finish_write to set the mt->shadow_needs_update to true to catch all the cases when we need to update the miptree (Nanley Chery) - In order to update the shadow miptree during the unmap of the main and always map the main (Nanley Chery) the following change was necessary: Splitted the previous update function that was updating all the mipmap levels and use two functions instead: one that updates one level and one that updates all 
of them. Used the first during unmap and the second before the rendering. - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which miptree should be mapped each time and reversed all the changes in the higher level texture functions that upload data to textures as they aren't needed anymore. - Replaced the boolean needs_fake_etc with an inline function that checks when we need to fake the ETC compression (Nanley Chery) - Removed the initialization of the strides in the update function as the values will be overwritten by the intel_miptree_map call (Nanley Chery) - Used minify instead of division in the new update function intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley Chery) - Removed the depth from the calculation of the number of slices in the new update function (intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c) as we don't need to support 3D ETC images. (Nanley Chery) v3: - Renamed the rgba_fmt in function miptree_create (intel_mipmap_tree.c) to decomp_format as the format is not always in rgba order. (Nanley Chery) - Documented the new usage for the shadow miptree in the comment above the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley Chery) - Removed the redundant flags from the mapping of the miptrees in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) - Fixed the switch from surface's logical level to physical level in the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery) - Excluded the Baytrail GPUs from the check for the ETC emulation as they support the ETC formats natively. (Nanley Chery) - Simplified the check if the format is BGRA in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) v4: - Removed the functions intel_miptree_(map\|unmap)_etc and the check if we need to call them as with the new changes, they became unreachable. 
(Nanley Chery) - We'd rather calculate the level width and height using the shadow miptree instead of the main in intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery) - Fixed the format in the mt_surface_usage, set at the miptree creation, in miptree_create of intel_mipmap_tree.c (Nanley Chery) v5: - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery) - Update the flag shadow_needs_update outside the function intel_miptree_update_etc_shadow (Nanley Chery) - Fixed indentation error (Nanley Chery) v6: - Fixed typo in commit message (Nanley Chery) - Simplified the assignment of the mt_fmt in the miptree_create of the intel_mipmap_tree.c (Nanley Chery) - Combined declarations and assignments where it was possible in the intel_miptree_update_etc_shadow and intel_miptree_update_etc_shadow_levels of the intel_mipmap_tree.c (Nanley Chery) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-02-15 15:54:41 -08:00
Nanley Chery	c6dada70f0	i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_* Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea <estea@igalia.com>	2019-02-15 15:54:41 -08:00
Timothy Arceri	a801196ec9	nir: remove simple dead if detection from nir_opt_dead_cf() This was probably useful when it was first written, however it looks to be no longer necessary. As far as I can tell these days dce is smart enough to remove useless instructions from if branches. Once this is done nir_opt_peephole_select() will end up removing the empty if. Removing this support reduces the dolphin uber shader compilation time spent in nir_opt_dead_cf() by a little over 7x. No shader-db changes on i965 or radeonsi. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-02-16 10:45:31 +11:00
Alok Hota	f695e43354	swr/rast: Add translation support to streamout Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:29 -06:00
Alok Hota	a7fa0cc0a5	swr/rast: simdlib cleanup, clipper stack space fixes Reduce stack space used by clipper, which had led to crashes in some versions of MSVC Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:23 -06:00
Alok Hota	f9c29a301a	swr/rast: convert DWORD->uint32_t, QWORD->uint64_t Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:19 -06:00
Alok Hota	c503b58878	swr/rast: Refactor scratch space variable names Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:14 -06:00
Alok Hota	0b4db43705	swr/rast: FP consistency between POSH/RENDER pipes - Ensure all threads have optimal floating-point control state - Disable auto-generation of fused FP ops for VERTEX shader stage - Disable "fast" FP ops for VERTEX shader stage Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:09 -06:00
Alok Hota	dc7b3c95a4	swr/rast: Move knob defaults to generated cpp file Reduces amount of compile churn when testing different default values Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:04 -06:00
Alok Hota	05e4ff33f5	swr/rast: Flip BitScanReverse index calculation The intrinsic returns the number of leading zeros, not the bit number of the first nonzero, so just flip it based on the mask size Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:53:58 -06:00
Alok Hota	ae400a9b11	swr/rast: Correctly align 64-byte spills/fills Fixes crashes on some compute shaders when running on AVX512 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:53:54 -06:00
Alok Hota	78bab66479	swr/rast: Disable use of __forceinline by default - Was not useful to inline in release builds - FORCEINLINE can be used if absolutely necessary Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:52:51 -06:00
Alok Hota	20d5c88760	swr/rast: Convert system memory pointers to gfxptr_t Fulfills an unused internal interface Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:52:32 -06:00
Bas Nieuwenhuizen	4b03a19a0b	radv: Use correct num formats to detect whether we should use 1.0 or 1. Normalized and scaled formats also return floats. Fixes: `4b3549c084` ("radv: reduce the number of loaded channels for vertex input fetches") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-15 20:24:16 +00:00
Ian Romanick	979b43b347	nir/algebraic: Simplify comparison with sequential integers starting with 0 All of the affected shaders are Unreal4 demos. All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15437170 -> 15437001 (<.01%) instructions in affected programs: 21536 -> 21367 (-0.78%) helped: 43 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.93 x̃: 4 helped stats (rel) min: 0.68% max: 1.01% x̄: 0.80% x̃: 0.80% 95% mean confidence interval for instructions value: -4.07 -3.79 95% mean confidence interval for instructions %-change: -0.83% -0.77% Instructions are helped. total cycles in shared programs: 383007896 -> 383007378 (<.01%) cycles in affected programs: 158640 -> 158122 (-0.33%) helped: 38 HURT: 4 helped stats (abs) min: 1 max: 48 x̄: 13.89 x̃: 6 helped stats (rel) min: 0.03% max: 1.01% x̄: 0.33% x̃: 0.19% HURT stats (abs) min: 2 max: 3 x̄: 2.50 x̃: 2 HURT stats (rel) min: 0.06% max: 0.09% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -16.90 -7.77 95% mean confidence interval for cycles %-change: -0.39% -0.19% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8213746 -> 8213745 (<.01%) instructions in affected programs: 127 -> 126 (-0.79%) helped: 1 HURT: 0 total cycles in shared programs: 187734146 -> 187734144 (<.01%) cycles in affected programs: 2132 -> 2130 (-0.09%) helped: 1 HURT: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-15 11:11:02 -08:00
Ian Romanick	ad05920258	nir/algebraic: Convert some f2u to f2i Section 5.4.1 (Conversion and Scalar Constructors) of the GLSL 4.60 spec says: It is undefined to convert a negative floating-point value to an uint. Assuming that (uint)some_float behaves like (uint)(int)some_float allows some optimizations in the i965 backend to proceed. This basically undoes the small amount of damage done by "intel/compiler: Avoid propagating inequality cmods if types are different". v2: Replicate part of the commit message as a comment in the code. Suggested by Jason. shader-db results compairing before "intel/compiler: Avoid propagating inequality cmods if types are different" and after this commit: Skylake total cycles in shared programs: 383007996 -> 383007896 (<.01%) cycles in affected programs: 85208 -> 85108 (-0.12%) helped: 13 HURT: 8 helped stats (abs) min: 2 max: 26 x̄: 10.77 x̃: 6 helped stats (rel) min: 0.09% max: 0.65% x̄: 0.28% x̃: 0.14% HURT stats (abs) min: 2 max: 12 x̄: 5.00 x̃: 3 HURT stats (rel) min: 0.04% max: 0.32% x̄: 0.12% x̃: 0.07% 95% mean confidence interval for cycles value: -9.31 -0.21 95% mean confidence interval for cycles %-change: -0.24% <.01% Cycles are helped. Broadwell total cycles in shared programs: 415251194 -> 415251370 (<.01%) cycles in affected programs: 83750 -> 83926 (0.21%) helped: 7 HURT: 13 helped stats (abs) min: 10 max: 12 x̄: 11.43 x̃: 12 helped stats (rel) min: 0.30% max: 0.30% x̄: 0.30% x̃: 0.30% HURT stats (abs) min: 2 max: 36 x̄: 19.69 x̃: 22 HURT stats (rel) min: 0.05% max: 0.89% x̄: 0.44% x̃: 0.47% 95% mean confidence interval for cycles value: 0.76 16.84 95% mean confidence interval for cycles %-change: <.01% 0.37% Inconclusive result (%-change mean confidence interval includes 0). 
Haswell total instructions in shared programs: 13823885 -> 13823886 (<.01%) instructions in affected programs: 2249 -> 2250 (0.04%) helped: 0 HURT: 1 total cycles in shared programs: 390094243 -> 390094001 (<.01%) cycles in affected programs: 85640 -> 85398 (-0.28%) helped: 15 HURT: 6 helped stats (abs) min: 4 max: 26 x̄: 18.53 x̃: 18 helped stats (rel) min: 0.09% max: 0.66% x̄: 0.47% x̃: 0.42% HURT stats (abs) min: 2 max: 14 x̄: 6.00 x̃: 2 HURT stats (rel) min: 0.04% max: 0.37% x̄: 0.15% x̃: 0.04% 95% mean confidence interval for cycles value: -17.36 -5.69 95% mean confidence interval for cycles %-change: -0.44% -0.14% Cycles are helped. Ivy Bridge total cycles in shared programs: 180986448 -> 180986552 (<.01%) cycles in affected programs: 34835 -> 34939 (0.30%) helped: 0 HURT: 10 HURT stats (abs) min: 2 max: 18 x̄: 10.40 x̃: 10 HURT stats (rel) min: 0.06% max: 0.36% x̄: 0.28% x̃: 0.30% 95% mean confidence interval for cycles value: 4.67 16.13 95% mean confidence interval for cycles %-change: 0.20% 0.35% Cycles are HURT. Sandy Bridge total cycles in shared programs: 154603969 -> 154603970 (<.01%) cycles in affected programs: 171514 -> 171515 (<.01%) helped: 25 HURT: 14 helped stats (abs) min: 1 max: 4 x̄: 1.80 x̃: 1 helped stats (rel) min: 0.02% max: 0.10% x̄: 0.04% x̃: 0.04% HURT stats (abs) min: 1 max: 8 x̄: 3.29 x̃: 3 HURT stats (rel) min: 0.03% max: 0.28% x̄: 0.10% x̃: 0.11% 95% mean confidence interval for cycles value: -0.91 0.96 95% mean confidence interval for cycles %-change: -0.02% 0.04% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-15 11:11:02 -08:00
Matt Turner	ac21dd4aee	intel/compiler/test: Add unit test for mismatched signedness comparison v2 (idr): Move adding the test to after adding the fix. Reordering the two commits prevents possible headaches for git-bisect with scripts that always do 'ninja check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-15 11:11:02 -08:00
Matt Turner	2dff9a66b6	intel/compiler: Avoid propagating inequality cmods if types are different v2: Fix silly bug in logic. s/\|\|/&&/ All but one of the affected shaders is in an Unreal4 demo. The other is in Tomb Raider. All of the cases that Ian investigated appear to be sequences like the following if (int(uint(some_float)) < 0) /* other relations too */ ... At least in Tomb Raider, it's not obvious that this sequence came from the original shader. In some of the Unreal demos, the shader contains code like if (int(uint(textureLod(...))) > 0) ... which explicitly generates the offending sequence. All Gen6+ platforms had similar results (Skylake shown): total instructions in shared programs: 15437170 -> 15437187 (<.01%) instructions in affected programs: 4492 -> 4509 (0.38%) helped: 0 HURT: 17 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.73% x̄: 0.66% x̃: 0.73% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.57% 0.75% Instructions are HURT. total cycles in shared programs: 383007996 -> 383007992 (<.01%) cycles in affected programs: 20542 -> 20538 (-0.02%) helped: 6 HURT: 7 helped stats (abs) min: 2 max: 6 x̄: 5.33 x̃: 6 helped stats (rel) min: 0.11% max: 0.36% x̄: 0.32% x̃: 0.36% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.27% max: 0.27% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for cycles value: -3.30 2.69 95% mean confidence interval for cycles %-change: -0.19% 0.19% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: nagrigoriadis@gmail.com Tested-by: Danylo Piliaiev <danylo.piliaiev@gmail.com>	2019-02-15 11:11:02 -08:00
Matt Turner	e50db60d16	intel/compiler/test: Set devinfo->gen = 7 We emit an FBL instruction which only exists since Gen7. This prevents the test from segfaulting when run with TEST_DEBUG=1. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-15 11:11:02 -08:00
James Zhu	9364d66cb7	gallium/auxiliary/vl: Add video compositor compute shader render Add compute shader initialization, assignment and cleanup to the vl_compositor API. Make the video compositor compute shader render the default when the pipe supports it. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	f6ac0b5d71	gallium/auxiliary/vl: Add compute shader to support video compositor render Add compute shader to support video compositor render. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	299e2bc046	gallium/auxiliary/vl: Rename csc_matrix and increase its size. Rename csc_matrix to shader_params, and increase the size of shader_params to store more constants for the compute shader. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	7b7b5f2029	gallium/auxiliary/vl: Split vl_compositor graphic shaders from vl_compositor API Split vl_compositor graphic shaders from vl_compositor API in order to share vl_compositor API with vl_compositor compute shader later. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	b34d7c5daa	gallium/auxiliary/vl: Move dirty define to header file Move dirty define to header file to share with compute shader. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
Juan A. Suarez Romero	1fb24080b7	nir: remove jump from two merging jump-ending blocks In the opt_peel_initial_if optimization, when moving the continue list to the end of the continue block, before the jump, it can happen that the continue list itself also ends with a jump. This would mean that we would have two jump instructions in a row: the first one from the continue list and the second one from the continue block. As inserting an instruction after a jump is not allowed (and it does not make sense, as it will not be executed), remove the jump from the continue block and keep the one from the continue list, as it will be executed first. CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-15 15:16:24 +01:00
Juan A. Suarez Romero	69be9934a7	nir: move ALU instruction before the jump instruction opt_split_alu_of_phi moves ALU instruction to the end of continue block. But if the continue block ends with a jump instruction (an explicit "continue" instruction) then the ALU must be inserted before the jump, as it is illegal to add instructions after the jump. CC: Ian Romanick <ian.d.romanick@intel.com> Fixes: `0881e90c09` ("nir: Split ALU instructions in loops that read phis") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-15 15:14:36 +01:00
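The rule behind the two NIR fixes above (nothing may follow a jump inside a block) can be sketched with a toy instruction list. This is not the NIR API; `struct block`, `enum instr_kind` and `block_append_before_jump` are hypothetical names used only for illustration.

```c
#include <stddef.h>

#define BLOCK_MAX_INSTRS 16

enum instr_kind { INSTR_ALU, INSTR_JUMP };

/* Toy basic block: a fixed-size array of instruction kinds. */
struct block {
    enum instr_kind instrs[BLOCK_MAX_INSTRS];
    size_t count;
};

/* Append an instruction, keeping a trailing jump last.
 * Returns the index where the instruction was placed,
 * or (size_t)-1 if the block is full. */
size_t block_append_before_jump(struct block *b, enum instr_kind instr)
{
    if (b->count >= BLOCK_MAX_INSTRS)
        return (size_t)-1;

    size_t pos = b->count;
    if (pos > 0 && b->instrs[pos - 1] == INSTR_JUMP) {
        /* Shift the jump one slot down and take its old position. */
        b->instrs[pos] = b->instrs[pos - 1];
        pos--;
    }
    b->instrs[pos] = instr;
    b->count++;
    return pos;
}
```

Appending to a block that ends in a jump places the new instruction in the second-to-last slot, so the jump stays the final instruction.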
Andres Gomez	a43596df62	mesa: INVALID_VALUE for wrong type or format in ClearBufferData Instead of generating a GL_INVALID_ENUM error when the type or format is incorrect while using glClear{Named}Buffer{Sub}Data, generate GL_INVALID_VALUE. From page 72 (page 94 of the PDF) of the OpenGL 4.6 spec: " An INVALID_VALUE error is generated if type is not one of the types in table 8.2. An INVALID_VALUE error is generated if format is not one of the formats in table 8.3." Fixes the following test: KHR-GL45.direct_state_access.buffers_errors v2: correct the doxygen documentation. Cc: Pi Tabred <servuswiegehtz@yahoo.de> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-15 14:28:06 +02:00
Gurchetan Singh	67426ccd42	virgl: use virgl_transfer_inline_write even less We've noticed the Team Fortress 2 engine seems to do many small calls to glBufferSubData(..). Let's pick our heuristic based on the resource base width, not the size of a particular upload. This will cause transfers to be batched together in the transfer queue. Relevant glbench microbenchmark -- Before: buffer_upload_dynamic_element_array_131072 = 131.17 mbytes_sec After: buffer_upload_dynamic_element_array_131072 = 6828.24 mbytes_sec Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	f0e71b1088	virgl: use transfer queue This improves Unigine Valley benchmark by 3 to 10 fps (depending on the scene). It also improves the Team Fortress 2 benchmark from 6 fps to 13 fps (host: 20 fps). Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	4a7857b377	virgl: introduce transfer queue Transfers will be placed here at unmap time instead of incurring a VM exit. There's an attempt to deduplicate intersecting 1D transfers, which are surprisingly common. This can also help with mipmapped texture upload and smaller textures, where the majority of the time is spent in the guest kernel / QEMU -- not virglrenderer. This is shown by the GLbench texture upload benchmark: Before: texture_upload_rgba_teximage2d_32 = 64.23 mtexel_sec After: texture_upload_rgba_teximage2d_32 = 367.44 mtexel_sec v2: Split up list iteration functions (@gerddie) v3: Support for optimizing glBufferSubData Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
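The deduplication of intersecting 1D transfers mentioned above can be pictured as range merging. The commit message does not spell out the policy, so the sketch below is an assumption: each queued transfer is a half-open byte range, and a new range that intersects a queued one is folded into it. All names (`struct transfer_queue`, `queue_add`, `QUEUE_MAX`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_MAX 8

/* Half-open byte range [start, end). */
struct range {
    uint32_t start;
    uint32_t end;
};

struct transfer_queue {
    struct range items[QUEUE_MAX];
    size_t count;
};

/* Queue a 1D transfer. If it intersects an already queued range, widen
 * that range instead of adding a new entry. Returns false when the queue
 * is full, in which case a caller would have to flush first.
 * Simplification: only the first intersecting entry is widened; ranges
 * that become adjacent afterwards are not coalesced. */
bool queue_add(struct transfer_queue *q, uint32_t offset, uint32_t size)
{
    struct range r = { offset, offset + size };

    for (size_t i = 0; i < q->count; i++) {
        struct range *cur = &q->items[i];
        if (r.start < cur->end && cur->start < r.end) {
            if (r.start < cur->start)
                cur->start = r.start;
            if (r.end > cur->end)
                cur->end = r.end;
            return true;
        }
    }

    if (q->count == QUEUE_MAX)
        return false;
    q->items[q->count++] = r;
    return true;
}
```

Two overlapping uploads to bytes 0..16 and 8..24 end up as a single queued range 0..24 under this scheme, so only one transfer would be submitted for them.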
Gurchetan Singh	9c4930946a	virgl: add encoder functions for new protocol Let's encode the new protocol with new helper functions. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	5510cc67e0	virgl: make winsys modifications for encoded transfers The idea is to have two command buffers: 1) One for transfers 2) One for commands, which can include transfers At flush time, (2) will be filled. Otherwise, (1) will be used to submit transfers if there are enough of them. v2: Pass size directly to cmd_buf_create (@gerddie) Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	90e9650585	virgl: add extra checks in virgl_res_needs_flush_wait This is motivated by the following scenario: glBufferSubData(GL_ARRAY_BUFFER, ...) glFlush(..) glBufferSubData(GL_ARRAY_BUFFER, ...) glBufferSubData(GL_ARRAY_BUFFER, ...) glBufferSubData(GL_ARRAY_BUFFER, ...) This increases @davidriley's Team Fortress 2 apitrace from 1 fps to 6 fps and helps with the Chromium glbench microbenchmarks: Before: texture_update_rgba_texsubimage2d_2048 = 554.96 mtexel_sec buffer_upload_dynamic_array_12 = 0.02 mbytes_sec buffer_upload_dynamic_array_576 = 1.07 mbytes_sec After: texture_update_rgba_texsubimage2d_2048 = 612.29 mtexel_sec buffer_upload_dynamic_array_12 = 2.22 mbytes_sec buffer_upload_dynamic_array_576 = 164.89 mbytes_sec Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	ab6ea6e9ce	virgl: pass virgl transfer to virgl_res_needs_flush_wait Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	d98fbd9c92	virgl: keep track of number of computations It's good to keep track of these things. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	35515985a9	virgl: limit command length to 16 bits Much of our logic is based around the idea the upper 16 bits of a command dword can encode the length of the command. Now that the command buffer >= 2^16 - 1, we should check for this. v2: alignment, and only check VIRGL_ENCODE_MAX_DWORDS Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
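A minimal sketch of the length check described above, assuming only what the message states: the upper 16 bits of a command dword hold the command length. The use of the lower 16 bits for the command id, and the names `MAX_CMD_LENGTH`, `encode_cmd_header` and `cmd_header_length`, are assumptions made for illustration, not the virgl encoder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest length that fits in the upper 16 bits (hypothetical name). */
#define MAX_CMD_LENGTH 0xffffu

/* Build a command header dword. Fails instead of silently truncating
 * when the length does not fit in 16 bits. */
bool encode_cmd_header(uint32_t cmd, uint32_t length_dwords, uint32_t *header)
{
    if (cmd > 0xffffu || length_dwords > MAX_CMD_LENGTH)
        return false;
    *header = cmd | (length_dwords << 16);
    return true;
}

/* Recover the length from a header dword. */
uint32_t cmd_header_length(uint32_t header)
{
    return header >> 16;
}
```

Without the check, a length of 0x10000 would shift out of the dword entirely and decode as 0.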
Gurchetan Singh	503ffe46bb	virgl: use virgl_transfer in inline write Let's define a helper function and use it. This commit also allows resources to be emitted into different command buffers. Like the ioctls, send 0 for layer_stride and stride. If we actually send the real values, there are various assumptions in virglrenderer for non-1D buffers that may need to be modified. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	0fcd48bac5	virgl: add protocol for resource transfers Mostly similar to VIRGL_CCMD_RESOURCE_INLINE_WRITE. However, this uses the resource's already attached iovecs rather than the command buffer to transfer the data. v2: Used (1 << 16) not (1 << 15) [@gerddie] Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	168c3ffce3	virgl: when creating / freeing transfers, pass slab pool directly This will allow us to destroy transfers w/o having a pointer to the context. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	d5c2dacc15	virgl: unmap uploader at flush time This should save some memory when allocating and freeing transfers. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	14f265b533	virgl: make alignment smaller when uploading index user buffers Since we're just uploading to guest memory, let's just align to dword size. Fixes: e0f932 ("u_upload_mgr: pass alignment to u_upload_data manually") Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	7626e6e189	virgl: track level cleanliness rather than resource cleanliness This allows a minor optimization for texture upload. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	c19aedcf1a	virgl: don't mark unclean after a flush The guest memory is still clean until host GL touches it, which we should track elsewhere. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	5b6a2ae987	virgl: use virgl_resource_dirty helper Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	1d294ad264	virgl: add ability to do finer grain dirty tracking There are levels to cleanliness. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Alyssa Rosenzweig	acc52fff20	panfrost: Improve logging and patch memory leaks Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:47:54 +00:00
Alyssa Rosenzweig	c70ed4ca18	panfrost: Don't align framebuffer dims Fixes regressions with EGL clients Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:46:30 +00:00
Alyssa Rosenzweig	5155bcf099	panfrost: Implement PIPE_QUERY_OCCLUSION_COUNTER Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:46:02 +00:00
Alyssa Rosenzweig	2d22b5380c	panfrost: Identify MALI_OCCLUSION_PRECISE bit Setting this is required for desktop-style occlusion queries. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:45:56 +00:00
Tapani Pälli	595af46f0f	drirc/i965: add option to disable 565 configs and visuals We have cases where we would not like to expose these. v2: call the option allow_rgb565_configs for consistency with existing allow_rgb10_configs (Eric, Jason) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-15 09:38:36 +02:00
Alyssa Rosenzweig	97aa05470a	panfrost: Backport driver to Mali T600/T700 There are a few differences between Mali T860 (Panfrost's primary reference target) and the older Midgard generations (T600/T700): - Miscellaneous different magic numbers. It's not clear what these numbers mean on either the old or new configurations yet. - Errata fixes. T800 is the final Midgard generation and presumably the least buggy. Older Midgard has some extra hardware errata we have to work around. - SFBD vs MFBD split. Essentially, older Midgard uses a Single FrameBuffer Descriptor (SFBD), which corresponds to single render-target rendering. Newer Midgard (T760+) uses a Multiple FrameBuffer Descriptor (MFBD), allowing multiple RTs. On ES 2.0, these descriptors serve the same function, but we implement both, depending on the version of the hardware. - CPU bitness. 32-bit systems generally use 32-bit GPU descriptors, and vice versa for 64-bit. Our target T760 systems are 32-bit whereas our target T860 systems are 64-bit. More work is needed in this area. This patch fixes support in these areas for supporting older Midgard hardware. It is tested on Mali T760 and Mali T860. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:22:42 +00:00
Alyssa Rosenzweig	f96e871c26	panfrost: Fix build; depend on libdrm Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-15 07:19:43 +00:00
Jason Ekstrand	08bfd710a2	nir/dead_cf: Stop relying on liveness analysis The liveness analysis pass is fairly expensive because it has to build large bit-sets and run a fix-point algorithm on them. Instead of requiring liveness for detecting if values escape a CF node, just take advantage of the structured nature of NIR and use block indices instead. This only requires the block index metadata, which is the fastest metadata we have to generate. No shader-db changes on Kaby Lake Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-14 23:06:29 -06:00
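The idea of replacing liveness with block indices can be sketched as a range test. This is an illustration under the assumption that the blocks of a CF node occupy a contiguous index range; it is not the nir_opt_dead_cf code, and `value_escapes_node` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* A value defined inside a control-flow node escapes it if any of its
 * uses sits in a block whose index lies outside the node's block range
 * [first_block, last_block]. */
bool value_escapes_node(const unsigned *use_block_indices, size_t num_uses,
                        unsigned first_block, unsigned last_block)
{
    for (size_t i = 0; i < num_uses; i++) {
        unsigned idx = use_block_indices[i];
        if (idx < first_block || idx > last_block)
            return true;
    }
    return false;
}
```

The check only walks the uses of one value and compares integers, so no bit-sets or fix-point iteration are needed.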
Jason Ekstrand	b50465d197	nir/dead_cf: Inline cf_node_has_side_effects We want to handle live SSA values differently and it's going to involve walking the instructions. We can make it a single instruction walk if we combine it with cf_node_has_side_effects. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-14 23:05:28 -06:00
Jason Ekstrand	367b0ede4d	intel/fs: Bail in optimize_extract_to_float if we have modifiers This fixes a bug in RuneScape where we were optimizing x >> 16 to an extract and then negating and converting to float. The NIR to fs pass was dropping the negate on the floor, breaking a geometry shader and causing it to render nothing. Fixes: `1f862e923c` "i965/fs: Optimize float conversions of byte/word..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-02-14 23:02:44 -06:00
Ilia Mirkin	8c859367df	swr: set PIPE_CAP_MAX_VARYINGS correctly Unfortunately swr was missed in the original commit. The number of varyings should generally match up to what's reported as the shader caps for fragment inputs. Fixes: `6010d7b8e8` (gallium: add PIPE_CAP_MAX_VARYINGS) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alok Hota <alok.hota@intel.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-14 20:29:36 -05:00
Jason Ekstrand	5064464931	intel/fs: Silence a compiler warning Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:47 -06:00
Jason Ekstrand	9b202239ba	anv: Silence some compiler warnings in release builds Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:45 -06:00
Jason Ekstrand	cd60c995a6	anv/blorp: Delete a pointless assert Just a little higher up in the function we assert that the aspect masks are actually equal so there's no reason for the weaker check. Also, the temporary variables were causing compiler warnings in release builds. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:42 -06:00
Jason Ekstrand	b14d7a6b60	nir: Silence a couple of warnings in release builds [28/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_gather_xfb_info.c.o'. ../src/compiler/nir/nir_gather_xfb_info.c: In function ‘nir_gather_xfb_info’: ../src/compiler/nir/nir_gather_xfb_info.c:171:13: warning: variable ‘max_offset’ set but not used [-Wunused-but-set-variable] unsigned max_offset[NIR_MAX_XFB_BUFFERS] = {0}; ^~~~~~~~~~ [36/716] Compiling C object 'src/compiler/nir/068b2c8@@nir@sta/nir_instr_set.c.o'. ../src/compiler/nir/nir_instr_set.c:502:1: warning: ‘instr_each_src_and_dest_is_ssa’ defined but not used [-Wunused-function] instr_each_src_and_dest_is_ssa(nir_instr *instr) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:35 -06:00
Kenneth Graunke	6775665e5e	spirv: Eliminate dead input/output variables after translation. spirv_to_nir can generate input/output variables which are illegal for the current shader stage, which would cause nir_validate_shader to balk. After my recent commit to start decorating arrays as compact, dEQP-VK.spirv_assembly.instruction.graphics.module.same_module started hitting validation errors due to outputs in a TCS (not intended for the TCS at all) not being per-vertex arrays. Thanks to Jason Ekstrand for suggesting this approach. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109573 Fixes: `ef99f4c8d1` compiler: Mark clip/cull distance arrays as compact before lowering. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-02-14 11:03:56 -08:00
Kenneth Graunke	39aee57523	anv: Put MOCS in the correct location My patch to switch from struct-based MOCS to numeric MOCS accidentally divided all MOCS entries by 2 in the Vulkan driver. MOCS on Gen9+ is just an array index into a table. But in the hardware packets, the index starts at bit 1. So we need to shift it. Fixes: `0b44644ca6` (genxml: Consistently use a numeric "MOCS" field) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-14 11:03:28 -08:00
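The off-by-one-bit placement described above can be sketched as follows, assuming only what the message says: the MOCS table index occupies the packet field starting at bit 1. The function names are hypothetical.

```c
#include <stdint.h>

/* Place a MOCS table index into the packet field, whose index bits
 * start at bit 1 according to the commit message. */
uint32_t pack_mocs_field(uint32_t mocs_index)
{
    return mocs_index << 1;
}

/* What the hardware would read back from the field as the index. */
uint32_t unpack_mocs_index(uint32_t field)
{
    return field >> 1;
}
```

Writing the index unshifted means the hardware reads `index >> 1`, which is the accidental division by 2 that the fix removes.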
Ian Romanick	9a918050e0	spirv: Add missing break Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `c6465fec0c` ("spirv: add SpvCapabilityInt64Atomics") CID: 1442555	2019-02-14 08:35:59 -08:00
Eric Engestrom	c2b4b46fa9	util/tests: compile to something sensible in release builds assert()-based tests make no sense without asserts, so make sure asserts are compiled in, even if the rest of the code has asserts turned off. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-14 12:52:34 +00:00
Eric Engestrom	f7c56475d2	anv/tests: compile to something sensible in release builds assert()-based tests make no sense without asserts, so make sure asserts are compiled in, even if the rest of the code has asserts turned off. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-14 12:52:34 +00:00
Eric Engestrom	4c1ca5b074	etnaviv: drop duplicate #define Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	7f68b38439	st/dri: drop duplicate #define Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	2fa165e757	gbm: drop duplicate #defines Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	f1374805a8	drm-uapi: use local files, not system libdrm There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Eric Engestrom	69e4c273c4	drm-uapi/README: remove explicit list of driver names These headers are used by a lot more than just the intel drivers nowadays. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Samuel Pitoiset	227df98fa6	radv: fix radv_fixup_vertex_input_fetches() We should check that num_channels is 4, otherwise that breaks the world. Sorry for the short breakage. Fixes: `4b3549c084` ("radv: reduce the number of loaded channels for vertex input fetches") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-14 09:44:35 +01:00
Samuel Pitoiset	4b3549c084	radv: reduce the number of loaded channels for vertex input fetches It's unnecessary to load more channels than the vertex attribute format. The remaining channels are filled with 0 for y and z, and 1 for w. 29077 shaders in 15096 tests Totals: SGPRS: 1321605 -> 1318869 (-0.21 %) VGPRS: 935236 -> 932252 (-0.32 %) Spilled SGPRs: 24860 -> 24776 (-0.34 %) Code Size: 49832348 -> 49819464 (-0.03 %) bytes Max Waves: 242101 -> 242611 (0.21 %) Totals from affected shaders: SGPRS: 93675 -> 90939 (-2.92 %) VGPRS: 58016 -> 55032 (-5.14 %) Spilled SGPRs: 172 -> 88 (-48.84 %) Code Size: 2862740 -> 2849856 (-0.45 %) bytes Max Waves: 15474 -> 15984 (3.30 %) This mostly helps Croteam games (Talos/Sam2017). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:10:56 +01:00
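The fill rule stated above (missing y and z become 0, missing w becomes 1) can be sketched in plain C. This illustrates the rule only, not the LLVM IR that radv emits; `expand_to_vec4` is a hypothetical stand-in.

```c
/* Expand num_channels loaded components to a full vec4, filling the
 * channels that were not loaded with the defaults (0, 0, 0, 1). */
void expand_to_vec4(const float *loaded, unsigned num_channels, float out[4])
{
    static const float defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };

    for (unsigned i = 0; i < 4; i++)
        out[i] = i < num_channels ? loaded[i] : defaults[i];
}
```

A two-channel attribute (2, 3) therefore reaches the shader as (2, 3, 0, 1) while only two channels are fetched.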
Samuel Pitoiset	210aec3612	radv: store vertex attribute formats as pipeline keys The formats will be used for reducing the number of loaded channels. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:10:09 +01:00
Samuel Pitoiset	45382baef6	radv: use MAX_{VBS,VERTEX_ATTRIBS} when defining max vertex input limits Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:09:51 +01:00
Samuel Pitoiset	2154fac6f3	ac: make use of ac_build_expand_to_vec4() in visit_image_store() And make ac_build_expand() a static function. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-14 09:09:48 +01:00
Eric Anholt	338d399fd0	freedreno: Use the NIR lowering for isign. I think this will save an instruction and hopefully not increase any other costs (possibly the immediate -1 and 1?), but I haven't actually tested. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 00:32:30 +00:00
Eric Anholt	8f3694e1ab	intel: Use the NIR lowering for isign. Drops one instruction from fs-sign-int.shader_test. No change in shader-db due to it having 0 instances of sign(genIType). This may hurt isign64 if algebraic runs before int64 lowering, but I wasn't sure how to mark the algebraic opt as "every bit size but 64". v2: Update commit message about shader-db. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2019-02-14 00:32:30 +00:00
Eric Anholt	3f22b35a43	v3d: Use the NIR lowering for isign instead of rolling our own. min/max instead of comparisons saves 2 instructions on fs-sign-int.shader_test.	2019-02-14 00:32:30 +00:00
Eric Anholt	42d2cae907	nir: Move panfrost's isign lowering to nir_opt_algebraic. I wanted to reuse this from v3d. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-14 00:32:30 +00:00
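The isign commits above replace comparisons with min/max. A minimal C sketch of that identity follows; the exact NIR pattern is assumed to be imin(imax(x, -1), 1), and the helper names are hypothetical.

```c
#include <stdint.h>

static int32_t imin(int32_t a, int32_t b) { return a < b ? a : b; }
static int32_t imax(int32_t a, int32_t b) { return a > b ? a : b; }

/* sign(x) built from min/max: clamp x to the range [-1, 1]. */
int32_t isign_minmax(int32_t x)
{
    return imin(imax(x, -1), 1);
}

/* Reference version built from two comparisons. */
int32_t isign_compare(int32_t x)
{
    return (x > 0) - (x < 0);
}
```

Clamping an integer to [-1, 1] yields -1 for every negative input, 0 for zero and 1 for every positive input, so both functions agree over the whole int32 range.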
Timothy Arceri	68baf96824	nir: turn an ssa check in nir_search into an assert Everything should be in ssa form when we call this. This is a hotpath so replace the check with an assert. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-02-14 09:35:32 +11:00
Timothy Arceri	46a4d2c867	nir: turn ssa check into an assert Everything should be in ssa form when this is called. Checking for it here is expensive so turn this into an assert instead. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-02-14 09:35:32 +11:00
Timothy Arceri	0a89c9779a	nir: prehash instruction in nir_instr_set_add_or_rewrite() There is no need to hash the instruction twice, especially as we end up adding it in the majority of cases. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-14 09:35:32 +11:00
Dylan Baker	279060cd32	meson: Add dependency on genxml to anvil Currently the Intel "anvil" driver races with the generation of genxml files, while i965 has an explicit dependency. This patch adds the same dependency to anvil. Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-13 22:01:00 +00:00
Samuel Pitoiset	334da034d8	radv: always export gl_SampleMask when the fragment shader uses it For some reason, not exporting it breaks tree rendering in Project Cars. Fixes: `85010585cd` ("radv: only enable gl_SampleMask if MSAA is enabled too") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-13 23:01:30 +01:00
Alok Hota	736241892f	gallium/aux: add PIPE_CAP_MAX_VARYINGS to u_screen Allows drivers using `u_pipe_screen_get_param_defaults` to use a fallback value for the new pipe cap. Default value of 8 based on GL 2.1 MAX_VARYING_FLOATS Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-13 15:08:14 -06:00
Kristian H. Kristensen	e8566d7098	.mailmap: Add a few more aliases for myself Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-13 12:03:41 -08:00
Samuel Pitoiset	5e18000d1b	radv/winsys: fix BO list creation when RADV_DEBUG=allbos is set Fixes: `50fd253bd6` ("radv/winsys: Add priority handling during submit.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-13 20:51:40 +01:00
Kristian H. Kristensen	0a41ddbd4e	freedreno/a6xx: Fix point coord Use ir3_next_varying() for iterating through varyings and unset the global point coord invert bit. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.pointcoord Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-13 11:14:06 -08:00
Kristian H. Kristensen	2fbd2d5f58	freedreno/a6xx: Front facing needs UNK3 bit We need to set UNK3 in GRAS_CNTL and RB_RENDER_CONTROL0 for the value to be reliably delivered. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.frontfacing Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-13 11:14:06 -08:00
Kristian H. Kristensen	1831238c8e	freedreno/a6xx: Update headers This pulls in changes for compute shaders and a6xx ssbo/image support. FACENESS bit moved from position 1 to 2 and there's a global invert bit for point coord. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-13 11:14:06 -08:00
Kristian H. Kristensen	182e5c011f	freedreno/a6xx: Clean up mixed use of swap and swizzle for texture state Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-13 11:03:29 -08:00
Rob Clark	61094629cb	freedreno/a6xx: small compiler warning fix Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-13 13:54:05 -05:00
Dylan Baker	aff52dd2c6	get-pick-list: Add --pretty=medium to the arguments for Cc patches Because none of them have been picked up for 19.0 due to this bug being reintroduced. v2: - Fix fixes tags Fixes: `e6b3a3b201` ("bin/get-pick-list.sh: handle "typod" usecase.") Fixes: `fac10169bb` ("bin/get-pick-list.sh: prefix output with "[stable] "") Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-13 08:59:30 -08:00
Eric Engestrom	68a9383c6f	gitlab-ci: limit ninja to 4 threads max I tried bumping the limit on make and scons instead, but that just thrashed the runners, so let's not do that (sorry @daniels :]). Instead, remove the automatic thread management from ninja and limit it to 4 instead, in line with make and scons. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-13 16:15:43 +00:00
Konstantin Kharlamov	fccc9d3de6	mapi: work around GCC LTO dropping assembly-defined functions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109391 Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-13 14:20:51 +00:00
Caio Marcelo de Oliveira Filho	017349997f	nir: fix example in opt_peel_loop_initial_if description Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-12 20:33:20 -08:00
Karol Herbst	7e08f22a72	nir/opt_if: don't mark progress if nothing changes If we have something like this: loop { ... if x { break; } else { continue; } } opt_if_loop_last_continue returns true, marking progress although nothing changes. Fixes: `5921a19d4b` "nir: add if opt opt_if_loop_last_continue()" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-13 00:21:35 +01:00
Oscar Blumberg	3c540e0a74	radeonsi: Fix guardband computation for large render targets Stop using 12.12 quantization for viewports that are not contained in the lower 4k corner of the render target as the hardware needs to keep both absolute and relative coordinates representable. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-12 17:21:46 -05:00
Chia-I Wu	2f8734e13b	egl: fix KHR_partial_update without EXT_buffer_age EGL_BUFFER_AGE_EXT can be queried without EXT_buffer_age. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-12 19:14:34 +00:00
Kenneth Graunke	5a006b026d	mesa: Advertise EXT_float_blend in ES 3.0+ contexts. This extension simply drops a draw time restriction: "Furthermore, an INVALID_OPERATION error is generated by DrawArrays and the other drawing commands defined in section 2.8.3 (10.5 in ES 3.1) if blending is enabled (see below) and any draw buffer has 32-bit floating-point format components." We never correctly enforced this restriction anyway, so we were basically already implementing it. We just need to advertise it for our behavior to be correct. The extension requires EXT_color_buffer_float, but we already enable that via dummy_true. So we can dummy_true this one as well. Found while debugging WebGL conformance tests. Does not fix any. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-12 10:57:25 -08:00
Alok Hota	d3dfa86a30	gallium/swr: Param defaults for unhandled PIPE_CAPs Without using this function, we fail the -Wswitch flag when compiling the default debugoptimized mode in Meson Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-12 18:55:14 +00:00
Juan A. Suarez Romero	1ad26f9417	anv/cmd_buffer: check for NULL framebuffer This can happen when we record a VkCmdDraw in a secondary buffer that was created inheriting from the primary buffer, but with the framebuffer set to NULL in the VkCommandBufferInheritanceInfo. Vulkan 1.1.81 spec says that "the application must ensure (using scissor if necessary) that all rendering is contained in the render area [...] [which] must be contained within the framebuffer dimensions". While this should be done by the application, commit `465e5a86` added the clamp to the framebuffer size, in case the application does not do it. But this requires knowing the framebuffer dimensions. If we do not have a framebuffer at that moment, the best compromise is to just apply the scissor as it is, and let the application ensure the rendering is contained in the render area. v2: do not clamp to framebuffer if there isn't a framebuffer v3 (Jason): - clamp earlier in the conditional - clamp to render area if command buffer is primary v4: clamp also x and y to render area (Jason) v5: rename used variables (Jason) Fixes: `465e5a86` ("anv: Clamp scissors to the framebuffer boundary") CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-12 19:19:13 +01:00
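The fallback described above (clamp when bounds are known, otherwise pass the scissor through) can be sketched as below. This is a simplified illustration, not the anv code: `struct rect` and `clamp_scissor` are hypothetical, and a NULL `bounds` pointer stands in for the missing framebuffer.

```c
#include <stddef.h>

/* Rectangle with inclusive bounds. */
struct rect {
    int x_min, y_min, x_max, y_max;
};

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp a scissor rectangle to the given bounds (framebuffer or render
 * area). With no bounds available, return the scissor unchanged and
 * leave it to the application to stay inside the render area. */
struct rect clamp_scissor(struct rect scissor, const struct rect *bounds)
{
    if (bounds == NULL)
        return scissor;

    scissor.x_min = clampi(scissor.x_min, bounds->x_min, bounds->x_max);
    scissor.y_min = clampi(scissor.y_min, bounds->y_min, bounds->y_max);
    scissor.x_max = clampi(scissor.x_max, bounds->x_min, bounds->x_max);
    scissor.y_max = clampi(scissor.y_max, bounds->y_min, bounds->y_max);
    return scissor;
}
```

In this sketch, a secondary command buffer recorded without a framebuffer corresponds to the NULL case.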
Marek Olšák	6c64413b6f	radeonsi: use MEM instead of MEM_GRBM in COPY_DATA.DST_SEL Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-12 13:08:54 -05:00
Marek Olšák	f8e4c9df47	radeonsi: add AMD_DEBUG env var as an alternative to R600_DEBUG Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-12 13:08:54 -05:00
Samuel Pitoiset	1b8983c25b	radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8 This fixes a critical issue. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 17:39:30 +01:00
Samuel Pitoiset	bd1186572f	radv: add support for push constants inlining when possible This removes some scalar loads from shaders, but it increases the number of SET_SH_REG packets. This is currently basic but it could be improved if needed. Inlining dynamic offsets might also help. Original idea from Dave Airlie. 29077 shaders in 15096 tests Totals: SGPRS: 1321325 -> 1357101 (2.71 %) VGPRS: 936000 -> 932576 (-0.37 %) Spilled SGPRs: 24804 -> 24791 (-0.05 %) Code Size: 49827960 -> 49642232 (-0.37 %) bytes Max Waves: 242007 -> 242700 (0.29 %) Totals from affected shaders: SGPRS: 290989 -> 326765 (12.29 %) VGPRS: 244680 -> 241256 (-1.40 %) Spilled SGPRs: 1442 -> 1429 (-0.90 %) Code Size: 8126688 -> 7940960 (-2.29 %) bytes Max Waves: 80952 -> 81645 (0.86 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 17:25:54 +01:00
Samuel Pitoiset	8364ffe823	radv: keep track of the number of remaining user SGPRs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 17:25:52 +01:00
Samuel Pitoiset	5f9379ca35	radv: gather if shaders load dynamic offsets separately Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 17:25:49 +01:00
Samuel Pitoiset	5806d99984	radv: gather more info about push constants This is needed in order to inline some push constants when possible. This also adds a new helper for initializing the pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 17:25:34 +01:00
Samuel Pitoiset	129a9f4937	radv: fix compiler issues with GCC 9 "The C standard says that compound literals which occur inside of the body of a function have automatic storage duration associated with the enclosing block. Older GCC releases were putting such compound literals into the scope of the whole function, so their lifetime actually ended at the end of containing function. This has been fixed in GCC 9. Code that relied on this extended lifetime needs to be fixed, move the compound literals to whatever scope they need to accessible in." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-12 14:48:08 +01:00
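The lifetime rule quoted in the commit above can be illustrated with a small sketch. The type `struct range` and the function `range_count` are hypothetical and not taken from radv; the broken pattern is shown only as a comment because using it is undefined behaviour.

```c
#include <stddef.h>

/* Hypothetical type for illustration; not taken from radv. */
struct range {
    int start;
    int count;
};

/*
 * Problematic pattern (kept as a comment on purpose):
 *
 *     const struct range *r = NULL;
 *     if (want_default) {
 *         r = &(struct range){ 0, 16 };   // literal lives only in this block
 *     }
 *     return r->count;                    // dangling pointer with GCC 9
 */

/* Fixed pattern: the compound literal is created in the scope that uses it. */
int range_count(int want_default, const struct range *custom)
{
    /* Declared at function scope, so it outlives the 'if' below. */
    const struct range *fallback = &(struct range){ 0, 16 };
    const struct range *r = custom;

    if (want_default || r == NULL)
        r = fallback;

    return r->count;
}
```

The only change between the two patterns is where the compound literal appears: moving it to the enclosing function scope keeps the pointed-to object alive for every later use.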
Tapani Pälli	2a2e69f975	i965: add P0x formats and propagate required scaling factors Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Lin Johnson <johnson.lin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-12 08:43:04 +02:00
Tapani Pälli	3da858a6b9	intel/compiler: add scale_factors to sampler_prog_key_data Patch propagates given scale_factors to lowering options. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-12 08:42:25 +02:00
Tapani Pälli	722f96bfc8	dri: add P010, P012, P016 for 10bit/12bit/16bit YUV420 formats Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Lin Johnson <johnson.lin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-12 08:42:02 +02:00
Tapani Pälli	19a85a704b	nir: add option to use scaling factor when sampling planes YUV lowering Patch adds nir_lower_tex_options as parameter to sample_plane so that we don't need to extend nir_tex_instr for this. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-12 08:41:20 +02:00
Kenneth Graunke	3eedc8f7b1	i965: Use info->textures_used instead of prog->SamplersUsed. prog->SamplersUsed is set by the linker when validating resource limits, while info->textures_used is gathered after NIR optimizations, which may have eliminated some unused surfaces. This may let us skip some work. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:50 -08:00
Kenneth Graunke	59ae985631	i965: Drop unnecessary 'and' with prog->SamplerUnits textures_used_by_txf is a subset of textures_used which is a subset of prog->SamplerUnits. This should do nothing. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:48 -08:00
Kenneth Graunke	f5c7df4dc9	nir: Gather texture bitmasks in gl_nir_lower_samplers_as_deref. Eric and I would like a bitmask of which samplers are used, similar to prog->SamplersUsed, but available in NIR. The linker uses SamplersUsed for resource limit checking, but later optimizations may eliminate more samplers. So instead of propagating it through, we gather a new one. While there, we also gather the existing textures_used_by_txf bitmask. Gathering these bitfields in nir_shader_gather_info is awkward at best. The main reason is that it introduces an ordering dependency between the two passes. If gathering runs before lower_samplers_as_deref, it can't look at var->data.binding. If the driver doesn't use the full lowering to texture_index/texture_array_size (like radeonsi), then the gathering can't use those fields. Gathering might be run early /and/ late, first to get varying info, and later to update it after variant lowering. At this point, should gathering work on pre-lowered or post-lowered code? Pre-lowered is also harder due to the presence of structure types. Just doing the gathering when we do the lowering alleviates these ordering problems. This fixes ordering issues in i965 and makes the txf info gathering work for radeonsi (though they don't use it). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:45 -08:00
Kenneth Graunke	120f9b8362	nir: Use sampler derefs in drawpixels and bitmap lowering. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:44 -08:00
Kenneth Graunke	04bdc56872	program: Make prog_to_nir create texture/sampler derefs. Until now, prog_to_nir has been setting texture_index and sampler_index directly. This is different than GLSL shaders, which create variable dereferences and rely on lowering passes to reach this final form. radeonsi uses variable dereferences for samplers rather than texture_index and sampler_index, so it doesn't even make sense to set them there. By moving to derefs, we ensure that both GLSL and ARB programs produce the same final form that the driver desires. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:40 -08:00
Kenneth Graunke	6a4be25a90	st/nir: Use sampler derefs in built-in shaders. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:38 -08:00
Kenneth Graunke	ba9c1c8217	st/nir: Lower sampler derefs for builtin shaders. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:36 -08:00
Kenneth Graunke	8d1646e0e1	st/nir: Pull sampler lowering into a helper function. This will make it easier to reuse across GLSL / ARB / built-ins. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:35 -08:00
Kenneth Graunke	243c11dc16	i965: Call nir_lower_samplers for ARB programs. An upcoming patch will start building derefs in prog_to_nir, at which point we'll need to lower them to indexes. This gets both GLSL and non-GLSL shaders using the same paths. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:30 -08:00
Kenneth Graunke	529a0711c1	glsl: Don't look at sampler uniform storage for internal vars Passes like nir_lower_drawpixels add additional sampler variables, and set an explicit binding which never changes. These extra samplers don't have proper uniform storage associated with them, and there is no way to update bindings via the API. So, for any 'hidden' variables, just trust that there's an explicit binding set. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:28 -08:00
Kenneth Graunke	d34e434989	glsl: Allow gl_nir_lower_samplers*() without a gl_shader_program I would like to be able to run gl_nir_lower_samplers() to turn texture and sampler variable dereferences into indexes and offsets, even for ARB programs, and built-in shaders. This would make sampler handling more consistent across the various types of shaders. For GLSL programs, the gl_nir_lower_samplers_as_deref() pass looks up the variable bindings in the shader program's uniform storage. But ARB programs and built-in shaders don't have a gl_shader_program, and uniform storage doesn't exist. In this case, we simply skip that lookup, and trust var->data.binding to be set correctly by whoever created the shader. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:22 -08:00
Kenneth Graunke	f45dd6d31b	st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048 Piglit's vp-max-array test creates a vertex program containing a uniform array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB. Mesa will then add additional state-var parameters for things like the MVP matrix. radeonsi currently exposes a value of 4096, derived from constant buffer upload size. This means the array will have 4096 elements, and the extra MVP state-vars would get a prog_src_register::Index of over 4096. Unfortunately, prog_src_register::Index is a signed 13-bit integer, so values beyond 4096 end up turning into negative numbers. Negative source indexes are only valid for relative addressing, so this ends up generating illegal IR. In prog_to_nir, this would cause an out of bounds array access. st_mesa_to_tgsi checks for a negative value, assumes it's bogus, and remaps it to parameter 0 in order to get something in-range. This isn't right - instead of reading the MVP matrix, it would read the first element of the vertex program's large array. But the test only checks that the program compiles, so we never noticed that it was broken. This patch limits the size of the program limits, with the understanding that we may need to generate additional state-vars internally. i965 has exposed 1024 for this limit for years, so I don't expect lowering it to 2048 will cause any practical problems for radeonsi or other drivers. Fixes vp-max-array with prog_to_nir.c. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:09:51 -08:00
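The wrap-around described in the commit above can be reproduced with plain integer arithmetic. This is a minimal sketch, assuming a two's-complement 13-bit field; `wrap_to_signed_index` is a hypothetical helper, not a Mesa function.

```c
#include <stdint.h>

/* Width of the signed index field described in the commit message. */
#define INDEX_BITS 13

/* Interpret the low INDEX_BITS bits of 'value' as a two's-complement integer. */
int wrap_to_signed_index(uint32_t value)
{
    const uint32_t mask = (1u << INDEX_BITS) - 1;   /* 0x1fff */
    const uint32_t sign = 1u << (INDEX_BITS - 1);   /* 0x1000 */
    uint32_t low = value & mask;

    /* XOR then subtract sign-extends without implementation-defined casts. */
    return (int)(low ^ sign) - (int)sign;
}
```

A signed 13-bit field can hold -4096 through 4095, so an index of 4096 already comes back as -4096 under this model, while every index below a 2048-entry limit plus a modest number of extra state-vars stays positive.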
Francisco Jerez	374eb3cd6f	intel/dump_gpu: Disambiguate between BOs from different GEM handle spaces. This fixes a rather astonishing problem that came up while debugging an issue in the Vulkan CTS. Apparently the Vulkan CTS framework has the tendency to create multiple VkDevices, each one with a separate DRM device FD and therefore a disjoint GEM buffer object handle space. Because the intel_dump_gpu tool wasn't making any distinction between buffers from the different handle spaces, it was confusing the instruction state pools from both devices, which happened to have the exact same GEM handle and PPGTT virtual address, but completely different shader contents. This was causing the simulator to believe that the vertex pipeline was executing a fragment shader, which didn't end up well. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-11 12:27:22 -08:00
Kristian H. Kristensen	e404c6879d	freedreno/a6xx: Fall back to masked RGBA blits for depth/stencil The blitter doesn't seem to have a write mask, so for depth only and stencil only blits to Z24S8 we cast the Z24S8 buffer to an RGBA UNORM8 buffer and fall back to pipeline blits with corresponding write mask. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8 dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8 Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	f03ba155d5	freedreno/a6xx: Add format argument to fd6_tex_swiz() We need to allow overriding the format with that of the image or sampler view, so we can't take it from the resource in fd6_tex_swiz(). Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	bc8c813d5a	freedreno/a6xx: Support y-inverted blits The src coordinates are s24.8. For an inverted blit that ends at y=0 we need to program -1 for sy2, so we need to handle negative values correctly. Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_color dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_color Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
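As a rough illustration of why negative values matter in the commit above, the sketch below encodes a whole-pixel coordinate as signed 24.8 fixed point. The layout (8 fractional bits, two's complement in a 32-bit word) and the helper name `to_s24_8` are assumptions made for this example, not the register definition or the driver code.

```c
#include <stdint.h>

/*
 * Encode a whole-pixel coordinate as signed 24.8 fixed point.
 * Assumed layout: 8 fractional bits, two's complement in 32 bits.
 * The caller must keep 'pixels' within the 24-bit integer range.
 */
uint32_t to_s24_8(int32_t pixels)
{
    /* Multiply rather than left-shift: shifting a negative value is undefined. */
    return (uint32_t)(pixels * 256);
}
```

Under these assumptions a coordinate of -1 becomes 0xffffff00, so a code path that treats the coordinate as unsigned or masks off the high bits would program a large positive value instead.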
Kristian H. Kristensen	03a01e5d23	freedreno/a6xx: Support some depth/stencil blits on blitter We can rewrite almost all depth stencil blits to various red-only blits. The exception is depth-only or stencil-only blits into z24s8 combined depth stencil buffer. We can fall back for depth-only, but stencil-only remains broken. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	e9592da2b4	freedreno/a6xx: Move blit check so as to restore comment The explanation for the compressed format check is broken across two comments: /* We can blit if both or neither formats are compressed formats... */ /* ... but only if they're the same compression format. */ but the ok_format() checks were inserted between, breaking up the flow of the sentence. Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	d2639f2eac	freedreno: Don't tell the blitter what it can't do Call ctx->blit() and let it reject blits it can't do instead of giving up on stencil blits and blits u_blitter can't do. Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	8cf1303698	freedreno: Consolidate u_blitter functions in freedreno_blitter.c Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	701d30dda8	freedreno/a6xx: Combine emit_blit and fd6_blit Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	6d1a7bdba3	freedreno/a6xx: Use the right resource for separate stencil stride Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	24b4172375	freedreno: Log number of draw for sysmem passes Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	a201cb157d	freedreno/a6xx: Drop render condition check in blitter We already check earlier in the call chain in fd_blit(). glBlitFramebuffer always sets render_condition_enable and thus we would never try the blitter path for that. Now that we get all of dEQP-GLES3.functional.fbo.blit.conversion.* down this path, it turns out that the fail_if(info->mask != util_format_get_mask(info->src.format)); fail_if(info->mask != util_format_get_mask(info->dst.format)); conditions weren't accurate. util_format_get_mask() returns PIPE_MASK_RGBA for any format with any color channels, while info->mask is the exact set of channels to blit. So we reject things we could blit - for example, PIPE_FORMAT_R16G16_FLOAT where info->mask is RG while util_format_get_mask() returns RGBA - and accept things we can't. It turns out that the blitter is happy to blit a different number of channels, but fails to blit formats with different numerical formats and srgb formats. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	4f7a9c23ed	freedreno/a6xx: regen headers Update for a6xx.xml.h to incorporate a few new bits and changes to blit src rect coordinate types. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-02-11 12:26:21 -08:00
Leo Liu	a0a52a0367	st/va/vp9: set max reference as default of VP9 reference number If there is no information about number of render targets Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-11 14:44:16 -05:00
Leo Liu	21cdb828a3	st/va: fix the incorrect max profiles report Add "PIPE_VIDEO_PROFILE_MAX" to the enum, so that the reported count stays correct when more profiles are added in the future. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-11 14:44:16 -05:00
Guttula, Suresh	2cf2a56739	st/va: Add support for indirect manner by returning VA_STATUS_ERROR_OPERATION_FAILED Based on the VA spec, vaDeriveImage() returns VA_STATUS_ERROR_OPERATION_FAILED if the driver doesn't support the internal surface formats. Currently vaDeriveImage() fails for non-contiguous planes, and the operation-failed error string is required to support the indirect manner, i.e. vaCreateImage()+vaPutImage(), in case vaDeriveImage() fails with VA_STATUS_ERROR_OPERATION_FAILED. This patch notifies the client that the operation failed with the proper error string, so that the client will fall back to vaCreateImage()+vaPutImage(). v2: updated commit message based on VA spec. Signed-off-by: suresh guttula <suresh.guttula@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-02-11 14:44:16 -05:00
Marek Olšák	114a899cc8	winsys/amdgpu: cs_check_space sets the minimum IB size for future IBs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	766e920cdb	winsys/amdgpu: clean up IB buffer size computation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	8c1cb393fc	winsys/amdgpu: remove occurrence of INDIRECT_BUFFER_CONST Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	881ef14b32	winsys/amdgpu: use a separate fence list for syncobjs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	9f00123d51	winsys/amdgpu: unify fence list code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	ddfe209a0d	winsys/amdgpu: don't drop manually added fence dependencies wow, it's hard to believe that fence and syncobjs dependencies were ignored. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	61c678d4bc	radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:06 -05:00
Marek Olšák	4522f01d4e	gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:04 -05:00
Jason Ekstrand	9e6a6ef0d4	nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks When nir_rematerialize_derefs_in_use_blocks_impl was first written, I attempted to optimize things a bit by not bothering to re-materialize the sources of deref instructions, figuring that the final caller would take care of that. However, in the case of more complex deref chains where the first link or two lives in block A and then another link and the load/store_deref intrinsic live in block B, it doesn't work. The code in rematerialize_deref_in_block looks at the tail of the chain, sees that it's already in block B and skips it, not realizing that part of the chain also lives in block A. The easy solution here is to just rematerialize deref sources of deref instructions as well. This may potentially lead to a few more deref instructions being created, but the conditions required for that to actually happen are fairly unlikely and, thanks to the caching, it's all linear time regardless. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603 Fixes: `7d1d1208c2` "nir: Add a small pass to rematerialize derefs per-block" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-11 10:57:23 -06:00
Jason Ekstrand	fd77606b5b	intel/fs: Use enumerated array assignments in fb read TXF setup It's more clear and means we don't have to update the array every time we add an optional texture instruction argument Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-11 10:57:09 -06:00
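The "enumerated array assignments" mentioned in the commit above are C99 designated initializers indexed by enum values. The sketch below shows the idea with hypothetical names (`enum tex_src`, `setup_txf_sources`); it does not reproduce the actual i965 source slots.

```c
/* Hypothetical source slots for illustration; not the i965 enum. */
enum tex_src {
    TEX_SRC_COORDINATE,
    TEX_SRC_LOD,
    TEX_SRC_SAMPLE_INDEX,
    TEX_SRC_COUNT
};

/* Fill 'srcs' by slot name; slots that are not mentioned default to 0. */
void setup_txf_sources(int srcs[TEX_SRC_COUNT], int coord, int sample)
{
    const int init[TEX_SRC_COUNT] = {
        [TEX_SRC_COORDINATE]   = coord,
        [TEX_SRC_SAMPLE_INDEX] = sample,
    };

    for (int i = 0; i < TEX_SRC_COUNT; i++)
        srcs[i] = init[i];
}
```

Because each entry is addressed by name, adding a new optional slot to the enum leaves this initializer untouched, whereas a positional initializer would have to be extended every time.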
Michel Dänzer	d6c55f6c62	gitlab-ci: Re-use docker image from the main repo in forked repos Instead of generating it from scratch in each forked repo. This should save time, energy and storage. (The xserver & xf86-video-amdgpu CI scripts do basically the same) v2: * Hardcode "mesa" instead of using $CI_PROJECT_NAME, to avoid breakage if the project name is changed after forking (Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-11 12:24:31 +01:00
Ilia Mirkin	cc79a1483f	nvc0: we have 16k-sized framebuffers, fix default scissors For some reason we don't use view volume clipping by default, and use scissors instead. These scissors were set to an 8k max fb size, while the driver advertises 16k-sized framebuffers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2019-02-10 23:36:23 -05:00
Alyssa Rosenzweig	85e2bb58ca	panfrost: Specify supported draw modes per-context Midgard has native support for QUADS and POLYGONS; Bifrost seemingly does not. Thus, Midgard generally skips prim_convert whereas Bifrost needs the pass; this patch allows the setting of allowed primitives to occur on a per-context basis (for runtime hardware selection). v2: Use (POLYGONS + 1) instead of LINES_ADJACENCY. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Robert Foss <robert.foss@collabora.com>	2019-02-11 03:23:00 +00:00
Dave Airlie	90c6880df7	radv: remove alloc parameter from pipeline init clang points out this isn't used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-11 10:04:40 +10:00
Dave Airlie	a523ae0cac	radv/llvm: initialise passes member. Fixes coverity warning Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-11 08:59:02 +10:00
Dave Airlie	d2e82c2682	glsl: glsl to nir fix uninit class member. The constructor should init this to NULL Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-11 08:55:07 +10:00
Alyssa Rosenzweig	2458797256	panfrost: Elucidate texture op scheduling comment Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:51:57 +00:00
Alyssa Rosenzweig	658961aec3	panfrost: Remove speculative if 0'd format bit code Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:51:51 +00:00
Alyssa Rosenzweig	b1213a3947	panfrost: Remove if 0'd dead code Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:50:35 +00:00
Alyssa Rosenzweig	e91e1786c5	panfrost: Add kernel-agnostic resource management Various methods relating to resource management were previously marked as kernel-specific, forcing them to stay downstream in the vendor overlay and eventually be duplicated for DRM code. This patch adds back this code in kernel-neutral space, allowing for code sharing and minimising the diff to downstream. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:44:32 +00:00
Alyssa Rosenzweig	4ed23b193a	panfrost: Don't hardcode number of nir_ssa_defs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:42:52 +00:00
Alyssa Rosenzweig	97dcad8d3e	panfrost: Clean-up one-argument passing quirk Most Midgard instructions take two-arguments logically; there are always two arguments at the assembly level. For the few instructions that take only a single argument, generally the second argument slot is unused, with a zero inline constant occupying the space. fmov/imov are the exception, where the first argument is filled with r24 and the logical argument is in the second slot. Previously, these constraints were handled by a delicate, buggy series of hacks. This commit removes these hacks. Instead, we look at the logical number of arguments (from NIR), switching between two argument and one-argument-one-zero style. We then introduce a quirk for the flipped style, which applies to fmov/imov. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:41:25 +00:00
Karol Herbst	49397a3c84	glsl_type: initialize offset and location to -1 for glsl_struct_field Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-09 13:52:15 +01:00
Kenneth Graunke	55e00a2ea8	nouveau: Silence unhandled cap warnings Nouveau apparently uses the u_screen helper but prints a warning in the default case, so running any GL program would start grumbling. Fixes: `8fa54bc549` gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit. Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-02-08 16:26:00 -08:00
Caio Marcelo de Oliveira Filho	ee670d09af	intel/compiler: use 0 as sampler in emit_mcs_fetch The sampler will be ignored since the underlying 'ld_mcs' operation won't use it, so just fill the field with 0 instead of the texture to make it clearer that's the case. This will also avoid is_high_sampler() to kick in unnecessarily, in case we are using the operation for a texture with index >= 16. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 14:51:56 -08:00
Eric Engestrom	e8e544436c	wsi: query the ICD's max dimensions instead of hard-coding them anv and radv both happened to already return 2^14 for these, but querying the ICD is safer and will help if vdreno (or whatever it's called) doesn't have the same max. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 18:54:57 +00:00
Ian Romanick	b031c64349	nir: Convert a bcsel with only phi node sources to a phi node v2: Remove the original ALU instruction after all of its readers are modified to read the new ALU instruction. v3: Fix an issue where a bcsel that may not be executed on a loop iteration due to a break statement is converted to a phi (and therefore incorrectly "executed"). Noticed by Tim. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109216 Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	0881e90c09	nir: Split ALU instructions in loops that read phis A single shader in Unigine Superposition is affected by this change. A single iadd is moved to the end of a loop. This iadd is involved in a complex set of logic to terminate the loop, and an extra mov instruction is inserted. This shader really needs the optimization suggested by bugzilla #94747, and I expect that to make this tiny regression go away. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15047543 -> 15047545 (<.01%) instructions in affected programs: 565 -> 567 (0.35%) helped: 0 HURT: 2 total cycles in shared programs: 369977253 -> 369978253 (<.01%) cycles in affected programs: 127910 -> 128910 (0.78%) helped: 0 HURT: 2 v2: Skip nir_op_vec{2,3,4} and nir_op_[fi]mov instructions to avoid infinite optimization loops. Remove the original ALU instruction after all of its readers are modified to read the new ALU instruction. v3: Extend to the more general case. If the prev-block value from the phi is not undef, this means the ALU instruction has to be duplicated in both the prev-block and the continue-block. Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	0c0c69729b	nir: Select phi nodes using prev_block instead of continue_block This simplifies some changes coming later. Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	8d8f80af3a	nir: Refactor code that checks phi nodes in opt_peel_loop_initial_if This will be used in a couple more places soon. The function name is... horribly long. Neither Matt nor I could think of any thing that was shorter and still more descriptive than "is_phi_foo". I'm willing to entertain suggestions. Fixes: `8fb8ebfbb0` ("intel/compiler: More peephole select") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	4d65d2b12e	nir: Document some fields of nir_loop_terminator Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	28ef5bb74c	intel/compiler: Silence warning about value that may be used uninitialized For some reason, this warning only occurs for me in release builds. In file included from src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:25:0: src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c: In function ‘brw_nir_lower_mem_access_bit_sizes’: src/compiler/nir/nir_builder.h:501:26: warning: ‘src_swiz[2]’ may be used uninitialized in this function [-Wmaybe-uninitialized] alu_src.swizzle[i] = swiz[i]; ~~~~~~~~~~~~~~~~~~~^~~~~~~~~ src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:225:16: note: ‘src_swiz[2]’ was declared here unsigned src_swiz[4]; ^~~~~~~~ Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Ian Romanick	78169870e4	nir: Silence zillions of unused parameter warnings in release builds Fixes: `cd56d79b59` "nir: check NIR_SKIP to skip passes by name" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-08 10:37:06 -08:00
Eric Engestrom	3dc5faf523	gitlab-ci: workaround docker bug for users with uppercase characters CI_REGISTRY_IMAGE == lower($CI_REGISTRY/$CI_PROJECT_PATH) Suggested-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-08 17:45:57 +00:00
Andrii Simiklit	2b7d5c3217	i965: consider a 'base level' when calculating width0, height0, depth0 I guess that when calculating the width0, height0, depth0 to use for the function 'intel_miptree_create', we need to consider the 'base level', as is done in the 'intel_miptree_create_for_teximage' function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107987 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-07 21:40:50 -08:00
Timothy Arceri	26aa460940	nir: rewrite varying component packing There are a number of reasons for the rewrite. 1. Adding support for packing tess patch varyings in a sane way. 2. Making use of qsort allowing the code to be much easier to follow. 3. Fixes a bug where different interp types caused component packing to be skipped for all varyings in some scenarios. 4. Allows us to add a crude live range analysis for deciding which components should be packed together. This support can optionally be added in a future patch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Timothy Arceri	2f53260417	nir: add is_packing_supported_for_type() helper This will be used in the following patches to determine if we support packing the components of a varying. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Timothy Arceri	e041123841	nir: add glsl_type_is_32bit() helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Timothy Arceri	7b01d5c354	nir: add support for marking used patches when packing varyings This adds support needed for marking the varyings as used but we don't actually support packing patches in this patch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Timothy Arceri	d0af13cfb4	st/glsl_to_nir: call nir_remove_dead_variables() after lowering local indirects Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Timothy Arceri	d0abbaa528	util: move BITFIELD macros to util/macros.h Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 02:54:56 +00:00
Karol Herbst	cbd1ad6165	st/mesa: require RGBA2, RGB4, and RGBA4 to be renderable If the driver does not support rendering to these formats but does support texturing, we can end up in incompatibilities between textures and renderbuffers that are then copied to. Fixes KHR-GL45.copy_image.functional on nvc0 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-07 21:51:45 -05:00
Karol Herbst	6010d7b8e8	gallium: add PIPE_CAP_MAX_VARYINGS Some NVIDIA hardware can accept 128 fragment shader input components, but only have up to 124 varying-interpolated input components. We add a new cap to express this cleanly. For most drivers, this will have the same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader. Fixes KHR-GL45.limits.max_fragment_input_components Signed-off-by: Karol Herbst <karolherbst@gmail.com> [imirkin: rebased, improved docs/commit message] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-07 21:51:45 -05:00
Alyssa Rosenzweig	738346fa23	kmsro: Silence warning if missing Regardless of whether the build uses kmsro, kmsro is the default driver descriptor when the static loader is used. Thus, in an edge case where the static loader is used, no static targets are loaded, and kmsro is not compiled, a spurious warning is printed. There's no harm in executing the stub function in this case, but it's not "an error" to not have kmsro in the build; the driver missing warning should not be printed for kmsro. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-02-08 01:48:37 +00:00
Lionel Landwerlin	f1bcb9be46	radv: assert that colorAttachment is valid for CmdClearAttachment This partially reverts a change from `b7a93cbded` ("radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment") which fixed actual issues but also started to accept invalid values for the colorAttachment field. This change asserts that the field is valid for the current pass. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b7a93cbded` ("radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-08 00:18:16 +00:00
Lionel Landwerlin	a934a3d124	anv: assert that color attachments are valid This reverts commit `d76e777988`. Let's make it obvious that there is an application issue if it tries to access an attachment that doesn't exist in the current pass. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d76e777988` ("anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-08 00:18:16 +00:00
Dave Airlie	3c153b3982	docs: update qbo support for virgl Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-02-08 09:06:36 +10:00
Eric Engestrom	6e0effbd34	travis: fix osx make build This variable was removed in commit `087af992a2` "travis: remove unused linux code path" because it looked like it was only used by the Linux build. Turns out I was wrong, so let's restore it. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-02-07 20:14:14 +00:00
Jason Ekstrand	eaf5e4a24d	README: Drop the badges from the readme They have been added as badges directly to the GitLab project. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-07 12:46:17 -06:00
Eric Engestrom	358d0cfab2	driconf: drop unused macro Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-07 13:40:26 +00:00
Eric Engestrom	00be88aab8	meson: add script to print the options before configuring a builddir Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-07 13:22:41 +00:00
Alyssa Rosenzweig	d43ec104b7	panfrost: Include glue for out-of-tree legacy code In addition to the DRM interface in active development, for legacy kernels Panfrost has a small, optional, out-of-tree glue repository. For various reasons, this legacy code should not be included in Mesa proper, but this commit allows it to coexist peacefully with upstream Panfrost. If the nondrm repo is cloned/symlinked to the directory `src/gallium/drivers/panfrost/nondrm`, legacy functionality will be built. Otherwise, the driver will build normally, though a runtime error message will be printed if a legacy kernel is detected. This workaround is icky, but it allows a nearly-upstream Panfrost to work on real hardware, today. Ideally, this patch will be reverted when the Panfrost kernel module is mature and we drop legacy support. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-07 01:58:32 +00:00
Alyssa Rosenzweig	7da251fc72	panfrost: Check in sources for command stream This patch includes the command stream portion of the driver, complementing the earlier compiler. It provides a base for future work, though it does not integrate with any particular winsys. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-07 01:57:50 +00:00
Alyssa Rosenzweig	8f4485ef1a	panfrost: Use u_pipe_screen_get_param_defaults Switching to the defaults function cleans up pan_screen.h markedly and futureproofs for when new PIPE_CAPs are added. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Suggested-by: Eric Anholt <eric@anholt.net>	2019-02-07 01:57:19 +00:00
Alyssa Rosenzweig	8f9f99d84d	kmsro: Move DRM entrypoints to shared block As kmsro allows an essentially mix-and-match hodgepodge of display drivers and renderonly GPUs, it doesn't make sense to couple the display driver entrypoint definition with the driver. Instead, we move all kmsro entrypoints to a shared kmsro block at the end (avoiding clutter and distraction since this list may snowball in the future). v2: Alphabetize driver list. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-07 01:50:16 +00:00
Rhys Perry	5b6f522fc2	nvc0: add compute invocation counter The strategy is to keep a CPU-side counter of the direct invocations, and a GPU-side counter of the indirect invocations, and then add them together for queries. The specific technique is a macro which multiplies a list of integers together and accumulates the product into SCRATCH registers held inside of the context. Another macro will read those values out and add them to the passed-in cpu-side counter to be stored in a query buffer the same way that all the other statistics are stored. Original implementation by Rhys Perry, redone by Ilia Mirkin to use the SCRATCH temporaries. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-02-06 19:35:57 -05:00
Karol Herbst	cce4955721	gm107/ir: add fp64 rsq Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Karol Herbst	815a8e59c6	gm107/ir: add fp64 rcp Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Karol Herbst	12669d2970	gk104/ir: Use the new rcp/rsq in library [imirkin: add a few more "long" prefixes to safen things up] Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Boyan Ding	656ad06051	gk110/ir: Use the new rcp/rsq in library v2: (Karol Herbst <kherbst@redhat.com>) * fix Value setup for the builtins Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> [imirkin: track the fp64 flag when switching ops to calls] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Boyan Ding	7937408052	gk110/ir: Add rsq f64 implementation Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Boyan Ding	04593d9a73	gk110/ir: Add rcp f64 implementation Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	6adb9b38bf	nvc0: stick zero values for the compute invocation counts Not quite perfect, but at least we don't end up with random values in the query buffer. Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	e00799d3dc	nv50,nvc0: use condition for occlusion queries when already complete For the NO_WAIT variants, we would jump into the ALWAYS case for both nested and inverted occlusion queries. However if the query had previously completed, the application could reasonably expect that the render condition would follow that result. To resolve this, we remove the nesting distinction which unnecessarily created an imbalance between the regular and inverted cases (since there's no "zero" condition mode). We also use the proper comparison if we know that the query has completed (which could happen as a result of an earlier get_query_result call). Fixes KHR-GL45.conditional_render_inverted.functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	162352e671	nvc0: fix 3d images on kepler Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d tiling, they just need the correct inputs. Supply them. We also have to deal with the case where a 2d "layer" of a 3d image is bound. In this case, we supply the z coordinate separately to the shader, which has to optionally treat every 2d case as if it could be a slice of a 3d texture. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	5de5beedf2	nvc0/ir: fix second tex argument after levelZero optimization We used to pre-set a bunch of extra arguments to a texture instruction in order to force the RA to allocate a register at the boundary of 4. However with the levelZero optimization, which removes a LOD argument when it's uniformly equal to zero, we undid that logic by removing an extra argument. As a result, we could end up with insufficient alignment on the second wide texture argument. Instead we switch to a different method of achieving the same result. The logic runs during the constraint analysis of the RA, and adds unset sources as necessary right before being merged into a wide argument. Fixes MISALIGNED_REG errors in Hitman when run with bindless textures enabled on a GK208. Fixes: `9145873b15` ("nvc0/ir: use levelZero flag when the lod is set to 0") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	4443b6ddf2	nvc0/ir: always use CG mode for loads from atomic-only buffers Atomic operations don't update the local cache, which means that we would have to issue CCTL operations in order to get the updated values. When we know that a buffer is primarily used for atomic operations, it's easier to just avoid the caching at that level entirely. The same issue persists for non-atomic buffers, which will have to be fixed separately. Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	399215eb7a	nvc0: add support for handling indirect draws with attrib conversion The hardware does not natively support FIXED and DOUBLE formats. If those are used in an indirect draw, they have to be converted. Our conversion tries to be clever about only converting the data that's needed. However for indirect, that won't work. Given that DOUBLE or FIXED are highly unlikely to ever be used with indirect draws, read the indirect buffer on the CPU and issue draws directly. Fixes the failing dEQP-GLES31.functional.draw_indirect.random.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Kristian H. Kristensen	0f7a20e91e	freedreno/a6xx: Use tiling for all resources We used to restrict this to just PIPE_BIND_SAMPLER_VIEW resources, but most resources benefit from being tiled. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-02-06 15:28:48 -08:00
Kristian H. Kristensen	357ea7da51	freedreno/a6xx: Emit blitter dst with OUT_RELOCW We're writing to the bo and the kernel needs to know for fd_bo_cpu_prep() to work. Fixes: `f93e431272` ("freedreno/a6xx: Enable blitter") Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-06 15:22:25 -08:00
Bas Nieuwenhuizen	13ab63bb62	radv: Implement VK_EXT_buffer_device_address. v2: Also update the release notes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:37:38 +01:00
Bas Nieuwenhuizen	3259e7b036	radv: Do not use the bo list for local buffers. The kernel already does it for us. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:19 +01:00
Bas Nieuwenhuizen	8a15950211	amd/common: Implement global memory accesses. Needed for VK_EXT_buffer_device_address. The pointers are implemented as i8*, since I could not figure out how to emulate setting struct offsets in LLVM based on the SPIR-V offsets (and more weird stuff like row major matrices). Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:11 +01:00
Bas Nieuwenhuizen	5703ecf651	amd/common: Do not use 32-bit loads for shared memory. We use a straight glsl->llvm type conversion so types should already be right. Also even though the writemasks were changed we were not actually doing 32-bit things, so this fails miserably. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:06 +01:00
Bas Nieuwenhuizen	8d1718590b	amd/common: handle nir_deref_cast for shared memory from integers. Can happen e.g. after a phi. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:02 +01:00
Bas Nieuwenhuizen	830fd0efc1	amd/common: Handle nir_deref_type_ptr_as_array for shared memory. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:58 +01:00
Bas Nieuwenhuizen	dbdb44d575	amd/common: Fix stores to derefs with unknown variable. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:54 +01:00
Bas Nieuwenhuizen	3c24fc64c7	amd/common: Use correct writemask for shared memory stores. The check was for 1 bit being set, which is clearly not what we want. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:49 +01:00
Bas Nieuwenhuizen	00253ab2c4	radv: Fix the shader info pass for not having the variable. For example with VK_EXT_buffer_device_address or VK_KHR_variable_pointers. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:45 +01:00
Bas Nieuwenhuizen	58c8dadd32	amd/common: Implement ptr->int casts in ac_to_integer. For the implicit casts inherent in nir. This should probably have been done for shared memory for VK_KHR_variable_pointers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:40 +01:00
Bas Nieuwenhuizen	e00d9a9a72	amd/common: Add gep helper for pointer increment. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:36 +01:00
Bas Nieuwenhuizen	39ab4e12f7	radv: Only look at pImmutableSamples if the descriptor has a sampler. Equivalent of ANV patch `c7f4a2867c` CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:32 +01:00
Eric Engestrom	40b53a7203	xvmc: fix string comparison Fixes: `6fca18696d` "g3dvl: Update XvMC unit tests." Cc: Younes Manton <younes.m@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 18:15:43 +00:00
Eric Engestrom	110a6e1839	xvmc: fix string comparison Fixes: `c7b65dcaff` "xvmc: Define some Xv attribs to allow users to specify color standard and procamp" Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 18:15:43 +00:00
Eric Engestrom	ba26bc4ef0	gitlab-ci: add meson glvnd build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	5459900f38	travis: remove unused scons code path Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	087af992a2	travis: remove unused linux code path Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	73275147fe	gitlab-ci: add make Gallium ST Other build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	360a7bfbe9	gitlab-ci: add make Gallium ST Clover LLVM-7 build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	39315a747b	gitlab-ci: add make Gallium ST Clover LLVM-6.0 build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	e80f88c48a	gitlab-ci: add make Gallium ST Clover LLVM-5.0 build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	cc85f50029	gitlab-ci: add make Gallium ST Clover LLVM-4.0 build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	984e295500	gitlab-ci: add make Gallium ST Clover LLVM-3.9 build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	d0dff24cbb	gitlab-ci: add make Gallium Drivers "Other" build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	055cfbc6de	gitlab-ci: add make Gallium Drivers RadeonSI build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	7b26a19f31	gitlab-ci: add make Gallium Drivers SWR build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	bbdc563c11	gitlab-ci: add make loaders/classic DRI build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	f33517bda7	gitlab-ci: add meson gallium ST "Other" build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	8dab707ab8	gitlab-ci: add meson gallium ST Clover (LLVM 7.0) build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	8744ac0904	gitlab-ci: add meson gallium ST Clover (LLVM 6.0) build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	b5a70af062	gitlab-ci: add meson gallium ST Clover (LLVM 5.0) build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	d407ead204	gitlab-ci: add meson gallium "other drivers" build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	06e8f1961b	gitlab-ci: add meson gallium RadeonSI build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	360c814bfe	gitlab-ci: add meson gallium SWR build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	d73265e20d	gitlab-ci: add meson loader/classic DRI build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	6a19ec9daa	gitlab-ci: add scons SWR build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	d4c6d4d5cb	gitlab-ci: add scons llvm 3.5 build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	06b245b438	gitlab-ci: add a scons no-llvm build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	89a7467899	gitlab-ci: add a make vulkan build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	46d23c0a46	gitlab-ci: add a meson vulkan build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Eric Engestrom	329f5cd780	gitlab-ci: add ubuntu container Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 17:56:30 +00:00
Marek Olšák	42a1cd034d	radeonsi: use local ws variable in si_need_dma_space Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	2c4911c652	radeonsi: don't leak an index buffer if draw_vbo fails Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	d72c319867	radeonsi: make allocator_zeroed_memory unmappable and use bigger buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	5068dec5de	radeonsi: clear allocator_zeroed_memory with SDMA so that it can be used in parallel IBs. This also removes the SO_FILLED_SIZE hack. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	7d4c935654	radeonsi: initialize textures using DCC to black when possible Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Jonathan Marek	3361305f57	freedreno: a2xx: fix fast clear Fixes: `912a9c8d` Signed-off-by: Jonathan Marek <jonathan@marek.ca> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 14:34:57 +00:00
Eric Engestrom	54fa5eceae	egl: use coherent variable names `EGLDisplay` variables (the opaque Khronos type) have mostly been consistently called `dpy`, as this is the name used in the Khronos specs. However, `_EGLDisplay` variables (our internal struct) have been randomly called `dpy` when there was no local variable clash with `EGLDisplay`s, and `disp` otherwise. Let's be consistent and use `dpy` for the Khronos type, and `disp` for our struct. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-02-06 11:53:24 +00:00
Alyssa Rosenzweig	a81d5587d6	meson: Remove panfrost from default driver list Until the kernel side matures and the full driver is upstreamed, to avoid end-user surprises, Panfrost should only be built for the adventurous. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-06 02:59:00 +00:00
Eric Anholt	3c08ecf147	v3d: Whitespace consistency fix.	2019-02-05 15:46:42 -08:00
Eric Anholt	940501a446	v3d: Fix copy-propagation of input unpacks. I had a single function for "does this do float input unpacking" with two major flaws: It was missing the most common thing to try to copy propagate an f32 input unpack to (the VFPACK to an FP16 render target) along with several other ALU ops, and also would try to propagate an f32 unpack into a VFMUL which only does f16 unpacks. instructions in affected programs: 659232 -> 655895 (-0.51%) uniforms in affected programs: 132613 -> 135336 (2.05%) and a couple of programs increase their thread counts. The uniforms hit appears to be a pattern in generated code of doing (-a >= a) comparisons, which when a is abs(b) can result in the abs instruction being copy propagated once but not fully DCEed.	2019-02-05 15:46:04 -08:00
Eric Anholt	e5c6938590	v3d: Fix input packing of .l for rounding/fdx/fdy. Avoids a regression in dEQP-GLES3.functional.shaders.derivate.fwidth.texture.* once we start copy-propagating more input packs.	2019-02-05 15:45:23 -08:00
Eric Anholt	1a4170952d	v3d: Fix pack/unpack of VFPACK operand unpacks. We want to be able to copy propagate our texture unpacks into the vfpack.	2019-02-05 15:45:23 -08:00
Eric Anholt	d0fdbd4211	v3d: Fix dumping of shaders with alpha test. We were trying to print a NULL entry from the table.	2019-02-05 15:42:14 -08:00
Eric Anholt	bdef17b052	v3d: Store the actual mask of color buffers present in the key. If you only bound rt 1+, we'd still emit a write to the rt0 that isn't present (noticed while debugging an ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero regression in another change).	2019-02-05 15:42:04 -08:00
Eric Anholt	17a649af05	v3d: Fix precompile of FRAG_RESULT_DATA1 and higher outputs. I was just leaving the other MRT targets than DATA0 out, by accident.	2019-02-05 15:35:49 -08:00
Kristian H. Kristensen	ba4b22011a	st/nir: Use src/ relative include path for autotools Fixes: `cdc53fa81c` Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-05 14:19:51 -08:00
Kenneth Graunke	8fa54bc549	gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit. Iris would like to use compact arrays for tesslevels and clip/cull distances. radeonsi will likely want to switch to these at some point, since it'll be necessary for GL_ARB_gl_spirv support, but it's not ready for them just yet. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	cf731564e6	st/nir: Call nir_lower_clip_cull_distance_arrays(). Today, st always sets LowerCombinedClipCullDistance, causing the GLSL IR lowering to run, giving us vec4[2] arrays. I would like to disable this and instead run the NIR lowering so that we get compact float[] arrays instead. Calling the new pass is a noop if the GLSL IR pass has already run, so it's safe to call the pass unconditionally. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	15c6902117	nir: Avoid splitting compact arrays into per-element variables. Compact arrays are used for special variables like clip and cull distances, or tessellation levels. Drivers using compact arrays assume that these values will always be actual arrays. We don't want to turn a float[1] gl_CullDistance into a single float; that would confuse drivers. Today, i965 uses compact arrays, and Gallium drivers use nir_lower_io_arrays_to_elements, so we haven't had any overlap that would demonstrate the issue. Iris will use both. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	ba9dcc80fb	nir: Avoid clip/cull distance lowering multiple times. A couple places in st/nir assume that cull distances have been lowered away, so it will need to call this lowering pass for drivers which opt out of the GLSL IR lowering. The Intel backend also calls this pass, for i965 and anv. We need to only do it once. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	5730364d69	nir: Bail on clip/cull distance lowering if GLSL IR already did it. We have a GLSL IR pass to convert clip/cull distance float[] arrays into vec4[2] arrays. In `ff281e6204`, we attempted to skip this pass if the GLSL IR lowering had already run. But, that code was not quite right, as we forgot to strip away the per-vertex IO array layer for geometry and tessellation shader varyings. If the GLSL IR pass has run, the variables will not be marked as "compact". So we can simply check that and bail. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	ef99f4c8d1	compiler: Mark clip/cull distance arrays as compact before lowering. nir_lower_clip_cull_distance_arrays() marks the combined clip/cull distance array as compact. However, when translating in from GLSL or SPIR-V, we were not marking the original float[] arrays as compact. We should do so. That way, we can detect these corner cases properly. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	3327c93510	nir: Record info->fs.pixel_center_integer in lower_system_values radeonsi uses a system value for gl_FragCoord rather than an input var. These get translated into load_frag_coord NIR intrinsics, which lose the pixel_center_integer and origin_upper_left decorations. To cope with this, Tim added a shader_info field for pixel_center_integer, and made glsl_to_nir set it accordingly. prog_to_nir also needs to handle these fragcoord conventions. Instead of duplicating the logic to set the info field, just move it to nir_lower_system_values so it'll happen regardless of who makes the NIR. (For what it's worth, we don't need an info flag for origin_upper_left, because radeonsi lowers origin conventions in nir_lower_wpos_ytransform before nir_lower_system_values destroys the variable and qualifiers.) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:51:52 -08:00
Kenneth Graunke	536abd453b	program: Extend prog_to_nir to handle system values. Some drivers, such as radeonsi, use a system value for gl_FragCoord rather than an input variable. In this case, our Mesa IR will have a PROGRAM_SYSTEM_VALUE register, which we need to translate. This makes prog_to_nir work for Gallium drivers which expose the PIPE_CAP_TGSI_FS_POSITION_IS_SYSVAL capability bit. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:51:51 -08:00
Kenneth Graunke	fa38ca25f6	program: Use u_bit_scan64 in prog_to_nir. We can simply iterate the bits rather than using util_last_bit and checking each one up until that point. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:51:50 -08:00
Kenneth Graunke	a01ad3110a	st/mesa: Add NIR versions of the PBO upload/download shaders. Acked-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:42 -08:00
Kenneth Graunke	a02349b9e7	st/mesa: Add a NIR version of the OES_draw_texture built-in shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:41 -08:00
Kenneth Graunke	be492affa8	st/mesa: Add NIR versions of the clear shaders. We implement the basic VS and FS, as well as the VS that does layered clears by writing gl_Layer from the vertex shader. Drivers which need a geometry shader for writing layer continue falling back to TGSI, as I didn't need this and so didn't bother implementing it. (We certainly could, however, if people want to add it in the future.) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:39 -08:00
Kenneth Graunke	3f28b245b5	st/mesa: Add NIR versions of the drawpixels Z/stencil fragment shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:37 -08:00
Kenneth Graunke	2d45f9fa25	st/mesa: Add a NIR version of the drawpixels/bitmap VS copy shader. This provides a native NIR version of the DrawPixels/Bitmap passthrough vertex shader. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:36 -08:00
Kenneth Graunke	cdc53fa81c	st/nir: Make new helpers for constructing built-in NIR shaders. The state tracker generates several built-in shaders in order to perform scissored clears, upload/download PBOs, and so on. These are currently constructed using TGSI, using ureg and u_simple_shader. I want to have NIR versions of these shaders, for my Gallium driver that has a NIR backend but no TGSI support. To that end, we'll want a few helpers to help construct simple shaders. This patch adds two new helpers: - st_nir_finish_builtin_shader() takes a manually constructed NIR shader, applies lowering passes (like st_link_nir would do for GLSL), and constructs the pipe_shader_state. - st_nir_make_passthrough_shader() makes a simple passthrough shader, which copies inputs to outputs. This is similar to u_simple_shaders. v2: Set info->fs.untyped_color_outputs for vc4/v3d (thanks Eric!). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:33 -08:00
Kenneth Graunke	4f799264d1	st/nir: Move varying setup code to a helper function. I want to reuse this for built-in shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:02 -08:00
Jason Ekstrand	36734987a5	nir/deref: Drop zero ptr_as_array derefs They are effectively (&x)[0] or *&x which does nothing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-05 15:17:19 -06:00
Eric Anholt	aaef12702f	nir: Move V3D's "the shader was TGSI, ignore FS output types" flag to NIR. Ken's rework of mesa/st builtins to NIR means that we'll have more NIR shaders with color output types that are mismatched with the render target types. Since this is behavior that GLSL doesn't require, add it as a shader_info option so the driver can know that it needs to ignore the FS output's base type in favor of the actual render target's. This prevents needing additional variants in several mesa/st paths (clear, pbo upload, pbo download), given that the driver already has to handle the variants for any TGSI being passed to it (from u_blitter, for example). Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-05 12:12:33 -08:00
Emil Velikov	8943eb8f03	anv: wire up the state_pool_padding test Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `927ba12b53` ("anv/tests: Adding test for the state_pool padding.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-02-05 11:39:36 -08:00
Karol Herbst	a61c388d07	nvc0/ir: replace cvt instructions with add to improve shader performance gives me a performance boost of 0.2% in pixmark_piano on my gk106, gm204 and gp107. reduces the amount of generated convert instructions by roughly 30% in shader-db. v2: only for 32 bit operations move some common code out of the switch handle OP_SAT with modifiers v3: only for registers and const memory rework if clauses merge isCvt into this patch v4: merge isCvt into its use Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-02-05 20:35:38 +01:00
Bart Oldeman	a203eaa4f4	gallium-xlib: query MIT-SHM before using it. When Mesa is compiled for gallium-xlib using e.g. ./configure --enable-glx=gallium-xlib --disable-dri --disable-gbm --disable-egl and is used by an X server (usually remotely via SSH X11 forwarding) that does not support MIT-SHM such as XMing or MobaXterm, OpenGL clients report error messages such as Xlib: extension "MIT-SHM" missing on display "localhost:11.0". ad infinitum. The reason is that the code in src/gallium/winsys/sw/xlib uses MIT-SHM without checking for its existence, unlike the code in src/glx/drisw_glx.c and src/mesa/drivers/x11/xm_api.c. I copied the same check using XQueryExtension, and tested with glxgears on MobaXterm. This issue was reported before here: https://lists.freedesktop.org/archives/mesa-users/2016-July/001183.html Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: <mesa-stable@lists.freedesktop.org>	2019-02-05 17:53:35 +00:00
Alok Hota	6e5eb4ead6	swr/rast: update SWR rasterizer shader stats Primarily refactoring internal stats types Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-05 11:41:25 -06:00
Michel Dänzer	c0a540f320	loader/dri3: Use strlen instead of sizeof for creating VRR property atom sizeof counts the terminating null character as well, so that also contributed to the ID computed for the X11 atom. But the convention is for only the non-null characters to contribute to the atom ID. Fixes: `2e12fe425f` "loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property" Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 17:18:44 +00:00
Jonathan Marek	4f0a3c9f9e	nir: add missing vec opcodes in lower_bool_to_float Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-05 15:34:15 +00:00
Gert Wollny	b0b3de2be7	mesa: release references to image textures when a context is destroyed When a texture is still bound as an image and the context it was bound in is destroyed but not the texture, then the texture will still hold the resource and will not be freed when it is finally destroyed. Hence, release these references when the context is destroyed. This leak was triggered by virglrenderer: https://gitlab.freedesktop.org/virgl/virglrenderer/issues/86 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-05 10:53:41 +00:00
Gert Wollny	f1f3640f6f	radeonsi: release tokens after creating the shader program ureg_get_tokens clears the reference to the tokens, and create_compute_state makes a copy, hence the tokens must be explicitly released. Fixes: Direct leak of 256 byte(s) in 1 object(s) allocated from: #0 0x7ff729cf3c60 in realloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdbc60) #1 0x7ff721b1240c in tokens_expand ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:234 #2 0x7ff721b1c9c0 in get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:257 #3 0x7ff721b1c9c0 in copy_instructions ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2040 #4 0x7ff721b1c9c0 in ureg_finalize ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2090 #5 0x7ff721b1e919 in ureg_get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2167 #6 0x7ff721f8b35a in si_create_dma_compute_shader ../../samba/mesa/src/gallium/drivers/radeonsi/si_shaderlib_tgsi.c:219 #7 0x7ff722043ed9 in si_compute_do_clear_or_copy ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:156 #8 0x7ff7220448d3 in si_clear_buffer ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:247 #9 0x7ff7220350e8 in vi_dcc_clear_level ../../samba/mesa/src/gallium/drivers/radeonsi/si_clear.c:274 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-05 11:50:54 +01:00
Caio Marcelo de Oliveira Filho	8c7c543936	isl: assert that Gen8+ don't have bit6_swizzling v2: Rewrite the condition to more clearly match the comment. (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho	5299c9cbcc	anv: skip bit6 swizzle detection in Gen8+ It is always false on Gen8+. Also, move the variable definition near its use. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho	60740eade3	i965: skip bit6 swizzle detection in Gen8+ It is always false on Gen8+. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 20:44:41 -08:00
Caio Marcelo de Oliveira Filho	51547bbc5a	nir: keep the phi order when splitting blocks All things being equal, it is better to keep the original order. Since the new block is empty, push the phis in order to the tail. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>	2019-02-04 20:41:13 -08:00
Ilia Mirkin	38f542783f	nv50,nvc0: add explicit settings for recent caps Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-04 23:36:46 -05:00
Alyssa Rosenzweig	e67e072637	panfrost: Implement Midgard shader toolchain This patch implements the free Midgard shader toolchain: the assembler, the disassembler, and the NIR-based compiler. The assembler is a standalone inaccessible Python script for reference purposes. The disassembler and the compiler are implemented in C, accessible via the standalone `midgard_compiler` binary. Later patches will use these interfaces from the driver for online compilation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-05 01:26:28 +00:00
Alyssa Rosenzweig	61d3ae6e0b	panfrost: Initial stub for Panfrost driver This patch adds an initial stub for the Gallium driver, containing simple screen functions and the majority of the driver headers but no actual functionality. It further adds the winsys glue for linking in this stub driver via kmsro on Rockchip/Amlogic boards. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-05 01:19:30 +00:00
Marek Olšák	742d6cdb42	radeonsi: fix crashing performance counters (division by zero) Fixes: `e2b9329f17` "radeonsi: move remaining perfcounter code into si_perfcounter.c"	2019-02-04 18:46:25 -05:00
Marek Olšák	a03ecbaeec	radeonsi: handle render_condition_enable in si_compute_clear_render_target	2019-02-04 18:46:25 -05:00
Sonny Jiang	984fd73515	radeonsi: use compute for clear_render_target when possible Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-02-04 18:46:25 -05:00
Kenneth Graunke	dc46317d1a	st/mesa: Set pipe_image_view::shader_access in PBO readpixels. Commit `8b626a22b2` introduced a new pipe_image_view::shader_access field, indicating the access mode specified in the shader. st/mesa's built-in PBO download shader creates a write-only image buffer, so we should flag it as such. Nobody uses this field yet (Iris will), so we don't need to backport this fix to stable branches. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-02-04 11:17:56 -08:00
Rodrigo Vivi	56c3b4971d	intel: Add more PCI Device IDs for Coffee Lake and Ice Lake. Align with kernel commits: 5e0f5a58b167 ("drm/i915/cfl: Adding another PCI Device ID.") 03ca3cf8e9aa ("drm/i915/icl: Adding few more device IDs for Ice Lake") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-04 10:05:25 -08:00
Danylo Piliaiev	64d3b148fe	anv: Fix VK_EXT_transform_feedback working with varyings packed in PSIZ Transform feedback did not set correct SO_DECL.ComponentMask for varyings packed in VARYING_SLOT_PSIZ: gl_Layer - VARYING_SLOT_LAYER in VARYING_SLOT_PSIZ.y gl_ViewportIndex - VARYING_SLOT_VIEWPORT in VARYING_SLOT_PSIZ.z gl_PointSize - VARYING_SLOT_PSIZ in VARYING_SLOT_PSIZ.w Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 15:30:43 +00:00
Danylo Piliaiev	b7a93cbded	radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment From the Vulkan 1.0.98 spec for vkCmdClearAttachments: "If any attachment to be cleared in the current subpass is VK_ATTACHMENT_UNUSED, then the clear has no effect on that attachment." "If the aspectMask member of any element of pAttachments contains VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED, or must be a valid color attachment." "If the aspectMask member of any element of pAttachments contains VK_IMAGE_ASPECT_DEPTH_BIT, then the current subpass' depth/stencil attachment must either be VK_ATTACHMENT_UNUSED, or must have a depth component" "If the aspectMask member of any element of pAttachments contains VK_IMAGE_ASPECT_STENCIL_BIT, then the current subpass' depth/stencil attachment must either be VK_ATTACHMENT_UNUSED, or must have a stencil component" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 14:50:43 +02:00
Danylo Piliaiev	d76e777988	anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment From the Vulkan 1.0.98 spec for vkCmdClearAttachments: "If the aspectMask member of any element of pAttachments contains VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED, or must be a valid color attachment." Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-04 14:49:50 +02:00
Samuel Pitoiset	0d0affad3c	radv: don't flush src stages when dstStageMask == BOTTOM_OF_PIPE Original patch by Fredrik Höglund. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	9efa3405a7	radv: do not set preserveAttachments for internal render passes We don't use that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	80e809d993	radv: drop useless checks when resolving subpass color attachments The Vulkan spec says: "If pResolveAttachments is not NULL, for each resolve attachment that does not have the value VK_ATTACHMENT_UNUSED, the corresponding color attachment must not have the value VK_ATTACHMENT_UNUSED." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	76c17cfd8d	radv: execute external subpass barriers after ending subpasses Outgoing dependencies (ie. external) should happen after the subpass. This doesn't change anything for subpass resolves as we already make sure that attachments are shader readable. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	b482c030f5	radv: accumulate all ingoing external dependencies to the first subpass In case two or more subpasses declare ingoing external dependencies. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	eaab35e5e3	radv: handle subpass dependencies correctly The different masks should be accumulated. For example if two subpasses declare an outgoing dependency (ie. dst == VK_SUBPASS_EXTERNAL). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	6430616e77	radv: track if subpasses have color attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	1e810f1c53	radv: add radv_render_pass_add_subpass_dep() helper To share common code that handles subpass dependencies. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	2472907563	radv: move some render pass things to radv_render_pass_compile() radv_render_pass_compile() is common to vkCreateRenderPass() and vkCreateRenderPass2(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:19:14 +01:00
Samuel Pitoiset	b509013060	radv: handle final layouts at end of every subpass and render pass That shouldn't change anything as we check if the last subpass id is the final subpass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:18:38 +01:00
Samuel Pitoiset	5699ac0078	radv: determine the last subpass id for every attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:59 +01:00
Samuel Pitoiset	e1a0a268c6	radv: use the new attachments array when starting subpasses Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:57 +01:00
Samuel Pitoiset	a20c2e38d8	radv: store the list of attachments for every subpass This reworks how the depth stencil attachment is used for simplicity. This also introduces radv_render_pass_compile() helper that will be used for further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:54 +01:00
Samuel Pitoiset	a7c7d811f1	radv: move subpass image transitions to radv_cmd_buffer_begin_subpass() Instead of doing them in radv_cmd_buffer_set_subpass(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:52 +01:00
Samuel Pitoiset	291a933786	radv: add radv_cmd_buffer_begin_subpass() helper To unify some code in BeginRenderPass() and NextSubpass(). Based on Intel ANV driver. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:50 +01:00
Samuel Pitoiset	41199e2eeb	radv: remove useless MAYBE_UNUSED in CmdBeginRenderPass() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:46 +01:00
Samuel Pitoiset	545552c9b9	radv: remove unused radv_render_pass_attachment::view_mask Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:42 +01:00
Samuel Pitoiset	0f932bbede	radv: bail out when no image transitions will be performed Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-04 13:17:40 +01:00
Marek Olšák	1e85cfb91a	meson: drop the xcb-xrandr version requirement autotools doesn't have any requirement. This fixes meson on Ubuntu 16.04. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-03 18:39:57 -05:00
Eric Engestrom	808bf59cac	wsi/display: add comment Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Keith Packard <keithp@keithp.com>	2019-02-02 23:08:03 +00:00
Jason Ekstrand	0aa5a97b03	relnotes: Add VK_EXT_buffer_device_address	2019-02-02 08:42:14 -06:00
Jason Ekstrand	48ed2a7bb0	anv: Implement VK_EXT_buffer_device_address Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-01 17:09:42 -06:00
Jason Ekstrand	e644ed468f	intel/fs: Implement nir_intrinsic_global_atomic_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	a91f392073	intel/fs: Use SENDS for A64 writes on gen9+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	1c25bf4373	intel/fs: Implement load/store_global with A64 untyped messages Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	b4f0d062cd	intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode Previously, we only applied the fix to shaders with a dispatch mode of SIMD8 but the code it relies on for SIMD16 mode only applies to SIMD16 instructions. If you have a SIMD8 instruction in a SIMD16 shader, neither would trigger and the restriction could still be hit. Fixes: `232ed89802` "i965/fs: Register allocator shoudn't use grf127..." Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:11:00 -06:00
Jason Ekstrand	79724a0756	intel/fs: Properly handle 64-bit types in LOAD_PAYLOAD By just assigning dst.type to src[i].type, we ensure that the offset at the end of the loop actually offsets it by the right number of registers. Otherwise, we'll get into a case where we copy with a Q type and then offset with a D type and things get out of sync. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:10:57 -06:00
Jason Ekstrand	f02914a991	intel/fs/cse: Split create_copy_instr into three cases Previously, we tried to combine all cases where the instruction being CSE'd writes to more than one MOV worth of registers into one case with a bit of special casing for LOAD_PAYLOAD. This commit splits things so that LOAD_PAYLOAD is entirely its own case. This makes tweaking the LOAD_PAYLOAD case simpler in the next commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:10:40 -06:00
Jason Ekstrand	f409a08e5f	intel/nir: Add global support to lower_mem_access_bit_sizes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 16:08:29 -06:00
Oscar Blumberg	fea5b8e5ad	intel/fs: Fix memory corruption when compiling a CS Missing check for shader stage in the fs_visitor would corrupt the cs_prog_data.push information and trigger crashes / corruption later when uploading the CS state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 10:53:33 -08:00
Jason Ekstrand	ab940b0d97	spirv: Support LocalSizeId and LocalSizeHintId execution modes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-01 17:34:02 +00:00
Jason Ekstrand	7223590c42	spirv: Handle OpExecutionModeId Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-01 17:34:02 +00:00
Jason Ekstrand	e68871f6a4	spirv: Handle constants and types before execution modes We already defer handling the actual execution modes until after we've created the shader. This just moves it a tiny bit further so we actually have constants and types and can handle OpExecutionModeId. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-01 17:34:02 +00:00
Jason Ekstrand	7d862ef530	spirv: Rework handling of spec constant workgroup size built-ins Instead of handling it as part of the handling of constant instructions, just stash the vtn_value when we see the decoration and handle it explicitly later. This will let us re-order handling of constant instructions without breaking the Vulkan SPIR-V requirement that decorating a specialization constant as the WorkgroupSize built-in overrides the workgroup size set as an execution mode. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-01 17:34:02 +00:00
Jason Ekstrand	9b37e93e42	spirv: Replace vtn_constant_value with vtn_constant_uint The uint version is less typing, supports different bit sizes, and is probably a bit more safe because we're actually verifying that the SPIR-V value is an integer scalar constant. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-01 17:34:02 +00:00
Samuel Pitoiset	5e7f800f32	radv: fix build Fixes: `9b9ccee4d6` ("radv: take LDS into account for compute shader occupancy stats") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-01 15:31:55 +01:00
Timothy Arceri	9b9ccee4d6	radv: take LDS into account for compute shader occupancy stats Ported from `d205faeb6c`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-01 22:25:30 +11:00
Timothy Arceri	a53d68d318	ac/radv/radeonsi: add ac_get_num_physical_sgprs() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-01 22:25:30 +11:00
Gurchetan Singh	574186f0e8	docs: add GL_EXT_texture_compression_s3tc_srgb to release notes Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-01 10:01:59 +00:00
Gurchetan Singh	dc9a15aefb	st/mesa: expose EXT_texture_compression_s3tc_srgb Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-01 10:01:59 +00:00
Gurchetan Singh	a2ab400719	i965: Set flag for EXT_texture_compression_s3tc_srgb Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-01 10:01:59 +00:00
Gurchetan Singh	db24132d80	mesa/main: Expose EXT_texture_compression_s3tc_srgb Required for the following test: bin/compressedteximage GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT pass when emulating GL on GLES. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-02-01 10:01:59 +00:00
Timothy Arceri	0f3a8e1b64	st/glsl_to_nir: remove dead local variables Without this we do not end up with a deterministic NIR because temporary register variables are added in random order. NIR must be deterministic because we use it to produce a sha for the radeonsi backends disk cache. This fixes the shader cache for a bunch of shaders. Another positive is that this results in a large reduction in the size of the NIR that the state tracker stores to the disk cache. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-02-01 15:56:02 +11:00
Dylan Baker	4052142de7	meson: remove -std=c++11 from intel/tools for meson all C++ code is already compiled as C++11, so it's unnecessary. It's also the wrong way to do this, if we really needed this the correct way is to set: ```meson executable( ... override_options : ['cpp_std=c++11'], ) ``` Which ensures not only that the correct syntax for the current compiler is used, but also that meson doesn't create arguments like `-std=c++14 ... -std=c++11` Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-31 21:42:16 +00:00
Dylan Baker	8e49b32f63	meson: fix style in intel/tools The `:` in options should always have one space before and after `foo : bar`, and lists do not get spaces around the braces: `[foo]` not `[ foo ]` Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-31 21:42:16 +00:00
Dylan Baker	d93d53fa72	meson: remove build_by_default : true Which is and has always been the default. This is largely an artifact of how the building of these tools was controlled when the meson build was originally created. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-31 21:42:16 +00:00
Emil Velikov	1240c3cb10	docs: update calendar, add news item and link release notes for 18.3.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-31 21:17:38 +00:00
Emil Velikov	83160c6c05	docs: add sha256 checksums for 18.3.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `7475d7727f`)	2019-01-31 21:15:20 +00:00
Emil Velikov	4d0732dc39	docs: add release notes for 18.3.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `190a79f462`) [Emil: drop VERSION hunk] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: VERSION	2019-01-31 21:14:56 +00:00
Neha Bhende	69d736b17a	st/mesa: Fix topogun-1.06-orc-84k-resize.trace crash We need to initialize all fields in rs->prim explicitly while creating new rastpos stage. Fixes: `bac8534267` ("st/mesa: allow glDrawElements to work with GL_SELECT feedback") v2: Initializing all fields in rs->prim as per Ilia. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-31 12:21:59 -07:00
Dylan Baker	c812c740e6	android,autotools,i965: Fix location of float64_glsl.h Android.mk and autotools disagree about where generated files should go, which wasn't a problem until we wanted to build a dist tarball. This corrects the problem by changing the output and include paths to be the same on android and autotools (meson already has the correct include path). Fixes: `7d7b30835c` ("automake: Fix path to generated source") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-31 19:04:30 +00:00
Marek Olšák	d49c16a597	gallium: allow more PIPE_RESOURCE_ driver flags radeonsi has 8 and will probably have 9 soon. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-01-31 13:10:42 -05:00
Eric Anholt	ab4d5775b0	v3d: Fix image_load_store clamping of signed integer stores. This was a copy-and-paste failure that, oddly, showed up in the CTS's reinterprets of r32f, rgba8, and srgba8 to rgba8i, but not r32ui and r32i to rgba8i or reinterprets to other signed int formats. Fixes: `6281f26f06` ("v3d: Add support for shader_image_load_store.")	2019-01-31 08:39:40 -08:00
Eric Anholt	db2ae51121	mesa: Skip partial InvalidateFramebuffer of packed depth/stencil. One of the CTS cases tries to invalidate just stencil of packed depth/stencil, and we incorrectly lost the depth contents. Fixes dEQP-GLES3.functional.fbo.invalidate.whole.unbind_read_stencil Fixes: `0c42b5f3cb` ("mesa: wire up InvalidateFramebuffer") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-31 08:37:46 -08:00
Rob Clark	39cfdf9930	freedreno: more fixing release tarball Fixes: `aa0fed10d3` freedreno: move ir3 to common location Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-31 09:59:18 -05:00
Rob Clark	e252656d14	freedreno: fix release tarball Fixes: `b4476138d5` freedreno: move drm to common location Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-31 09:59:18 -05:00
Emmanuel Gil Peyrot	0d4dd59ae5	docs: make bugs.html easier to find Thanks to Yann Kervran for the report and suggestions. Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-31 14:31:48 +00:00
Dave Airlie	9279a28f07	virgl: ARB_query_buffer_object support v1.1: fix size define. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-01-31 11:23:38 +10:00
Dave Airlie	38658c6d4d	virgl: enable elapsed time queries GL underneath always has GL_TIME_ELAPSED so always enable these. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-01-31 11:23:30 +10:00
Dylan Baker	da48cba61e	automake: Add --enable-autotools to distcheck flags Fixes: `e68777c87c` ("autotools: Deprecate the use of autotools") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-01-30 19:32:44 +00:00
Marek Olšák	ffbd37d8e9	radeonsi: fix a comment typo in si_fine_fence_set	2019-01-30 14:32:05 -05:00
Marek Olšák	f4eb746ef7	r600: add -Wstrict-overflow=0 to meson to silence the warning same as radeonsi	2019-01-30 12:49:45 -05:00
Marek Olšák	d50bef9831	winsys/amdgpu: remove amdgpu_drm.h definitions trivial	2019-01-30 12:38:56 -05:00
Marek Olšák	16672f16da	radeonsi: unify error paths in si_texture_create_object	2019-01-30 12:35:22 -05:00
Marek Olšák	2361558eb7	radeonsi: merge & rename texture BO metadata functions	2019-01-30 12:35:22 -05:00
Marek Olšák	1c12d56e4d	radeonsi: enable dithered alpha-to-coverage for better quality same as AMDVLK. GL_NV_alpha_to_coverage_dither_control allows controlling this behavior. The default is implementation-dependent.	2019-01-30 12:35:22 -05:00
Dylan Baker	b4986d2e0c	gallium: wrap u_screen in extern "C" for c++ Some drivers (notably SWR) are written in C++, and as such they need access to C headers with extern "C". So let's add that.	2019-01-30 15:12:27 +00:00
Gert Wollny	45903cddc3	mesa/core: Enable EXT_texture_sRGB_R8 also for desktop GL As of Nov/30/2018 the extension is also valid for OpenGL >= 1.2, so enable it accordingly and also add the required view class entry. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-30 11:32:40 +00:00
Samuel Pitoiset	9c762c01c8	radv/winsys: fix hash when adding internal buffers This fixes serious stuttering in Shadow Of The Tomb Raider. Fixes: `50fd253bd6` ("radv/winsys: Add priority handling during submit.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-30 12:29:10 +01:00
Erik Faye-Lund	3b6f95ad66	mesa: expose NV_conditional_render on GLES The extension spec has been updated to include GLES 2 support, so let's enable it there. v2: fixup ABI-check as well Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-30 09:43:44 +01:00
Ernestas Kulik	90458bef54	v3d: Fix leak in resource setup error path Reported by Coverity: in the case of unsupported modifier request, the code does not jump to the “fail” label to destroy the acquired resource. CID: 1435704 Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com> Fixes: `45bb8f2957` ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")	2019-01-29 16:14:13 -08:00
Ernestas Kulik	f6e49d5ad0	vc4: Fix leak in HW queries error path Reported by Coverity: in the case where there exist hardware and non-hardware queries, the code does not jump to err_free_query and leaks the query. CID: 1430194 Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com> Fixes: `9ea90ffb98` ("broadcom/vc4: Add support for HW perfmon")	2019-01-29 16:14:13 -08:00
Eric Anholt	6053c7bb43	v3d: Fix a release build set-but-unused compiler warning.	2019-01-29 16:02:51 -08:00
Eric Anholt	0c05198d6b	v3d: Always enable the NEON utile load/store code. I can't imagine the new HW block being paired with a v6 CPU, so don't bother with the CPU detection that vc4 had to do. Improves 1024x1024 TexImage on my 7278 by 47.3229% +/- 0.679632%	2019-01-29 16:00:25 -08:00
Emil Velikov	385843ac3c	vc4: Declare the last cpu pointer as being modified in NEON asm. Earlier commit addressed 7 of the 8 instances available. v2: Rebase patch back to master (by anholt) Cc: Carsten Haitzler (Rasterman) <raster@rasterman.com> Cc: Eric Anholt <eric@anholt.net> Fixes: `300d3ae8b1` ("vc4: Declare the cpu pointers as being modified in NEON asm.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-29 16:00:25 -08:00
Dylan Baker	75ad254acf	docs: Add relnotes stub for 19.1	2019-01-29 15:32:16 -08:00
Dylan Baker	dba0989ac1	bump version for 19.0 branch	2019-01-29 15:30:25 -08:00

1999 changed files with 259636 additions and 83562 deletions

52

.gitignore vendored


@@ -1,54 +1,2 @@
 *.a
 *.dll
 *.exe
 *.ilk
 *.la
 *.lo
 *.log
 *.o
 *.obj
 *.orig
 *.os
 *.pc
 *.pdb
 *.pyc
 *.pyo
 *.rej
 *.so
 *.so.*
 *.sw[a-z]
 *.tar
 *.tar.bz2
 *.tar.gz
 *.tar.xz
 *.trs
 *.zip
 *~
 depend
 depend.bak
 bin/ltmain.sh
 lib
 lib64
 configure
 configure.lineno
 autom4te.cache
 aclocal.m4
 config.log
 config.status
 cscope*
 tags
 .scon*
 config.py
 build
 libtool
 manifest.txt
 .dir-locals.el
 .deps/
 .dirstamp
 .libs/
 Makefile
 Makefile.in
 .install-mesa-links
 .install-gallium-links
 /src/git_sha1.h
 TAGS

									
										264

.gitlab-ci.yml
									
										Normal file
									
												
				@@ -0,0 +1,264 @@

				# This is the tag of the docker image used for the build jobs. If the

				# image doesn't exist yet, the containers-build stage generates it.

				#

				# In order to generate a new image, one should generally change the tag.

				# While removing the image from the registry would also work, that's not

				# recommended except for ephemeral images during development: Replacing

				# an image after a significant amount of time might pull in newer

				# versions of gcc/clang or other packages, which might break the build

				# with older commits using the same tag.

				#

				# After merging a change resulting in generating a new image to the

				# main repository, it's recommended to remove the image from the source

				# repository's container registry, so that the image from the main

				# repository's registry will be used there as well.

				#

				# The format of the tag is "%Y-%m-%d-${counter}" where ${counter} stays

				# at "01" unless you have multiple updates on the same day :)

				variables:

				  UPSTREAM_REPO: mesa/mesa

				  DEBIAN_TAG: "2019-05-01"

				  DEBIAN_VERSION: stretch-slim

				  DEBIAN_IMAGE: "$CI_REGISTRY_IMAGE/debian/$DEBIAN_VERSION:$DEBIAN_TAG"

				include:

				  - project: 'wayland/ci-templates'

				    ref: c73dae8b84697ef18e2dbbf4fed7386d9652b0cd

				    file: '/templates/debian.yml'

				stages:

				  - containers-build

				  - build+test

				# When to automatically run the CI

				.ci-run-policy: &ci-run-policy

				  only:

				    - branches@mesa/mesa

				    - merge_requests

				    - /^ci([-/].*)?$/

				  retry:

				    max: 2

				    when:

				      - runner_system_failure

				# CONTAINERS

				debian:

				  extends: .debian@container-ifnot-exists

				  stage: containers-build

				  <<: *ci-run-policy

				  variables:

				    GIT_STRATEGY: none # no need to pull the whole tree for rebuilding the image

				    DEBIAN_EXEC: 'bash .gitlab-ci/debian-install.sh'

				# BUILD

				.build:

				  <<: *ci-run-policy

				  image: $DEBIAN_IMAGE

				  stage: build+test

				  cache:

				    paths:

				      - ccache

				  artifacts:

				    when: on_failure

				    untracked: true

				  variables:

				    CCACHE_COMPILERCHECK: "content"

				  # Use ccache transparently, and print stats before/after

				  before_script:

				    - export PATH="/usr/lib/ccache:$PATH"

				    - export CCACHE_BASEDIR="$PWD"

				    - export CCACHE_DIR="$PWD/ccache"

				    - ccache --zero-stats || true

				    - ccache --show-stats || true

				  after_script:

				    - export CCACHE_DIR="$PWD/ccache"

				    - ccache --show-stats

				.meson-build:

				  extends: .build

				  script:

				    # We need to control the version of llvm-config we're using, so we'll

				    # generate a native file to do so. This requires meson >=0.49

				    - if test -n "$LLVM_VERSION"; then

				        LLVM_CONFIG="llvm-config-${LLVM_VERSION}";

				        echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file;

				        $LLVM_CONFIG --version;

				      else

				        touch native.file;

				      fi

				    - meson --version

				    - meson _build

				            --native-file=native.file

				            -D buildtype=debug

				            -D build-tests=true

				            -D libunwind=${UNWIND}

				            ${DRI_LOADERS}

				            -D dri-drivers=${DRI_DRIVERS:-[]}

				            ${GALLIUM_ST}

				            -D gallium-drivers=${GALLIUM_DRIVERS:-[]}

				            -D vulkan-drivers=${VULKAN_DRIVERS:-[]}

				            -D I-love-half-baked-turnips=true

				    - cd _build

				    - meson configure

				    - ninja -j4

				    - LC_ALL=C.UTF-8 ninja test

				.scons-build:

				  extends: .build

				  variables:

				    SCONSFLAGS: "-j4"

				  script:

				    - if test -n "$LLVM_VERSION"; then

				        export LLVM_CONFIG="llvm-config-${LLVM_VERSION}";

				      fi

				    - scons $SCONS_TARGET

				    - eval $SCONS_CHECK_COMMAND

				# NOTE: Building SWR is 2x (yes two) times slower than all the other

				# gallium drivers combined.

				# Start this early so that it doesn't limit the total run time.

				#

				# We also put softpipe (and therefore gallium nine, which requires

				# it) here, since softpipe/llvmpipe can't be built alongside classic

				# swrast.

				#

				# Putting glvnd here is arbitrary, but we want it in one of the builds

				# for coverage.

				meson-swr-glvnd:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glvnd=true

				      -D egl=true

				    GALLIUM_ST: >

				      -D dri3=true

				      -D gallium-vdpau=false

				      -D gallium-xvmc=false

				      -D gallium-omx=disabled

				      -D gallium-va=false

				      -D gallium-xa=false

				      -D gallium-nine=true

				      -D gallium-opencl=disabled

				      -D osmesa=gallium

				    GALLIUM_DRIVERS: "swr,swrast,iris"

				    LLVM_VERSION: "6.0"

				meson-clang:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_DRIVERS: "auto"

				    GALLIUM_DRIVERS: "auto"

				    VULKAN_DRIVERS: intel,amd,freedreno

				    CC: "ccache clang-8"

				    CXX: "ccache clang++-8"

				  before_script:

				    - export CCACHE_BASEDIR="$PWD" CCACHE_DIR="$PWD/ccache"

				    - ccache --zero-stats --show-stats || true

				     # clang++ breaks if it picks up the GCC 8 directory without libstdc++.so

				    - apt-get remove -y libgcc-8-dev

				meson-vulkan:

				  extends: .meson-build

				  variables:

				    UNWIND: "false"

				    DRI_LOADERS: >

				      -D glx=disabled

				      -D gbm=false

				      -D egl=false

				      -D platforms=x11,wayland,drm

				      -D osmesa=none

				    GALLIUM_ST: >

				      -D dri3=true

				      -D gallium-vdpau=false

				      -D gallium-xvmc=false

				      -D gallium-omx=disabled

				      -D gallium-va=false

				      -D gallium-xa=false

				      -D gallium-nine=false

				      -D gallium-opencl=disabled

				    VULKAN_DRIVERS: intel,amd,freedreno

				    LLVM_VERSION: "7"

				meson-main:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glx=dri

				      -D gbm=true

				      -D egl=true

				      -D platforms=x11,wayland,drm,surfaceless

				      -D osmesa=classic

				    DRI_DRIVERS: "i915,i965,r100,r200,swrast,nouveau"

				    GALLIUM_ST: >

				      -D dri3=true

				      -D gallium-extra-hud=true

				      -D gallium-vdpau=true

				      -D gallium-xvmc=true

				      -D gallium-omx=bellagio

				      -D gallium-va=true

				      -D gallium-xa=true

				      -D gallium-nine=false

				      -D gallium-opencl=disabled

				    GALLIUM_DRIVERS: "iris,nouveau,kmsro,r300,r600,freedreno,svga,v3d,vc4,virgl,etnaviv,panfrost,lima"

				    LLVM_VERSION: "7"

				meson-clover-llvm:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glx=disabled

				      -D egl=false

				      -D gbm=false

				    GALLIUM_ST: >

				      -D dri3=false

				      -D gallium-vdpau=false

				      -D gallium-xvmc=false

				      -D gallium-omx=disabled

				      -D gallium-va=false

				      -D gallium-xa=false

				      -D gallium-nine=false

				      -D gallium-opencl=icd

				    GALLIUM_DRIVERS: "r600,radeonsi"

				meson-clover-llvm39:

				  extends: meson-clover-llvm

				  variables:

				    GALLIUM_DRIVERS: "i915,r600"

				    LLVM_VERSION: "3.9"

				scons-nollvm:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: "llvm=0"

				    SCONS_CHECK_COMMAND: "scons llvm=0 check"

				scons-llvm:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: "llvm=1"

				    SCONS_CHECK_COMMAND: "scons llvm=1 check"

				    LLVM_VERSION: "3.4"

				    # LLVM 3.4 packages were built with an old libstdc++ ABI

				    CXX: "g++ -D_GLIBCXX_USE_CXX11_ABI=0"

				scons-swr:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: "swr=1"

				    SCONS_CHECK_COMMAND: "true"

				    LLVM_VERSION: "6.0"

				scons-win64:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: platform=windows machine=x86_64

				    SCONS_CHECK_COMMAND: "true"
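
The native-file step in the `.meson-build` job above can be tried outside CI. The following is a minimal sketch, assuming a POSIX shell. The function name `write_native_file` and the fallback to the bare binary name when `llvm-config-N` is not installed are illustrative assumptions and are not part of this change.

```shell
#!/bin/sh
# Sketch of the native-file step from the .meson-build job.
# write_native_file is a hypothetical helper, not part of the change.
write_native_file() {
    llvm_version="$1"   # e.g. "7"; empty means "let meson find llvm-config"
    out="$2"            # path of the native file to write

    if [ -n "$llvm_version" ]; then
        llvm_config="llvm-config-${llvm_version}"
        # Assumption: fall back to the bare name if the binary is not on PATH,
        # so the sketch still runs on machines without that LLVM version.
        llvm_path="$(command -v "$llvm_config" || printf '%s' "$llvm_config")"
        printf "[binaries]\nllvm-config = '%s'\n" "$llvm_path" > "$out"
    else
        # An empty native file keeps the meson command line uniform.
        : > "$out"
    fi
}

write_native_file "${LLVM_VERSION:-}" native.file
cat native.file
```

The resulting file is what the job passes as `--native-file=native.file`; as the comment in the job notes, this requires meson >= 0.49.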

									
										181

.gitlab-ci/debian-install.sh
									
										Normal file
									
												
				@@ -0,0 +1,181 @@

				#!/bin/bash

				set -e

				set -o xtrace

				export DEBIAN_FRONTEND=noninteractive

				apt-get install -y \

				      apt-transport-https \

				      ca-certificates \

				      curl \

				      wget \

				      gnupg \

				      software-properties-common

				curl -fsSL https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -

				add-apt-repository "deb https://apt.llvm.org/stretch/ llvm-toolchain-stretch-7 main"

				add-apt-repository "deb https://apt.llvm.org/stretch/ llvm-toolchain-stretch-8 main"

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian stretch-backports main' >/etc/apt/sources.list.d/backports.list

				echo 'deb https://deb.debian.org/debian jessie main' >/etc/apt/sources.list.d/jessie.list

				apt-get update

				apt-get install -y -t stretch-backports \

				      llvm-3.4-dev \

				      llvm-3.9-dev \

				      libclang-3.9-dev \

				      llvm-5.0-dev \

				      llvm-6.0-dev \

				      llvm-7-dev \

				      g++ \

				      clang-8 \

				      libclang-7-dev

				# Install remaining packages from Debian buster to get newer versions

				add-apt-repository "deb https://deb.debian.org/debian/ buster main"

				add-apt-repository "deb https://deb.debian.org/debian/ buster-updates main"

				apt-get update

				apt-get install -y \

				      bzip2 \

				      zlib1g-dev \

				      pkg-config \

				      libxrender-dev \

				      libxdamage-dev \

				      libxxf86vm-dev \

				      gcc \

				      libclc-dev \

				      libxvmc-dev \

				      libomxil-bellagio-dev \

				      xz-utils \

				      libexpat1-dev \

				      libx11-xcb-dev \

				      libelf-dev \

				      libunwind-dev \

				      libglvnd-dev \

				      python-mako \

				      python3-mako \

				      meson \

				      scons

				# autotools build deps

				apt-get install -y \

				      automake \

				      libtool \

				      bison \

				      flex \

				      gettext \

				      make

				# for 64bit windows cross-builds

				apt-get install -y \

				      wine64 \

				      mingw-w64

				# dependencies where we want a specific version

				export              XORG_RELEASES=https://xorg.freedesktop.org/releases/individual

				export               XCB_RELEASES=https://xcb.freedesktop.org/dist

				export           WAYLAND_RELEASES=https://wayland.freedesktop.org/releases

				export         XORGMACROS_VERSION=util-macros-1.19.0

				export            GLPROTO_VERSION=glproto-1.4.17

				export          DRI2PROTO_VERSION=dri2proto-2.8

				export       LIBPCIACCESS_VERSION=libpciaccess-0.13.4

				export             LIBDRM_VERSION=libdrm-2.4.97

				export           XCBPROTO_VERSION=xcb-proto-1.13

				export         RANDRPROTO_VERSION=randrproto-1.3.0

				export          LIBXRANDR_VERSION=libXrandr-1.3.0

				export             LIBXCB_VERSION=libxcb-1.13

				export       LIBXSHMFENCE_VERSION=libxshmfence-1.3

				export           LIBVDPAU_VERSION=libvdpau-1.1

				export              LIBVA_VERSION=libva-1.7.0

				export         LIBWAYLAND_VERSION=wayland-1.15.0

				export  WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8

				wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2

				tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2

				cd $XORGMACROS_VERSION; ./configure; make install; cd ..

				rm -rf $XORGMACROS_VERSION

				wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2

				tar -xvf $GLPROTO_VERSION.tar.bz2 && rm $GLPROTO_VERSION.tar.bz2

				cd $GLPROTO_VERSION; ./configure; make install; cd ..

				rm -rf $GLPROTO_VERSION

				wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2

				tar -xvf $DRI2PROTO_VERSION.tar.bz2 && rm $DRI2PROTO_VERSION.tar.bz2

				cd $DRI2PROTO_VERSION; ./configure; make install; cd ..

				rm -rf $DRI2PROTO_VERSION

				wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2

				tar -xvf $XCBPROTO_VERSION.tar.bz2 && rm $XCBPROTO_VERSION.tar.bz2

				cd $XCBPROTO_VERSION; ./configure; make install; cd ..

				rm -rf $XCBPROTO_VERSION

				wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2

				tar -xvf $LIBXCB_VERSION.tar.bz2 && rm $LIBXCB_VERSION.tar.bz2

				cd $LIBXCB_VERSION; ./configure; make install; cd ..

				rm -rf $LIBXCB_VERSION

				wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2

				tar -xvf $LIBPCIACCESS_VERSION.tar.bz2 && rm $LIBPCIACCESS_VERSION.tar.bz2

				cd $LIBPCIACCESS_VERSION; ./configure; make install; cd ..

				rm -rf $LIBPCIACCESS_VERSION

				wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2

				cd $LIBDRM_VERSION; ./configure --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api; make install; cd ..

				rm -rf $LIBDRM_VERSION

				wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2

				tar -xvf $RANDRPROTO_VERSION.tar.bz2 && rm $RANDRPROTO_VERSION.tar.bz2

				cd $RANDRPROTO_VERSION; ./configure; make install; cd ..

				rm -rf $RANDRPROTO_VERSION

				wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2

				tar -xvf $LIBXRANDR_VERSION.tar.bz2 && rm $LIBXRANDR_VERSION.tar.bz2

				cd $LIBXRANDR_VERSION; ./configure; make install; cd ..

				rm -rf $LIBXRANDR_VERSION

				wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				tar -xvf $LIBXSHMFENCE_VERSION.tar.bz2 && rm $LIBXSHMFENCE_VERSION.tar.bz2

				cd $LIBXSHMFENCE_VERSION; ./configure; make install; cd ..

				rm -rf $LIBXSHMFENCE_VERSION

				wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2

				tar -xvf $LIBVDPAU_VERSION.tar.bz2 && rm $LIBVDPAU_VERSION.tar.bz2

				cd $LIBVDPAU_VERSION; ./configure; make install; cd ..

				rm -rf $LIBVDPAU_VERSION

				wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2

				tar -xvf $LIBVA_VERSION.tar.bz2 && rm $LIBVA_VERSION.tar.bz2

				cd $LIBVA_VERSION; ./configure --disable-wayland --disable-dummy-driver; make install; cd ..

				rm -rf $LIBVA_VERSION

				wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz

				tar -xvf $LIBWAYLAND_VERSION.tar.xz && rm $LIBWAYLAND_VERSION.tar.xz

				cd $LIBWAYLAND_VERSION; ./configure --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation; make install; cd ..

				rm -rf $LIBWAYLAND_VERSION

				wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz

				tar -xvf $WAYLAND_PROTOCOLS_VERSION.tar.xz && rm $WAYLAND_PROTOCOLS_VERSION.tar.xz

				cd $WAYLAND_PROTOCOLS_VERSION; ./configure; make install; cd ..

				rm -rf $WAYLAND_PROTOCOLS_VERSION

				# Use ccache to speed up builds

				apt-get install -y ccache

				# We need xmllint to validate the XML files in Mesa

				apt-get install -y libxml2-utils

				# Remove unused packages

				apt-get purge -y \

				      automake \

				      libtool \

				      make \

				      curl \

				      wget \

				      gnupg \

				      software-properties-common

				apt-get autoremove -y --purge
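
The fetch, unpack, configure and install sequence in the script above repeats for each pinned dependency. One possible way to factor it is sketched below. `build_from_tarball`, `run` and `DRY_RUN` are hypothetical names introduced only for illustration; the script in this change spells each step out instead.

```shell
#!/bin/sh
set -e

# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: fetch, unpack, configure and install one tarball.
# Usage: build_from_tarball BASE_URL NAME-VERSION EXT [configure args...]
build_from_tarball() {
    base_url="$1"
    name="$2"
    ext="$3"
    shift 3
    tarball="${name}.tar.${ext}"

    run wget "${base_url}/${tarball}"
    run tar -xf "${tarball}"
    run rm "${tarball}"
    # Subshell so the caller's working directory is never changed.
    ( run cd "${name}" && run ./configure "$@" && run make install )
    run rm -rf "${name}"
}

# Dry run only: prints the commands for the libdrm step without executing them.
DRY_RUN=1
build_from_tarball "https://dri.freedesktop.org/libdrm" "libdrm-2.4.97" bz2 \
    --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api
```

Leaving `DRY_RUN` unset would execute the same commands the script runs for that dependency.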

3

.mailmap


@@ -265,6 +265,9 @@ Kristian Høgsberg <krh@bitplanet.net> <krh@hinata.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@sasori.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@temari.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <kristian.h.kristensen@intel.com>
 Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@chromium.org>
 Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@google.com>
 Kristian Høgsberg <krh@bitplanet.net> <hoegsberg@gmail.com>
 Krzesimir Nowak <qdlacz@gmail.com> <krzesimir@kinvolk.io>

									
										862

.travis.yml
									
												
				@@ -1,852 +1,40 @@

				language: c

				dist: xenial

				os: osx

				cache:

				  apt: true

				  ccache: true

				env:

				  global:

				    - XORG_RELEASES=https://xorg.freedesktop.org/releases/individual

				    - XCB_RELEASES=https://xcb.freedesktop.org/dist

				    - WAYLAND_RELEASES=https://wayland.freedesktop.org/releases

				    - XORGMACROS_VERSION=util-macros-1.19.0

				    - GLPROTO_VERSION=glproto-1.4.17

				    - DRI2PROTO_VERSION=dri2proto-2.8

				    - LIBPCIACCESS_VERSION=libpciaccess-0.13.4

				    - LIBDRM_VERSION=libdrm-2.4.97

				    - XCBPROTO_VERSION=xcb-proto-1.13

				    - RANDRPROTO_VERSION=randrproto-1.3.0

				    - LIBXRANDR_VERSION=libXrandr-1.3.0

				    - LIBXCB_VERSION=libxcb-1.13

				    - LIBXSHMFENCE_VERSION=libxshmfence-1.2

				    - LIBVDPAU_VERSION=libvdpau-1.1

				    - LIBVA_VERSION=libva-1.7.0

				    - LIBWAYLAND_VERSION=wayland-1.15.0

				    - WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8

				    - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig

				    - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"

				    - PATH="$HOME/prefix/bin:$PATH"

				matrix:

				  include:

				    - env:

				        - LABEL="meson Vulkan"

				        - BUILD=meson

				        - UNWIND="false"

				        - DRI_LOADERS="-Dglx=disabled -Dgbm=false -Degl=false -Dplatforms=x11,wayland,drm -Dosmesa=none"

				        - GALLIUM_ST="-Ddri3=true -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"

				        - VULKAN_DRIVERS="intel,amd"

				        - LLVM_VERSION=7

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          sources:

				            - sourceline: 'deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main'

				              key_url: https://apt.llvm.org/llvm-snapshot.gpg.key

				          packages:

				            - llvm-7-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="meson loaders/classic DRI"

				        - BUILD=meson

				        - UNWIND="false"

				        - DRI_LOADERS="-Dglx=dri -Dgbm=true -Degl=true -Dplatforms=x11,wayland,drm,surfaceless -Dosmesa=classic"

				        - DRI_DRIVERS="i915,i965,r100,r200,swrast,nouveau"

				        - GALLIUM_ST="-Ddri3=true -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"

				      addons:

				        apt:

				          packages:

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libxxf86vm-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libxdamage-dev

				            - libxfixes-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make loaders/classic DRI"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="make check"

				        - DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"

				        - DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS=""

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--disable-libunwind"

				      addons:

				        apt:

				          packages:

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libxxf86vm-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libxdamage-dev

				            - libxfixes-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        # NOTE: Building SWR is 2x (yes two) times slower than all the other

				        # gallium drivers combined.

				        # Start this early so that it doesn't hunder the run time.

				        - LABEL="meson Gallium Drivers SWR"

				        - BUILD=meson

				        - UNWIND="true"

				        - DRI_LOADERS="-Dglx=disabled -Degl=false -Dgbm=false"

				        - GALLIUM_ST="-Ddri3=false -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"

				        - GALLIUM_DRIVERS="swr"

				        - LLVM_VERSION=6.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          packages:

				            - llvm-6.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="meson Gallium Drivers RadeonSI"

				        - BUILD=meson

				        - UNWIND="true"

				        - DRI_LOADERS="-Dglx=disabled -Degl=false -Dgbm=false"

				        - GALLIUM_ST="-Ddri3=false -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"

				        - GALLIUM_DRIVERS="radeonsi"

				        - LLVM_VERSION=7

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          sources:

				            - sourceline: 'deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main'

				              key_url: https://apt.llvm.org/llvm-snapshot.gpg.key

				          packages:

				            # From sources above

				            - llvm-7-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="meson Gallium Drivers Other"

				        - BUILD=meson

				        - UNWIND="true"

				        - DRI_LOADERS="-Dglx=disabled -Degl=false -Dgbm=false"

				        - GALLIUM_ST="-Ddri3=false -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"

				        - GALLIUM_DRIVERS="i915,nouveau,kmsro,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv"

				        - LLVM_VERSION=5.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          packages:

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-5.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="meson Gallium ST Clover LLVM-5.0"

				        - BUILD=meson

				        - UNWIND="true"

				        - DRI_LOADERS="-Dglx=disabled -Degl=false -Dgbm=false"

				        - GALLIUM_ST="-Ddri3=false -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=icd"

				        - GALLIUM_DRIVERS="r600"

				        - LLVM_VERSION=5.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          packages:

				            - libclc-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-5.0-dev

				            - clang-5.0

				            - libclang-5.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="meson Gallium ST Clover LLVM-6.0"

				        - BUILD=meson

				        - UNWIND="true"

				        - DRI_LOADERS="-Dglx=disabled -Degl=false -Dgbm=false"

				        - GALLIUM_ST="-Ddri3=false -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=icd"

				        - GALLIUM_DRIVERS="r600"

				        - LLVM_VERSION=6.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          packages:

				            - libclc-dev

				            - llvm-6.0-dev

				            - clang-6.0

				            - libclang-6.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="meson Gallium ST Clover LLVM-7"

				        - BUILD=meson

				        - UNWIND="true"

				        - DRI_LOADERS="-Dglx=disabled -Degl=false -Dgbm=false"

				        - GALLIUM_ST="-Ddri3=false -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=icd"

				        - GALLIUM_DRIVERS="r600,radeonsi"

				        - LLVM_VERSION=7

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          sources:

				            - sourceline: 'deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main'

				              key_url: https://apt.llvm.org/llvm-snapshot.gpg.key

				          packages:

				            - libclc-dev

				            # From sources above

				            - llvm-7-dev

				            - clang-7

				            - libclang-7-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="meson Gallium ST Other"

				        - BUILD=meson

				        - UNWIND="true"

				        - DRI_LOADERS="-Dglx=disabled -Degl=false -Dgbm=false"

				        - GALLIUM_ST="-Ddri3=true -Dgallium-vdpau=true -Dgallium-xvmc=true -Dgallium-omx=bellagio -Dgallium-va=true -Dgallium-xa=true -Dgallium-nine=true -Dgallium-opencl=disabled -Dosmesa=gallium"

				        # We need swrast for osmesa and nine.

				        # Nouveau supports, or builds at least against all ST.

				        - GALLIUM_DRIVERS="nouveau,swrast"

				        - LLVM_VERSION=5.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          packages:

				            - llvm-5.0-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # Nine requires gcc 4.6... which is the one we have right ?

				            - libxvmc-dev

				            # Build locally, for now.

				            #- libvdpau-dev

				            #- libva-dev

				            - libomxil-bellagio-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3.5

				            - python3-pip

				            - python3-setuptools

				    - env:

				        # NOTE: Building SWR is 2x (yes two) times slower than all the other

				        # gallium drivers combined.

				        # Start this early so that it doesn't hunder the run time.

				        - LABEL="make Gallium Drivers SWR"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=6.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="swr"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          packages:

				            - llvm-6.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Gallium Drivers RadeonSI"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=7

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="radeonsi"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          sources:

				            - sourceline: 'deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main'

				              key_url: https://apt.llvm.org/llvm-snapshot.gpg.key

				          packages:

				            # From sources above

				            - llvm-7-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Gallium Drivers Other"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=3.9

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="i915,nouveau,kmsro,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          packages:

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-3.9-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Gallium ST Clover LLVM-3.9"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=3.9

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          packages:

				            - libclc-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-3.9-dev

				            - clang-3.9

				            - libclang-3.9-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Gallium ST Clover LLVM-4.0"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=4.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          packages:

				            - libclc-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-4.0-dev

				            - clang-4.0

				            - libclang-4.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Gallium ST Clover LLVM-5.0"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=5.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          packages:

				            - libclc-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-5.0-dev

				            - clang-5.0

				            - libclang-5.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Gallium ST Clover LLVM-6.0"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=6.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          packages:

				            - libclc-dev

				            - llvm-6.0-dev

				            - clang-6.0

				            - libclang-6.0-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Gallium ST Clover LLVM-7"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=7

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600,radeonsi"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          sources:

				            - sourceline: 'deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main'

				              key_url: https://apt.llvm.org/llvm-snapshot.gpg.key

				          packages:

				            - libclc-dev

				            # From sources above

				            - llvm-7-dev

				            - clang-7

				            - libclang-7-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				    - env:

				        - LABEL="make Gallium ST Other"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=3.5

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --enable-xa --enable-nine --enable-xvmc --enable-vdpau --enable-va --enable-omx-bellagio --enable-gallium-osmesa"

				        # We need swrast for osmesa and nine.

				        # i915 most likely doesn't work with most ST.

				        # Regardless - we're doing a quick build test here.

				        - GALLIUM_DRIVERS="i915,swrast"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          packages:

				            # We actually want to test against llvm-3.3, yet 3.5 is available

				            - llvm-3.5-dev

				            # Nine requires gcc 4.6... which is the one we have right ?

				            - libxvmc-dev

				            # Build locally, for now.

				            #- libvdpau-dev

				            #- libva-dev

				            - libomxil-bellagio-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="make Vulkan"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"

				        - LLVM_VERSION=7

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl --with-platforms=x11,wayland"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS=""

				        - VULKAN_DRIVERS="intel,radeon"

				        - LIBUNWIND_FLAGS="--disable-libunwind"

				      addons:

				        apt:

				          sources:

				            - sourceline: 'deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main'

				              key_url: https://apt.llvm.org/llvm-snapshot.gpg.key

				          packages:

				            # From sources above

				            - llvm-7-dev

				            # Common

				            - xz-utils

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - python3-pip

				            - python3-setuptools

				    - env:

				        - LABEL="scons"

				        - BUILD=scons

				        - SCONSFLAGS="-j4"

				        # Explicitly disable.

				        - SCONS_TARGET="llvm=0"

				        # Keep it symmetrical to the make build.

				        - SCONS_CHECK_COMMAND="scons llvm=0 check"

				      addons:

				        apt:

				          packages:

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="scons LLVM"

				        - BUILD=scons

				        - SCONSFLAGS="-j4"

				        - SCONS_TARGET="llvm=1"

				        # Keep it symmetrical to the make build.

				        - SCONS_CHECK_COMMAND="scons llvm=1 check"

				        - LLVM_VERSION=3.5

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          packages:

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # We actually want to test against llvm-3.3, yet 3.5 is available

				            - llvm-3.5-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="scons SWR"

				        - BUILD=scons

				        - SCONSFLAGS="-j4"

				        - SCONS_TARGET="swr=1"

				        - LLVM_VERSION=6.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        # Keep it symmetrical to the make build. There's no actual SWR, yet.

				        - SCONS_CHECK_COMMAND="true"

				      addons:

				        apt:

				          packages:

				            - llvm-6.0-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="macOS make"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="make check"

				        - DRI_LOADERS="--with-platforms=x11 --disable-egl"

				      os: osx

				    - env:

				        - LABEL="macOS meson"

				        - BUILD=meson

				        - UNWIND="false"

				        - DRI_LOADERS="-Dglx=dri -Dgbm=false -Degl=false -Dplatforms=x11 -Dosmesa=none"

				        - GALLIUM_ST="-Ddri3=true -Dgallium-vdpau=false -Dgallium-xvmc=false -Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled"

				      os: osx

				    - PKG_CONFIG_PATH=""

				before_install:

				  - |

				    if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then

				      HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext

				      # Set PATH for homebrew pip3 installs

				      PATH="$HOME/Library/Python/3.6/bin:${PATH}"

				      # Set PKG_CONFIG_PATH for keg-only expat

				      PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"

				      # Set PATH for keg-only gettext

				      PATH="/usr/local/opt/gettext/bin:${PATH}"

				  - HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext

				  # Set PATH for homebrew pip3 installs

				  - PATH="$HOME/Library/Python/3.6/bin:${PATH}"

				  # Set PKG_CONFIG_PATH for keg-only expat

				  - PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"

				  # Set PATH for keg-only gettext

				  - PATH="/usr/local/opt/gettext/bin:${PATH}"

				      # Install xquartz for prereqs ...

				      XQUARTZ_VERSION="2.7.11"

				      wget -nv https://dl.bintray.com/xquartz/downloads/XQuartz-${XQUARTZ_VERSION}.dmg

				      hdiutil attach XQuartz-${XQUARTZ_VERSION}.dmg

				      sudo installer -pkg /Volumes/XQuartz-${XQUARTZ_VERSION}/XQuartz.pkg -target /

				      hdiutil detach /Volumes/XQuartz-${XQUARTZ_VERSION}

				      # ... and set paths

				      PATH="/opt/X11/bin:${PATH}"

				      PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"

				      ACLOCAL="aclocal -I /opt/X11/share/aclocal -I /usr/local/share/aclocal"

				    fi

				  # Install xquartz for prereqs ...

				  - XQUARTZ_VERSION="2.7.11"

				  - wget -nv https://dl.bintray.com/xquartz/downloads/XQuartz-${XQUARTZ_VERSION}.dmg

				  - hdiutil attach XQuartz-${XQUARTZ_VERSION}.dmg

				  - sudo installer -pkg /Volumes/XQuartz-${XQUARTZ_VERSION}/XQuartz.pkg -target /

				  - hdiutil detach /Volumes/XQuartz-${XQUARTZ_VERSION}

				  # ... and set paths

				  - PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"

				install:

				  # Install a more modern meson from pip, since the version in the

				  # ubuntu repos is often quite old.

				  - if test "x$BUILD" = xmeson; then

				      pip3 install --user meson;

				      pip3 install --user mako;

				    fi

				  # Install autotools build dependencies

				  - if test "x$BUILD" = xmake; then

				      pip2 install --user mako;

				    fi

				  # Install a more modern scons from pip.

				  - if test "x$BUILD" = xscons; then

				      pip2 install --user "scons>=2.4";

				      pip2 install --user mako;

				    fi

				  # Install dependencies where we require specific versions (or where

				  # disallowed by Travis CI's package whitelisting).

				  - |

				    if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then

				      wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2

				      tar -jxvf $XORGMACROS_VERSION.tar.bz2

				      (cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2

				      tar -jxvf $GLPROTO_VERSION.tar.bz2

				      (cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2

				      tar -jxvf $DRI2PROTO_VERSION.tar.bz2

				      (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2

				      tar -jxvf $XCBPROTO_VERSION.tar.bz2

				      (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2

				      tar -jxvf $LIBXCB_VERSION.tar.bz2

				      (cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2

				      tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2

				      (cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				      tar -jxvf $LIBDRM_VERSION.tar.bz2

				      (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)

				      wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2

				      tar -jxvf $RANDRPROTO_VERSION.tar.bz2

				      (cd $RANDRPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2

				      tar -jxvf $LIBXRANDR_VERSION.tar.bz2

				      (cd $LIBXRANDR_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				      tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2

				      (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2

				      tar -jxvf $LIBVDPAU_VERSION.tar.bz2

				      (cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2

				      tar -jxvf $LIBVA_VERSION.tar.bz2

				      (cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)

				      wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz

				      tar -axvf $LIBWAYLAND_VERSION.tar.xz

				      (cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)

				      wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz

				      tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz

				      (cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      # Meson requires ninja >= 1.6, but xenial has 1.3.x

				      wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip

				      unzip ninja-linux.zip

				      mv ninja $HOME/prefix/bin/

				      # Generate this header since one is missing on the Travis instance

				      mkdir -p linux

				      printf "%s\n" \

				           "#ifndef _LINUX_MEMFD_H" \

				           "#define _LINUX_MEMFD_H" \

				           "" \

				           "#define MFD_CLOEXEC             0x0001U" \

				           "#define MFD_ALLOW_SEALING       0x0002U" \

				           "" \

				           "#endif /* _LINUX_MEMFD_H */" > linux/memfd.h

				      # Generate this header, including the missing SYS_memfd_create

				      # macro, which is not provided by the header in the Travis

				      # instance

				      mkdir -p sys

				      printf "%s\n" \

				           "#ifndef _SYSCALL_H" \

				           "#define _SYSCALL_H      1" \

				           "" \

				           "#include <asm/unistd.h>" \

				           "" \

				           "#ifndef _LIBC" \

				           "# include <bits/syscall.h>" \

				           "#endif" \

				           "" \

				           "#ifndef __NR_memfd_create" \

				           "# define __NR_memfd_create 319 /* Taken from <asm/unistd_64.h> */" \

				           "#endif" \

				           "" \

				           "#ifndef SYS_memfd_create" \

				           "# define SYS_memfd_create __NR_memfd_create" \

				           "#endif" \

				           "" \

				           "#endif" > sys/syscall.h

				    fi

				  - pip3 install --user meson

				  - pip3 install --user mako

				script:

				  - if test "x$BUILD" = xmake; then

				      export CFLAGS="$CFLAGS -isystem`pwd`";

				      mkdir build &&

				      cd build &&

				      ../autogen.sh

				        --enable-autotools

				        --enable-debug

				        $LIBUNWIND_FLAGS

				        $DRI_LOADERS

				        --with-dri-drivers=$DRI_DRIVERS

				        $GALLIUM_ST

				        --with-gallium-drivers=$GALLIUM_DRIVERS

				        --with-vulkan-drivers=$VULKAN_DRIVERS

				        --disable-llvm-shared-libs

				        &&

				      make && eval $MAKE_CHECK_COMMAND;

				    fi

				  - if test "x$BUILD" = xscons; then

				      scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;

				    fi

				  - |

				    if test "x$BUILD" = xmeson; then

				      if test -n "$LLVM_CONFIG"; then

				        # We need to control the version of llvm-config we're using, so we'll

				        # generate a native file to do so. This requires meson >=0.49

				        #

				        echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file

				        $LLVM_CONFIG --version

				      else

				        : > native.file

				      fi

				      export CFLAGS="$CFLAGS -isystem`pwd`"

				      meson _build \

				                   --native-file=native.file \

				                   -Dbuild-tests=true \

				                   -Dlibunwind=${UNWIND} \

				                   ${DRI_LOADERS} \

				                   -Ddri-drivers=${DRI_DRIVERS:-[]} \

				                   ${GALLIUM_ST} \

				                   -Dgallium-drivers=${GALLIUM_DRIVERS:-[]} \

				                   -Dvulkan-drivers=${VULKAN_DRIVERS:-[]}

				      meson configure _build

				      ninja -C _build

				      ninja -C _build test

				    fi

				  - meson _build

				      -Dbuild-tests=true

				      -Dplatforms=x11

				      -Dgallium-drivers=swrast

				  - ninja -C _build

				  - ninja -C _build test

									
										3

Android.common.mk
									
				@@ -32,13 +32,14 @@ LOCAL_C_INCLUDES += \

				MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)

				LOCAL_CFLAGS += \

					-Wno-error \

					-Werror=incompatible-pointer-types \

					-Wno-unused-parameter \

					-Wno-pointer-arith \

					-Wno-missing-field-initializers \

					-Wno-initializer-overrides \

					-Wno-mismatched-tags \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"

					-DPACKAGE_BUGREPORT=\"https://gitlab.freedesktop.org/mesa/mesa/issues\"

				# XXX: The following __STDC_*_MACROS defines should not be needed.

				# It's likely due to a bug elsewhere, but let's temporarily add them

									
										15

Android.mk
									
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				#   gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv

				#   gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv iris lima

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -59,7 +59,9 @@ gallium_drivers := \

					vmwgfx.HAVE_GALLIUM_VMWGFX \

					vc4.HAVE_GALLIUM_VC4 \

					virgl.HAVE_GALLIUM_VIRGL \

					etnaviv.HAVE_GALLIUM_ETNAVIV

					etnaviv.HAVE_GALLIUM_ETNAVIV \

					iris.HAVE_GALLIUM_IRIS \

					lima.HAVE_GALLIUM_LIMA

				ifeq ($(BOARD_GPU_DRIVERS),all)

				MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))

				@@ -96,18 +98,19 @@ define mesa-build-with-llvm

				  $(if $(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5), \

				    $(warning Unsupported LLVM version in Android $(MESA_ANDROID_MAJOR_VERSION)),) \

				  $(if $(filter 6,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_PATCH=0)) \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_STRING=\"3.7\")) \

				  $(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_PATCH=0)) \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_STRING=\"3.8\")) \

				  $(if $(filter 8,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)) \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_STRING=\"3.9\")) \

				  $(if $(filter P,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)) \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_STRING=\"3.9\")) \

				  $(eval LOCAL_SHARED_LIBRARIES += libLLVM)

				endef

				# add subdirectories

				SUBDIRS := \

					src/freedreno \

					src/gbm \

					src/loader \

					src/mapi \

									
										92

Makefile.am
									
				@@ -1,92 +0,0 @@

				# Copyright © 2012 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				SUBDIRS = src

				AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-dri \

					--enable-dri3 \

					--enable-egl \

					--enable-gallium-tests \

					--enable-gallium-osmesa \

					--enable-llvm \

					--enable-gbm \

					--enable-gles1 \

					--enable-gles2 \

					--enable-glx \

					--enable-glx-tls \

					--enable-nine \

					--enable-opencl \

					--enable-opencl-icd \

					--enable-opengl \

					--enable-va \

					--enable-vdpau \

					--enable-xa \

					--enable-xvmc \

					--enable-llvm-shared-libs \

					--enable-libunwind \

					--with-platforms=x11,wayland,drm,surfaceless \

					--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \

					--with-gallium-drivers=i915,nouveau,r300,kmsro,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv \

					--with-vulkan-drivers=intel,radeon

				ACLOCAL_AMFLAGS = -I m4

				EXTRA_DIST = \

					autogen.sh \

					common.py \

					docs \

					doxygen \

					bin/git_sha1_gen.py \

					scons \

					SConstruct \

					build-support/conftest.dyn \

					build-support/conftest.map \

					meson.build \

					meson_options.txt \

					bin/meson.build \

					include/meson.build \

					bin/install_megadrivers.py \

					bin/meson_get_version.py

				noinst_HEADERS = \

					include/c99_alloca.h \

					include/c99_compat.h \

					include/c99_math.h \

					include/c11 \

					include/drm-uapi/drm.h \

					include/drm-uapi/drm_fourcc.h \

					include/drm-uapi/drm_mode.h \

					include/drm-uapi/i915_drm.h \

					include/drm-uapi/tegra_drm.h \

					include/drm-uapi/v3d_drm.h \

					include/drm-uapi/vc4_drm.h \

					include/D3D9 \

					include/GL/wglext.h \

					include/HaikuGL \

					include/no_extern_c.h \

					include/pci_ids \

					include/vulkan

				# We list some directories in EXTRA_DIST, but don't actually want to include

				# the .gitignore files in the tarball.

				dist-hook:

					find $(distdir) -name .gitignore -exec $(RM) {} +

									
										19

README.rst
									
				@@ -9,25 +9,6 @@ This repository lives at https://gitlab.freedesktop.org/mesa/mesa.

				Other repositories are likely forks, and code found there is not supported.

				Build status

				------------

				Travis:

				.. image:: https://travis-ci.org/mesa3d/mesa.svg?branch=master

				    :target: https://travis-ci.org/mesa3d/mesa

				Appveyor:

				.. image:: https://img.shields.io/appveyor/ci/mesa3d/mesa.svg

				    :target: https://ci.appveyor.com/project/mesa3d/mesa

				Coverity:

				.. image:: https://scan.coverity.com/projects/139/badge.svg?flat=1

				    :target: https://scan.coverity.com/projects/mesa

				Build & install

				---------------

8

REVIEWERS


@@ -94,14 +94,6 @@ GALLIUM TARGETS
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/gallium/targets/
 AUTOCONF BUILD
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: autogen.sh
 F: configure.ac
 F: */Automake.inc
 F: */Makefile.*am
 F: */Makefile.sources
 SCONS BUILD
 F: scons/
 F: */SConscript*

									
										2

SConstruct
									
												
				@@ -73,7 +73,7 @@ with open("VERSION") as f:

				  mesa_version = f.read().strip()

				env.Append(CPPDEFINES = [

				    ('PACKAGE_VERSION', '\\"%s\\"' % mesa_version),

				    ('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),

				    ('PACKAGE_BUGREPORT', '\\"https://gitlab.freedesktop.org/mesa/mesa/issues\\"'),

				])

				# Includes

2

VERSION


@@ -1 +1 @@
 .0.0-devel
 19.1.8

									
										14

autogen.sh
									
											
				@@ -1,14 +0,0 @@

				#! /bin/sh

				srcdir=`dirname "$0"`

				test -z "$srcdir" && srcdir=.

				ORIGDIR=`pwd`

				cd "$srcdir"

				autoreconf --force --verbose --install || exit 1

				cd "$ORIGDIR" || exit $?

				if test -z "$NOCONFIGURE"; then

				    "$srcdir"/configure "$@"

				fi

44

bin/.cherry-ignore Normal file


@@ -0,0 +1,44 @@
 # fixes: The following commits do not apply cleanly on 19.1 branch, as they
 #        depend on other commits not present in the branch.
 b00e1ff24f974bc99e7ca9a720518da0ce5b89 panfrost: Make ctx->job useful
 f6c44549ee2dd0f218deea1feba3965523609406 iris: Replace devinfo->gen with GEN_GEN
 cd13ccee7bc2733e7a56284dc02bdb1b1c40081 iris: Update fast clear colors on Gen9 with direct immediate writes.
 fe55256c78ede507d75d4665d73936ea7db31 nir/opt_large_constants: Handle store writemasks
 # fixes: The following commit depends on commits 77a1070d366a and df4c2ec5e19b
 #        in order to compile, which did not land in the branch.
 d799250346331a93b21678dc5605cff74dfa3a1 iris: Avoid unnecessary resolves on transfer maps
 # stable: Explicit 19.2 only nominations.
 e73d863a66caac796ed5fb543a77f0b892df8573 radv: allow to enable VK_AMD_shader_ballot only on GFX8+
 f202ac27a99caf9009aa9d60e2e0d7f3b528e99f radv: add a new debug option called RADV_DEBUG=noshaderballot
 a6ad9e8ccf970a0da68508eb2ce26b316045b9f0 radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood
 c27d8d4a7e9372a8a86d970b598fc4e3bfd1 radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0
 a4e6e59db82e61b47ef905f28dde80ae36a67d35 radv/gfx10: do not use NGG with NAVI14
 fe0ec41c4d36fd5a82e7579d89e34cce7423c4e5 radv: Change memory type order for GPUs without dedicated VRAM
 adf0d00c6b5506ed2206b950336bdc568d2247 radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts
 d95afd8b9e7f9b3880813203292257bf0ed7babf radeonsi/gfx10: fix wave occupancy computations
 d5f11ab345b05759c22acbcd2f79928311689e3 radv: store engine name
 dc6074cf7f651b720868e0ba24362b585d1b31 driconfig: add a new engine name/version parameter
 b7ac90cf4f86bb409d34101e3a3cceac8cbe vulkan: add vk_x11_strict_image_count option
 f195414a2e89bd9f549dacc04365f67e5bd110 radeonsi: add Navi12 PCI ID
 f833b4cada07b746a10ffa4d93fcd821920c3cb1 docs: Update to OpenGL 4.6 in the release notes
 68820007fddbb5b79f1b2b08e66ef14092053a95 radv: fix loading 64-bit GS inputs
 b0e0d7e0f2353d337e68e8e439b5dfead880c4 docs: Add the maximum implemented Vulkan API version in 19.2 rel notes
 b698136c5ef0ef1a15cb6fbff13cbc4ceb3881 amd: add more PCI IDs for Navi14
 de601a8afea1e5f99637f5823a97ca21915 ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir
 c0938bece83cd37365c30c35d2d54927f3fe0cd radeonsi/gfx10: fix L2 cache rinse programming
 d97013294816db46abb7d1e7c6871fe73dfac93 ac: fix incorrect vram_size reported by the kernel
 cbe83445b2ec78fab1f303918c79268713500b5 ac: add radeon_info::tcc_harvested
 ebe91633e7f47518118983e0e6f5c632b25a4 radeonsi/gfx10: fix corruption for chips with harvested TCCs
 b7c2f7c5a6b21bccb7847ab03b7fba5c770e131c ac: fix num_good_cu_per_sh for harvested chips
 # stable: Explicit 19.3 only nominations.
 f2aa6ccd0b226eebe2c1a46281160b0a54d522 docs: Add the maximum implemented Vulkan API version in 19.3 rel notes
 # revert: The following commit was requested to be removed from stable branch by original author.
 dcc0e23438f3e5929c2ef74d57e8207be25ecb41 Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"
 # fixes: The following commit was reverted later
 c73988300f943e185a50aaba015f2f114ffcb262 util: added missing headers in anon-file
 # fixes: The following commit depends on commit e1dc3ab75348 in order to
 #        compile, which did not land in the branch.
 ad3d8b178c0d8939db62ac2be9fdc98d127742d radv: Fix condition for skipping the continue CS.
 # revert: The following commit was explicitly requested to be removed from the
 #         branch.
 43041627445540afda1a05d11861935963660344 Revert "radv: disable viewport clamping even if FS doesn't write Z"

9

bin/.gitignore vendored


@@ -1,9 +0,0 @@
 config.guess
 config.sub
 install-sh
 /depcomp
 /missing
 ylwrap
 compile
 ar-lib
 /test-driver

									
										6

bin/get-pick-list.sh
									
												
				@@ -13,12 +13,12 @@

				is_stable_nomination()

				{

					git show --summary "$1" | grep -q -i -o "CC:.*mesa-stable"

					git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-stable"

				}

				is_typod_nomination()

				{

					git show --summary "$1" | grep -q -i -o "CC:.*mesa-dev"

					git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-dev"

				}

				fixes=

				@@ -32,7 +32,7 @@ is_sha_nomination()

				{

					fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \

						sed -e 's/'"$2"'/\nfixes:/Ig' | \

						grep -Eo 'fixes:[a-f0-9]{8,40}'`

						grep -Eo 'fixes:[a-f0-9]{4,40}'`

					fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`

					if test $fixes_count -eq 0; then
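
The hunk above lowers the minimum length of an abbreviated SHA accepted after a `fixes:` tag from 8 to 4 hex digits. A minimal Python sketch of the effect follows; it is an illustration only, not part of the script. The helper name `find_fixes`, the sample message, and the hash `abc1234` are hypothetical, and the tag handling is simplified compared with the `sed`/`grep` pipeline in `bin/get-pick-list.sh`.

```python
import re


def find_fixes(message, min_len):
    """Return the SHAs that follow a 'fixes:' tag in a commit message.

    min_len mirrors the lower bound in the grep pattern
    'fixes:[a-f0-9]{min_len,40}'. The tag is matched case-insensitively,
    the hash itself must be lowercase hex (as in the grep expression).
    """
    pattern = re.compile(r'(?i:fixes:\s*)([a-f0-9]{%d,40})' % min_len)
    return pattern.findall(message)


# Hypothetical commit message using a 7-character abbreviated hash.
message = 'Fixes: abc1234 ("example subject")'

old_matches = find_fixes(message, 8)  # bound before the change
new_matches = find_fixes(message, 4)  # bound after the change
print(old_matches, new_matches)
```

With the old bound a 7-character abbreviation is not recognised as a nomination; with the new bound it is.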

									
										15

bin/install_megadrivers.py
									
												
				@@ -24,7 +24,6 @@

				from __future__ import print_function

				import argparse

				import os

				import shutil

				def main():

				@@ -35,7 +34,11 @@ def main():

				    args = parser.parse_args()

				    if os.path.isabs(args.libdir):

				        to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])

				        destdir = os.environ.get('DESTDIR')

				        if destdir:

				            to = os.path.join(destdir, args.libdir[1:])

				        else:

				            to = args.libdir

				    else:

				        to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)

				@@ -45,7 +48,6 @@ def main():

				        if os.path.lexists(to):

				            os.unlink(to)

				        os.makedirs(to)

				    shutil.copy(args.megadriver, master)

				    for driver in args.drivers:

				        abs_driver = os.path.join(to, driver)

				@@ -67,7 +69,14 @@ def main():

				                name, ext = os.path.splitext(name)

				        finally:

				            os.chdir(ret)

				    # Remove meson-created master .so and symlinks

				    os.unlink(master)

				    name, ext = os.path.splitext(master)

				    while ext != '.so':

				        if os.path.lexists(name):

				            os.unlink(name)

				        name, ext = os.path.splitext(name)

				if __name__ == '__main__':
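
The destination logic changed in the second hunk above can be isolated as a small function. This is a sketch under assumptions: the function name `resolve_libdir` and the explicit `environ` parameter are hypothetical (the script reads `os.environ` directly), and the paths are POSIX examples.

```python
import os


def resolve_libdir(libdir, environ):
    """Return the directory the megadriver should be installed into."""
    if os.path.isabs(libdir):
        destdir = environ.get('DESTDIR')
        if destdir:
            # Drop the leading separator so join() keeps destdir as the root.
            return os.path.join(destdir, libdir[1:])
        # No staging directory (unset or empty): install to libdir itself.
        return libdir
    # Relative libdir: resolve against the prefix exported by meson.
    return os.path.join(environ['MESON_INSTALL_DESTDIR_PREFIX'], libdir)


print(resolve_libdir('/usr/lib/dri', {}))
print(resolve_libdir('/usr/lib/dri', {'DESTDIR': '/tmp/stage'}))
print(resolve_libdir('lib/dri', {'MESON_INSTALL_DESTDIR_PREFIX': '/tmp/stage/usr'}))
```

One case the new branch appears to cover is an empty `DESTDIR`: `os.path.join('', 'usr/lib/dri')` yields a relative path, whereas the explicit `if destdir:` check falls back to the absolute `libdir`.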

									
										63

bin/meson-options.py
									
										Executable file
									
												
				@@ -0,0 +1,63 @@

				#!/usr/bin/env python3

				from os import get_terminal_size

				from textwrap import wrap

				from mesonbuild import coredata

				from mesonbuild import optinterpreter

				(COLUMNS, _) = get_terminal_size()

				def describe_option(option_name: str, option_default_value: str,

				                    option_type: str, option_message: str) -> None:

				    print('name:    ' + option_name)

				    print('default: ' + option_default_value)

				    print('type:    ' + option_type)

				    for line in wrap(option_message, width=COLUMNS - 9):

				        print('         ' + line)

				    print('---')

				oi = optinterpreter.OptionInterpreter('')

				oi.process('meson_options.txt')

				for (name, value) in oi.options.items():

				    if isinstance(value, coredata.UserStringOption):

				        describe_option(name,

				                        value.value,

				                        'string',

				                        "You can type what you want, but make sure it makes sense")

				    elif isinstance(value, coredata.UserBooleanOption):

				        describe_option(name,

				                        'true' if value.value else 'false',

				                        'boolean',

				                        "You can set it to 'true' or 'false'")

				    elif isinstance(value, coredata.UserIntegerOption):

				        describe_option(name,

				                        str(value.value),

				                        'integer',

				                        "You can set it to any integer value between '{}' and '{}'".format(value.min_value, value.max_value))

				    elif isinstance(value, coredata.UserUmaskOption):

				        describe_option(name,

				                        str(value.value),

				                        'umask',

				                        "You can set it to 'preserve' or a value between '0000' and '0777'")

				    elif isinstance(value, coredata.UserComboOption):

				        choices = '[' + ', '.join(["'" + v + "'" for v in value.choices]) + ']'

				        describe_option(name,

				                        value.value,

				                        'combo',

				                        "You can set it to any one of those values: " + choices)

				    elif isinstance(value, coredata.UserArrayOption):

				        choices = '[' + ', '.join(["'" + v + "'" for v in value.choices]) + ']'

				        value = '[' + ', '.join(["'" + v + "'" for v in value.value]) + ']'

				        describe_option(name,

				                        value,

				                        'array',

				                        "You can set it to one or more of those values: " + choices)

				    elif isinstance(value, coredata.UserFeatureOption):

				        describe_option(name,

				                        value.value,

				                        'feature',

				                        "You can set it to 'auto', 'enabled', or 'disabled'")

				    else:

				        print(name + ' is an option of a type unknown to this script')

				        print('---')

									
										12

common.py
									
												
				@@ -17,6 +17,9 @@ import SCons.Script.SConscript

				host_platform = _platform.system().lower()

				if host_platform.startswith('cygwin'):

				    host_platform = 'cygwin'

				# MSYS2 default platform selection.

				if host_platform.startswith('mingw'):

				    host_platform = 'windows'

				# Search sys.argv[] for a "platform=foo" argument since we don't have

				# an 'env' variable at this point.

				@@ -49,9 +52,18 @@ if 'PROCESSOR_ARCHITECTURE' in os.environ:

				else:

				    host_machine = _platform.machine()

				host_machine = _machine_map.get(host_machine, 'generic')

				# MSYS2 default machine selection.

				if _platform.system().lower().startswith('mingw') and 'MSYSTEM' in os.environ:

				    if os.environ['MSYSTEM'] == 'MINGW32':

				        host_machine = 'x86'

				    if os.environ['MSYSTEM'] == 'MINGW64':

				        host_machine = 'x86_64'

				default_machine = host_machine

				default_toolchain = 'default'

				# MSYS2 default toolchain selection.

				if _platform.system().lower().startswith('mingw'):

				    default_toolchain = 'mingw'

				if target_platform == 'windows' and host_platform != 'windows':

				    default_machine = 'x86'
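
The three MSYS2 additions in `common.py` can be read together as one decision. The sketch below restates them as a pure function for illustration; the name `msys2_defaults`, its parameters, and the sample system string `MINGW64_NT-10.0` are assumptions, not code from the repository.

```python
def msys2_defaults(system, environ, host_machine='generic'):
    """Return (host_platform, host_machine, default_toolchain).

    system stands in for platform.system(); environ for os.environ.
    """
    system = system.lower()
    host_platform = system
    default_toolchain = 'default'
    if system.startswith('mingw'):
        # Under an MSYS2 MinGW shell, treat the host as Windows
        # and default to the mingw toolchain.
        host_platform = 'windows'
        default_toolchain = 'mingw'
        # MSYSTEM selects between the 32-bit and 64-bit environments.
        msystem = environ.get('MSYSTEM')
        if msystem == 'MINGW32':
            host_machine = 'x86'
        elif msystem == 'MINGW64':
            host_machine = 'x86_64'
    return host_platform, host_machine, default_toolchain


print(msys2_defaults('MINGW64_NT-10.0', {'MSYSTEM': 'MINGW64'}))
print(msys2_defaults('Linux', {}))
```

On any other host the function leaves the machine and toolchain defaults untouched, which matches the way the diff only adds `mingw`-prefixed branches.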

3378

configure.ac


File diff suppressed because it is too large.

									
										2

docs/application-issues.html
									
												
				@@ -62,9 +62,11 @@ older than the given year.

				<p>

				For example, if the game was released in 2001, do

				</p>

				<pre>

				export MESA_EXTENSION_MAX_YEAR=2001

				</pre>

				<p>

				before running the game.

				</p>

									
										270

docs/autoconf.html
									
											
				@@ -1,270 +0,0 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Compilation and Installation using Autoconf</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Compilation and Installation using Autoconf</h1>

				<ol>

				<li><p><a href="#basic">Basic Usage</a></li>

				<li><p><a href="#driver">Driver Options</a>

				  <ul>

				  <li><a href="#xlib">Xlib Driver Options</a></li>

				  <li><a href="#dri">DRI Driver Options</a></li>

				  <li><a href="#osmesa">OSMesa Driver Options</a></li>

				  </ul>

				</ol>

				<h2>ATTENTION:</h2>

				<p>

				    The autotools build is being replaced by the <a href="meson.html">meson</a>

				    build system. If you haven't yet now is a good time to try using meson and

				    report any issues you run into.

				</p>

				<h2 id="basic">1. Basic Usage</h2>

				<p>

				The autoconf generated configure script can be used to guess your

				platform and change various options for building Mesa. To use the

				configure script, type:

				</p>

				<pre>

				    ./configure

				</pre>

				<p>

				To see a short description of all the options, type <code>./configure

				--help</code>. If you are using a development snapshot and the configure

				script does not exist, type <code>./autogen.sh</code> to generate it

				first. If you know the options you want to pass to

				<code>configure</code>, you can pass them to <code>autogen.sh</code>. It

				will run <code>configure</code> with these options after it is

				generated. Once you have run <code>configure</code> and set the options

				to your preference, type:

				</p>

				<pre>

				    make

				</pre>

				<p>

				This will produce libGL.so and/or several other libraries depending on the

				options you have chosen. Later, if you want to rebuild for a different

				configuration run <code>make realclean</code> before rebuilding.

				</p>

				<p>

				Some of the generic autoconf options are used with Mesa:

				</p>

				<dl>

				<dt><code>--prefix=PREFIX</code></dt>

				<dd><p>This is the root directory where

				files will be installed by <code>make install</code>. The default is

				<code>/usr/local</code>.</p>

				</dd>

				<dt><code>--exec-prefix=EPREFIX</code></dt>

				<dd><p>This is the root directory

				where architecture-dependent files will be installed. In Mesa, this is

				only used to derive the directory for the libraries. The default is

				<code>${prefix}</code>.</p>

				</dd>

				<dt><code>--libdir=LIBDIR</code></dt>

				<dd><p>This option specifies the directory

				where the GL libraries will be installed. The default is

				<code>${exec_prefix}/lib</code>. It also serves as the name of the

				library staging area in the source tree. For instance, if the option

				<code>--libdir=/usr/local/lib64</code> is used, the libraries will be

				created in a <code>lib64</code> directory at the top of the Mesa source

				tree.</p>

				</dd>

				<dt><code>--sysconfdir=DIR</code></dt>

				<dd><p>This option specifies the directory where the configuration

				files will be installed. The default is <code>${prefix}/etc</code>.

				Currently there's only one config file provided when dri drivers are

				enabled - it's <code>drirc</code>.</p>

				</dd>

				<dt><code>--datadir=DIR</code></dt>

				<dd><p>This option specifies the directory where the data files will

				be installed. The default is <code>${prefix}/share</code>.

				Currently when dri drivers are enabled, <code>drirc.d/</code> is at

				this place.</p>

				</dd>

				<dt><code>--enable-static, --disable-shared</code></dt>

				<dd><p>By default, Mesa

				will build shared libraries. Either of these options will force static

				libraries to be built. It is not currently possible to build static and

				shared libraries in a single pass.</p>

				</dd>

				<dt><code>CC, CFLAGS, CXX, CXXFLAGS</code></dt>

				<dd><p>These environment variables

				control the C and C++ compilers used during the build. By default,

				<code>gcc</code> and <code>g++</code> are used and the debug/optimisation

				level is left unchanged.</p>

				</dd>

				<dt><code>LDFLAGS</code></dt>

				<dd><p>An environment variable specifying flags to

				pass when linking programs. These should be empty and

				<code>PKG_CONFIG_PATH</code> is recommended to be used instead. If needed

				it can be used to direct the linker to use libraries in nonstandard

				directories. For example, <code>LDFLAGS="-L/usr/X11R6/lib"</code>.</p>

				</dd>

				<dt><code>PKG_CONFIG_PATH</code></dt>

				<dd><p>The

				<code>pkg-config</code> utility is a hard requirement for configuring and

				building mesa. It is used to search for external libraries

				on the system. This environment variable is used to control the search

				path for <code>pkg-config</code>. For instance, setting

				<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for

				package metadata in <code>/usr/X11R6</code> before the standard

				directories.</p>

				</dd>

				</dl>

				<p>

				There are also a few general options for altering the Mesa build:

				</p>

				<dl>

				<dt><code>--enable-debug</code></dt>

				<dd><p>This option will set the compiler debug/optimisation levels (if the user

				hasn't already set them via the CFLAGS/CXXFLAGS) and macros to aid in

				debugging the Mesa libraries.</p>

				<p>Note that enabling this option can lead to noticeable loss of performance.</p>

				<dt><code>--disable-asm</code></dt>

				<dd><p>There are assembly routines

				available for a few architectures. These will be used by default if

				one of these architectures is detected. This option ensures that

				assembly will not be used.</p>

				</dd>

				<dt><code>--build=</code></dt>

				<dt><code>--host=</code></dt>

				<dd><p>By default, the build will compile code for the architecture that

				it's running on. In order to build cross-compile Mesa on a x86-64 machine

				that is to run on a i686, one would need to set the options to:</p>

				<p><code>--build=x86_64-pc-linux-gnu --host=i686-pc-linux-gnu</code></p>

				Note that these can vary from distribution to distribution. For more

				information check with the

				<a href="https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Specifying-Target-Triplets.html">

				autoconf manual</a>.

				Note that you will need to correctly set <code>PKG_CONFIG_PATH</code> as well.

				<p>In some cases a single compiler is capable of handling both architectures

				(multilib) in that case one would need to set the <code>CC,CXX</code> variables

				appending the correct machine options. Seek your compiler documentation for

				further information -

				<a href="https://gcc.gnu.org/onlinedocs/gcc/Submodel-Options.html"> gcc

				machine dependent options</a></p>

				<p>In addition to specifying correct <code>PKG_CONFIG_PATH</code> for the target

				architecture, the following should be sufficient to configure multilib Mesa</p>

				<code>./configure CC="gcc -m32" CXX="g++ -m32" --build=x86_64-pc-linux-gnu --host=i686-pc-linux-gnu ...</code>

				</dd>

				</dl>

				<h2 id="driver">2. GL Driver Options</h2>

				<p>

				There are several different driver modes that Mesa can use. These are

				described in more detail in the <a href="install.html">basic

				installation instructions</a>. The Mesa driver is controlled through the

				configure options <code>--enable-glx</code> and <code>--enable-osmesa</code>

				</p>

				<h3 id="xlib">Xlib</h3><p>

				It uses Xlib as a software renderer to do all rendering. It corresponds

				to the option <code>--enable-glx=xlib</code> or <code>--enable-glx=gallium-xlib</code>.

				<h3 id="dri">DRI</h3><p>This mode uses the DRI hardware drivers for

				accelerated OpenGL rendering. To enable use <code>--enable-glx=dri

				--enable-dri</code>.

				<!-- DRI specific options -->

				<dl>

				<dt><code>--with-dri-driverdir=DIR</code>

				<dd><p> This option specifies the

				location the DRI drivers will be installed to and the location libGL

				will search for DRI drivers. The default is <code>${libdir}/dri</code>.

				<dt><code>--with-dri-drivers=DRIVER,DRIVER,...</code>

				<dd><p> This option

				allows a specific set of DRI drivers to be built. For example,

				<code>--with-dri-drivers="swrast,i965,radeon,nouveau"</code>. By

				default, the drivers will be chosen depending on the target platform.

				See the directory <code>src/mesa/drivers/dri</code> in the source tree

				for available drivers. Beware that the swrast DRI driver is used by both

				libGL and the X.Org xserver GLX module to do software rendering, so you

				may run into problems if it is not available.

				<!-- This explanation might be totally bogus. Kristian? -->

				<dt><code>--disable-driglx-direct</code>

				<dd><p> Disable direct rendering in

				GLX. Normally, direct hardware rendering through the DRI drivers and

				indirect software rendering are enabled in GLX. This option disables

				direct rendering entirely. It can be useful on architectures where

				kernel DRM modules are not available.

				<dt><code>--enable-glx-tls</code> <dd><p>

				Enable Thread Local Storage (TLS) in

				GLX.

				<dt><code>--with-expat=DIR</code>

				<dd><p><strong>DEPRECATED</strong>, use <code>PKG_CONFIG_PATH</code> instead.</p>

				<p>The DRI-enabled libGL uses expat to

				parse the DRI configuration files in <code>${sysconfdir}/drirc</code> and

				<code>~/.drirc</code>. This option allows a specific expat installation

				to be used. For example, <code>--with-expat=/usr/local</code> will

				search for expat headers and libraries in <code>/usr/local/include</code>

				and <code>/usr/local/lib</code>, respectively.

				</dl>

				<h3 id="osmesa">OSMesa </h3><p> No libGL is built in this

				mode. Instead, the driver code is built into the Off-Screen Mesa

				(OSMesa) library. See the <a href="osmesa.html">Off-Screen Rendering</a>

				page for more details.  It corresponds to the option

				<code>--enable-osmesa</code>.

				<!-- OSMesa specific options -->

				<dl>

				<dt><code>--with-osmesa-bits=BITS</code>

				<dd><p> This option allows the size

				of the color channel in bits to be specified. By default, an 8-bit

				channel will be used, and the driver will be named libOSMesa. Other

				options are 16- and 32-bit color channels, which will add the bit size

				to the library name. For example, <code>--with-osmesa-bits=16</code>

				will create the libOSMesa16 library with a 16-bit color channel.

				</dl>

				<h2 id="library">3. Library Options</h2>

				<p>

				The configure script provides more fine grained control over the libraries

				that will be built.

				</div>

				</body>

				</html>

									
										6

docs/bugs.html
									
												
				@@ -14,7 +14,7 @@

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Bug Database</h1>

				<h1>Report a bug</h1>

				<p>

				The Mesa bug database is hosted on

				@@ -24,8 +24,8 @@ The old bug database on SourceForge is no longer used.

				<p>

				To file a Mesa bug, go to

				<a href="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa">

				Bugzilla on freedesktop.org</a>

				<a href="https://gitlab.freedesktop.org/mesa/mesa/issues">

				GitLab on freedesktop.org</a>

				</p>

				<p>

									
										1

docs/codingstyle.html
									
												
				@@ -135,7 +135,6 @@ should prefer the use of <tt>bool</tt>, <tt>true</tt>, and

				src/mesa/state_tracker/st_glsl_to_tgsi.cpp can serve as examples.

				</ul>

				</p>

				</div>

				</body>

									
										29

docs/contents.html
									
												
				@@ -12,6 +12,10 @@

				      background-color: #cccccc;

				      color: black;

				    }

				    h2 {

				      font-size: inherit;

				      font-weight: bold;

				    }

				    a:link {

				      color: #000;

				    }

				@@ -23,7 +27,7 @@

				</head>

				<body>

				<b>Documentation</b>

				<h2>Documentation</h2>

				<ul>

				<li><a href="intro.html" target="_parent">Introduction</a>

				<li><a href="index.html" target="_parent">News</a>

				@@ -37,27 +41,26 @@

				<li>more docs below...

				</ul>

				<b>Download / Install</b>

				<h2>Download / Install</h2>

				<ul>

				<li><a href="download.html" target="_parent">Downloading / Unpacking</a>

				<li><a href="install.html" target="_parent">Compiling / Installing</a>

				  <ul>

				    <li><a href="autoconf.html" target="_parent">Autoconf</a></li>

				    <li><a href="meson.html" target="_parent">Meson</a></li>

				  </ul>

				</li>

				<li><a href="precompiled.html" target="_parent">Precompiled Libraries</a>

				</ul>

				<b>Resources</b>

				<h2>Need help?</h2>

				<ul>

				<li><a href="lists.html" target="_parent">Mailing Lists</a>

				<li><a href="bugs.html" target="_parent">Bug Database</a>

				<li><a href="bugs.html" target="_parent">Report a bug</a>

				<li><a href="webmaster.html" target="_parent">Webmaster</a>

				<li><a href="https://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>

				</ul>

				<b>User Topics</b>

				<h2>User Topics</h2>

				<ul>

				<li><a href="shading.html" target="_parent">Shading Language</a>

				<li><a href="egl.html" target="_parent">EGL</a>

				@@ -67,7 +70,6 @@

				<li><a href="debugging.html" target="_parent">Debugging Tips</a>

				<li><a href="perf.html" target="_parent">Performance Tips</a>

				<li><a href="extensions.html" target="_parent">Mesa Extensions</a>

				<li><a href="mangling.html" target="_parent">GL Function Name Mangling</a>

				<li><a href="llvmpipe.html" target="_parent">Gallium llvmpipe driver</a>

				<li><a href="vmware-guest.html" target="_parent">VMware SVGA3D guest driver</a>

				<li><a href="postprocess.html" target="_parent">Gallium post-processing</a>

				@@ -75,7 +77,7 @@

				<li><a href="viewperf.html" target="_parent">Viewperf Issues</a>

				</ul>

				<b>Developer Topics</b>

				<h2>Developer Topics</h2>

				<ul>

				<li><a href="repository.html" target="_parent">Source Code Repository</a>

				<li><a href="sourcetree.html" target="_parent">Source Code Tree</a>

				@@ -90,7 +92,7 @@

				<li><a href="dispatch.html" target="_parent">GL Dispatch</a>

				</ul>

				<b>Links</b>

				<h2>Links</h2>

				<ul>

				<li><a href="https://www.opengl.org" target="_parent">OpenGL website</a>

				<li><a href="https://dri.freedesktop.org" target="_parent">DRI website</a>

				@@ -98,11 +100,10 @@

				<li><a href="https://planet.freedesktop.org" target="_parent">Developer blogs</a>

				</ul>

				<b>Hosted by:</b>

				<br>

				<blockquote>

				<a href="https://freedesktop.org" target="_parent">freedesktop.org</a>

				</blockquote>

				<h2>Hosted by:</h2>

				<dl>

				<dd><a href="https://freedesktop.org" target="_parent">freedesktop.org</a>

				</dl>

				</body>

				</html>

									
										8

docs/debugging.html
									
												
				@@ -26,12 +26,8 @@

				</p>

				<p>

				   More extensive error checking is done when Mesa is compiled with the

				   DEBUG symbol defined.  You'll have to edit the Make-config file and

				   add -DDEBUG to the CFLAGS line for your system configuration.  You may

				   also want to replace any optimization flags with the -g flag so you can

				   use your debugger.  After you've edited Make-config type 'make clean'

				   before recompiling.

				   More extensive error checking is done in DEBUG builds

				   (<code>--buildtype debug</code> for meson, <code>build=debug</code> for scons).

				</p>

				<p>

				   In your debugger you can set a breakpoint in _mesa_error() to trap Mesa

									
										4

docs/devinfo.html
									
												
				@@ -25,6 +25,7 @@

				<p>

				To add a new GL extension to Mesa you have to do at least the following.

				</p>

				<ul>

				<li>

				@@ -70,10 +71,9 @@ To add a new GL extension to Mesa you have to do at least the following.

				<li>

				   The dispatch tests check_table.cpp and dispatch_sanity.cpp

				   should be updated with details about the new extensions functions. These

				   tests are run using 'make check'

				   tests are run using 'meson test'

				</li>

				</ul>

				</p>

									
										2

docs/dispatch.html
									
												
				@@ -134,7 +134,7 @@ the common case.</p>

				<tr><td><pre>

				#define GET_DISPATCH() \

				    (_glapi_Dispatch != NULL) \

				        ? _glapi_Dispatch : pthread_getspecific(&_glapi_Dispatch_key)

				        ? _glapi_Dispatch : pthread_getspecific(&amp;_glapi_Dispatch_key)

				</pre></td></tr>

				<tr><td>Improved <tt>GET_DISPATCH</tt> Implementation</td></tr></table>

				</blockquote>

									
										20

docs/download.html
									
												
				@@ -46,34 +46,18 @@ Mesa releases are available in two formats: <tt>.tar.xz</tt> and <tt>.tar.gz</tt

				<p>

				To unpack the tarball:

				</p>

				<pre>

					tar xf mesa-Y.N.P.tar.xz

				</pre>

				or

				<p>or</p>

				<pre>

					tar xf mesa-Y.N.P.tar.gz

				</pre>

				</p>

				<h1>Contents</h1>

				<p>

				After unpacking you'll have these files and directories (among others):

				</p>

				<pre>

				autogen.sh	- Autoconf script for *nix systems

				scons/		- SCons script for Windows builds

				include/	- GL header (include) files

				bin/		- shell scripts for making shared libraries, etc

				docs/		- documentation

				src/		- source code for libraries

				src/mesa	- sources for the main Mesa library and device drivers

				src/gallium     - sources for Gallium and Gallium drivers

				src/glx		- sources for building libGL with full GLX and DRI support

				</pre>

				<p>

				Proceed to the <a href="install.html">compilation and installation

				instructions</a>.

									
										41

docs/egl.html
									
												
				@@ -33,13 +33,16 @@ directly dispatched to the drivers.</p>

				<ol>

				<li>

				<p>Run <code>configure</code> with the desired client APIs and enable

				the driver for your hardware.  For example</p>

				<p>Configure your build with the desired client APIs and enable

				the driver for your hardware.  For example:</p>

				<pre>

				  $ ./configure --enable-gles1 --enable-gles2 \

				                --with-dri-drivers=... \

				                --with-gallium-drivers=...

				$ meson configure \

				        -D egl=true \

				        -D gles1=true \

				        -D gles2=true \

				        -D dri-drivers=... \

				        -D gallium-drivers=...

				</pre>

				<p>The main library and OpenGL is enabled by default.  The first two options

				@@ -61,7 +64,7 @@ or more EGL drivers.</p>

				time</p>

				<dl>

				<dt><code>--enable-egl</code></dt>

				<dt><code>-D egl=true</code></dt>

				<dd>

				<p>By default, EGL is enabled.  When disabled, the main library and the drivers

				@@ -69,19 +72,11 @@ will not be built.</p>

				</dd>

				<dt><code>--with-egl-driver-dir</code></dt>

				<dd>

				<p>The directory EGL drivers should be installed to.  If not specified, EGL

				drivers will be installed to <code>${libdir}/egl</code>.</p>

				</dd>

				<dt><code>--with-platforms</code></dt>

				<dt><code>-D platforms=...</code></dt>

				<dd>

				<p>List the platforms (window systems) to support.  Its argument is a comma

				separated string such as <code>--with-platforms=x11,drm</code>.  It decides

				separated string such as <code>-D platforms=x11,drm</code>.  It decides

				the platforms a driver may support.  The first listed platform is also used by

				the main library to decide the native platform.</p>

				@@ -90,15 +85,15 @@ the main library to decide the native platform.</p>

				and <code>haiku</code>.

				The <code>android</code> platform can either be built as a system

				component, part of AOSP, using <code>Android.mk</code> files, or

				cross-compiled using appropriate <code>configure</code> options.

				The <code>haiku</code> platform can only be built with SCons.

				cross-compiled using appropriate options.

				The <code>haiku</code> platform can only be built with SCons or Meson.

				Unless for special needs, the build system should

				select the right platforms automatically.</p>

				</dd>

				<dt><code>--enable-gles1</code></dt>

				<dt><code>--enable-gles2</code></dt>

				<dt><code>-D gles1=true</code></dt>

				<dt><code>-D gles2=true</code></dt>

				<dd>

				<p>These options enable OpenGL ES support in OpenGL.  The result is one big

				@@ -106,7 +101,7 @@ internal library that supports multiple APIs.</p>

				</dd>

				<dt><code>--enable-shared-glapi</code></dt>

				<dt><code>-D shared-glapi=true</code></dt>

				<dd>

				<p>By default, <code>libGL</code> has its own copy of <code>libglapi</code>.

				@@ -134,9 +129,9 @@ runtime</p>

				<dd>

				<p>This variable specifies the native platform.  The valid values are the same

				as those for <code>--with-platforms</code>.  When the variable is not set,

				as those for <code>-D platforms=...</code>.  When the variable is not set,

				the main library uses the first platform listed in

				<code>--with-platforms</code> as the native platform.</p>

				<code>-D platforms=...</code> as the native platform.</p>

				<p>Extensions like <code>EGL_MESA_drm_display</code> define new functions to

				create displays for non-native platforms.  These extensions are usually used by

									
										3

docs/envvars.html
									
												
				@@ -338,6 +338,9 @@ See src/mesa/state_tracker/st_debug.c for other options.

				for details.

				<li>SVGA_EXTRA_LOGGING - if set, enables extra logging to the vmware.log file,

				such as the OpenGL program's name and command line arguments.

				<li>SVGA_NO_LOGGING - if set, disables logging to the vmware.log file.

				This is useful when using Valgrind because it otherwise crashes when

				initializing the host log feature.

				<li>See the driver code for other, lesser-used variables.

				</ul>

									
										28

docs/faq.html
									
												
				@@ -14,22 +14,18 @@

				<iframe src="contents.html"></iframe>

				<div class="content">

				<center>

				<h1>Mesa Frequently Asked Questions</h1>

				Last updated: 19 September 2018

				</center>

				<br>

				<br>

				<h2>Index</h2>

				<a href="#part1">1. High-level Questions and Answers</a>

				<br>

				<a href="#part2">2. Compilation and Installation Problems</a>

				<br>

				<a href="#part3">3. Runtime / Rendering Problems</a>

				<br>

				<a href="#part4">4. Developer Questions</a>

				<br>

				<ol>

				  <li><a href="#part1">High-level Questions and Answers</a></li>

				  <li><a href="#part2">Compilation and Installation Problems</a></li>

				  <li><a href="#part3">Runtime / Rendering Problems</a></li>

				  <li><a href="#part4">Developer Questions</a></li>

				</ol>

				<br>

				<br>

				@@ -236,22 +232,22 @@ Basically you'll want the following:

				Mesa version number.

				</li></ul>

				<p>

				When configuring Mesa, there are three autoconf options that affect the install

				When configuring Mesa, there are three meson options that affect the install

				location that you should take care with: <code>--prefix</code>,

				<code>--libdir</code>, and <code>--with-dri-driverdir</code>. To install Mesa

				<code>--libdir</code>, and <code>-D dri-drivers-path</code>. To install Mesa

				into the system location where it will be available for all programs to use, set

				<code>--prefix=/usr</code>. Set <code>--libdir</code> to where your Linux

				distribution installs system libraries, usually either <code>/usr/lib</code> or

				<code>/usr/lib64</code>. Set <code>--with-dri-driverdir</code> to the directory

				<code>/usr/lib64</code>. Set <code>-D dri-drivers-path</code> to the directory

				where your Linux distribution installs DRI drivers. To find your system's DRI

				driver directory, try executing <code>find /usr -type d -name dri</code>. For

				example, if the <code>find</code> command listed <code>/usr/lib64/dri</code>,

				then set <code>--with-dri-driverdir=/usr/lib64/dri</code>.

				then set <code>-D dri-drivers-path=/usr/lib64/dri</code>.

				</p>

				<p>

				After determining the correct values for the install location, configure Mesa

				with <code>./configure --prefix=/usr --libdir=xxx --with-dri-driverdir=xxx</code>

				and then install with <code>sudo make install</code>.

				with <code>meson configure --prefix=/usr --libdir=xxx -D dri-drivers-path=xxx</code>

				and then install with <code>sudo ninja install</code>.

				</p>

				<br>

				<br>
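
The egl.html and faq.html hunks above replace autoconf-style flags with meson options one for one. As a rough illustration of that mapping, here is a minimal Python sketch. The helper names (`translate_flag`, `translate_command`) and the tables are hypothetical and are not part of Mesa; the tables list only the pairs that appear in this diff, so this is not a complete autoconf-to-meson conversion.

```python
# Sketch only: covers just the option pairs visible in the hunks above.
# All names here are illustrative assumptions, not Mesa tooling.

# Boolean autoconf switches and the meson option each one became.
FLAG_OPTIONS = {
    "--enable-egl": "egl=true",
    "--enable-gles1": "gles1=true",
    "--enable-gles2": "gles2=true",
    "--enable-shared-glapi": "shared-glapi=true",
}

# Autoconf options that take a value, and the meson option name.
VALUE_OPTIONS = {
    "--with-platforms": "platforms",
    "--with-dri-drivers": "dri-drivers",
    "--with-gallium-drivers": "gallium-drivers",
    "--with-dri-driverdir": "dri-drivers-path",
}

# Options the FAQ text keeps spelled the same way under meson.
UNCHANGED = {"--prefix", "--libdir"}


def translate_flag(flag):
    """Return the meson argument list for one autoconf-style flag."""
    name, sep, value = flag.partition("=")
    if name in UNCHANGED:
        return [flag]
    if not sep and name in FLAG_OPTIONS:
        return ["-D", FLAG_OPTIONS[name]]
    if sep and name in VALUE_OPTIONS:
        return ["-D", f"{VALUE_OPTIONS[name]}={value}"]
    raise ValueError(f"no known meson equivalent for {flag!r}")


def translate_command(flags):
    """Translate a list of autoconf-style flags, preserving order."""
    args = []
    for flag in flags:
        args.extend(translate_flag(flag))
    return args
```

Flags that the diff removes without a replacement, such as `--with-egl-driver-dir`, are deliberately left out of the tables, so the sketch raises `ValueError` for them rather than guessing.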

88

docs/features.txt


@@ -63,7 +63,7 @@ GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (freedreno/a5xx, freedreno (*), llvmpipe (*), softpipe (*), swr (*))
   Multisample anti-aliasing                             DONE (freedreno/a5xx+, freedreno (*), llvmpipe (*), softpipe (*), swr (*))
 (*) freedreno (a2xx-a4xx), llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
@@ -90,7 +90,7 @@ GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno)
   GL_ARB_provoking_vertex (Provoking vertex)            DONE (freedreno)
   GL_ARB_seamless_cube_map (Seamless cubemaps)          DONE (freedreno)
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx)
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx+)
   GL_ARB_depth_clamp (Frag depth clamp)                 DONE (freedreno)
   GL_ARB_sync (Fence objects)                           DONE (freedreno)
   GLX_ARB_create_context_profile                        DONE
@@ -115,20 +115,20 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_draw_buffers_blend                             DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965/gen7+)
   - 'precise' qualifier                                 DONE
   - 'precise' qualifier                                 DONE (softpipe)
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE (freedreno)
   - Implicit signed -> unsigned conversions             DONE
   - Fused multiply-add                                  DONE ()
   - Dynamically uniform UBO array indices               DONE (freedreno, softpipe)
   - Implicit signed -> unsigned conversions             DONE (softpipe)
   - Fused multiply-add                                  DONE (softpipe)
   - Packing/bitfield/conversion functions               DONE (freedreno, softpipe)
   - Enhanced textureGather                              DONE (freedreno, softpipe)
   - Geometry shader instancing                          DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                    DONE ()
   - Geometry shader multiple streams                    DONE (softpipe)
   - Enhanced per-sample shading                         DONE ()
   - Interpolation functions                             DONE ()
   - New overload resolution rules                       DONE
   - Interpolation functions                             DONE (softpipe)
   - New overload resolution rules                       DONE (softpipe)
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen7+, llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965/gen6+, nv50)
   GL_ARB_sample_shading                                 DONE (freedreno/a6xx, i965/gen6+, nv50)
   GL_ARB_shader_subroutine                              DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965/gen7+)
   GL_ARB_texture_buffer_object_rgb32                    DONE (freedreno, i965/gen6+, llvmpipe, softpipe, swr)
@@ -153,11 +153,11 @@ GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_texture_compression_bptc                       DONE (freedreno, i965)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
   GL_ARB_transform_feedback_instanced                   DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
@@ -170,7 +170,7 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                            DONE (all drivers)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, softpipe, llvmpipe)
   GL_KHR_debug                                          DONE (all drivers)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
@@ -181,10 +181,10 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi, virgl
   GL_ARB_multi_draw_indirect                            DONE (freedreno, i965, llvmpipe, softpipe, swr)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_stencil_texturing                              DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (freedreno, nv50, i965, llvmpipe)
   GL_ARB_texture_buffer_range                           DONE (freedreno, nv50, i965, softpipe, llvmpipe)
   GL_ARB_texture_query_levels                           DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                   DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
@@ -204,14 +204,14 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+, virgl)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_texture_stencil8                               DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
 GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+, r600, virgl)
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+, r600, softpipe, virgl)
   GL_ARB_clip_control                                   DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
   GL_ARB_cull_distance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
@@ -221,16 +221,16 @@ GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600, virgl)
   GL_ARB_texture_barrier                                DONE (freedreno, i965, nv50, r600, virgl)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (i965)
   GL_KHR_robustness                                     DONE (freedreno, i965)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
 GL 4.6, GLSL 4.60
   GL_ARB_gl_spirv                                       in progress (Nicolai Hähnle, Ian Romanick)
   GL_ARB_indirect_parameters                            DONE (i965/gen7+, nvc0, radeonsi)
   GL_ARB_indirect_parameters                            DONE (i965/gen7+, nvc0, radeonsi, virgl)
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_polygon_offset_clamp                           DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr, virgl)
   GL_ARB_shader_atomic_counter_ops                      DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe, virgl)
   GL_ARB_shader_atomic_counter_ops                      DONE (freedreno/a5xx+, i965/gen7+, nvc0, r600, radeonsi, softpipe, virgl)
   GL_ARB_shader_draw_parameters                         DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_group_vote                              DONE (i965, nvc0, radeonsi)
   GL_ARB_spirv_extensions                               in progress (Nicolai Hähnle, Ian Romanick)
@@ -244,23 +244,23 @@ These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_draw_indirect                                  DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                     DONE (freedreno, i965/gen7+, softpipe)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (freedreno, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx, i965/gen7+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx+, i965/gen7+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
   GS5 Enhanced textureGather                            DONE (freedreno, i965/gen7+)
   GS5 Packing/bitfield/conversion functions             DONE (freedreno/a5xx, i965/gen6+)
   GS5 Packing/bitfield/conversion functions             DONE (freedreno/a5xx+, i965/gen6+)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
   Additional functionality not covered above:
@@ -272,25 +272,25 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi, virgl
 GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+, radeonsi, virgl
   GL_EXT_color_buffer_float                             DONE (all drivers)
   GL_KHR_blend_equation_advanced                        DONE (i965, nvc0)
   GL_KHR_blend_equation_advanced                        DONE (freedreno/a6xx, i965, nvc0)
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     DONE (i965, nvc0)
   GL_KHR_robustness                                     DONE (freedreno, i965, nvc0)
   GL_KHR_texture_compression_astc_ldr                   DONE (freedreno, i965/gen9+)
   GL_OES_copy_image                                     DONE (all drivers)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600)
   GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0, softpipe)
   GL_OES_gpu_shader5                                    DONE (freedreno/a6xx, all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         DONE (freedreno/a5xx+, i965/gen7+, nvc0, softpipe)
   GL_OES_sample_shading                                 DONE (freedreno/a6xx, i965, nvc0, r600)
   GL_OES_sample_variables                               DONE (freedreno/a6xx, i965, nvc0, r600)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
   GL_OES_shader_io_blocks                               DONE (All drivers that support GLES 3.1)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600)
   GL_OES_shader_multisample_interpolation               DONE (freedreno/a6xx, i965, nvc0, r600)
   GL_OES_tessellation_shader                            DONE (all drivers that support GL_ARB_tessellation_shader)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (freedreno, i965, nvc0)
   GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0)
   GL_OES_texture_buffer                                 DONE (freedreno, i965, nvc0, softpipe)
   GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0, softpipe)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
@@ -302,7 +302,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_ARB_ES3_2_compatibility                            DONE (i965/gen8+, radeonsi, virgl)
   GL_ARB_fragment_shader_interlock                      DONE (i965)
   GL_ARB_gpu_shader_int64                               DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
   GL_ARB_parallel_shader_compile                        not started, but Chia-I Wu did some related work in 2014
   GL_ARB_parallel_shader_compile                        DONE (all drivers)
   GL_ARB_post_depth_coverage                            DONE (i965, nvc0)
   GL_ARB_robustness_isolation                           not started
   GL_ARB_sample_locations                               DONE (nvc0)
@@ -323,7 +323,9 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_EXT_semaphore                                      DONE (radeonsi)
   GL_EXT_semaphore_fd                                   DONE (radeonsi)
   GL_EXT_semaphore_win32                                not started
   GL_EXT_texture_norm16                                 DONE (i965, r600, radeonsi, nvc0)
   GL_EXT_sRGB_write_control                             DONE (all drivers that support GLES 3.0+)
   GL_EXT_texture_norm16                                 DONE (freedreno, i965, r600, radeonsi, nvc0)
   GL_EXT_texture_sRGB_R8                                DONE (all drivers that support GLES 3.0+)
   GL_KHR_blend_equation_advanced_coherent               DONE (i965/gen9+)
   GL_KHR_texture_compression_astc_hdr                   DONE (i965/bxt)
   GL_KHR_texture_compression_astc_sliced_3d             DONE (i965/gen9+, radeonsi)
@@ -339,7 +341,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_OES_texture_half_float                             DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float_linear                      DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_view                                   DONE (freedreno, i965/gen8+, r600, radeonsi, nv50, nvc0, softpipe, llvmpipe, swr)
   GL_OES_viewport_array                                 DONE (i965, nvc0, radeonsi)
   GL_OES_viewport_array                                 DONE (i965, nvc0, radeonsi, softpipe)
   GLX_ARB_context_flush_control                         not started
   GLX_ARB_robustness_application_isolation              not started
   GLX_ARB_robustness_share_group_isolation              not started
@@ -439,11 +441,11 @@ Vulkan 1.1 -- all DONE: anv, radv
   VK_KHR_variable_pointers                              DONE (anv, radv)
 Khronos extensions that are not part of any Vulkan version:
   VK_KHR_8bit_storage                                   DONE (anv)
   VK_KHR_8bit_storage                                   DONE (anv, radv)
   VK_KHR_android_surface                                not started
   VK_KHR_create_renderpass2                             DONE (anv, radv)
   VK_KHR_display                                        DONE (anv, radv)
   VK_KHR_display_swapchain                              DONE (anv, radv)
   VK_KHR_display_swapchain                              not started
   VK_KHR_draw_indirect_count                            DONE (radv)
   VK_KHR_external_fence_fd                              DONE (anv, radv)
   VK_KHR_external_fence_win32                           not started
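
Each entry in the features.txt hunks above follows the same loose layout: a feature name, a run of padding spaces, a status (`DONE`, `in progress`, or `not started`), and an optional parenthesized, comma-separated detail list. The following Python sketch shows one way to read that layout. The function name `parse_feature_line` and the regular expression are assumptions made for illustration; they are inferred from the lines shown here and are not a format defined by Mesa.

```python
import re

# Assumed layout: name, two or more spaces, status, optional "(details)".
FEATURE_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s{2,}"
    r"(?P<status>DONE|in progress|not started)\b"
    r"\s*(?:\((?P<detail>.*)\))?"
)


def parse_feature_line(line):
    """Return (name, status, details) for an entry line, else None.

    Version header lines such as "GL 4.6, GLSL 4.60" contain no padded
    status column, so they do not match and yield None.
    """
    match = FEATURE_LINE.match(line)
    if match is None:
        return None
    detail = match.group("detail") or ""
    # Split the parenthesized list on commas and drop empty items.
    details = [item.strip() for item in detail.split(",") if item.strip()]
    return match.group("name"), match.group("status"), details
```

For `DONE` entries the detail list holds driver names; for `in progress` entries it holds whatever the parentheses contain (developer names in this file), so callers should interpret it according to the status.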

									
										2

docs/helpwanted.html
									
												
				@@ -29,7 +29,7 @@ immediately checked into git because not enough people are testing them.

				Just applying patches, testing and reporting back is helpful.

				<li>

				<b>Driver debugging.</b>

				There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/describecomponents.cgi?product=Mesa">bug database</a>.

				There are plenty of open bugs in the <a href="https://gitlab.freedesktop.org/mesa/mesa/issues">bug database</a>.

				<li>

				<b>Remove aliasing warnings.</b>

				Enable gcc -Wstrict-aliasing=2 -fstrict-aliasing and track down aliasing

									
										53

docs/index.html
									
												
				@@ -15,6 +15,59 @@

				<div class="content">

				<h1>News</h1>

				<h2>April 24, 2019</h2>

				<p>

				<a href="relnotes/19.0.3.html">Mesa 19.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 10, 2019</h2>

				<p>

				<a href="relnotes/19.0.2.html">Mesa 19.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 5, 2019</h2>

				<p>

				<a href="relnotes/18.3.6.html">Mesa 18.3.6</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 18.3.6 will be the final release in the

				18.3 series. Users of 18.3 are encouraged to migrate to the 19.0

				series in order to obtain future fixes.

				</p>

				<h2>March 27, 2019</h2>

				<p>

				<a href="relnotes/19.0.1.html">Mesa 19.0.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 18, 2019</h2>

				<p>

				<a href="relnotes/18.3.5.html">Mesa 18.3.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 13, 2019</h2>

				<p>

				<a href="relnotes/19.0.0.html">Mesa 19.0.0</a> is released.

				This is a new development release. See the release notes for more

				information about this release.

				</p>

				<h2>February 18, 2019</h2>

				<p>

				<a href="relnotes/18.3.4.html">Mesa 18.3.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 31, 2019</h2>

				<p>

				<a href="relnotes/18.3.3.html">Mesa 18.3.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 17, 2019</h2>

				<p>

				<a href="relnotes/18.3.2.html">Mesa 18.3.2</a> is released.

									
										33

docs/install.html
									
												
				@@ -40,10 +40,10 @@ Build system.

				</p>

				<ul>

				<li><a href="https://mesonbuild.com">meson</a> is recommended when building on *nix platforms.

				<li>Autoconf is another option when building on *nix platforms.

				<li><a href="https://mesonbuild.com">meson</a> is required when building on *nix platforms.

				<li>Autoconf was removed in 19.1.0; use meson instead.

				<li><a href="http://www.scons.org/">SCons</a> is required for building on

				Windows and optional for Linux (it's an alternative to autoconf/automake or meson.)

				Windows and optional for Linux (it's an alternative to meson.)

				</li>

				<li>Android Build system when building as native Android component. Autoconf

				is used when building ARC.

				@@ -76,7 +76,6 @@ you think you've spotted a bug let developers know by filing a

				<li><a href="https://www.python.org/">Python</a> - Python is required.

				When building with scons 2.7 is required.

				When building with meson 3.5 or newer is required.

				When building with autotools 2.7, or 3.5 or later are required.

				</li>

				<li><a href="http://www.makotemplates.org/">Python Mako module</a> -

				Python Mako module is required. Version 0.8.0 or later should work.

				@@ -138,21 +137,7 @@ for more information

				<h1 id="autoconf">3. Building with autoconf (Linux/Unix/X11)</h1>

				<p>

				Although meson is recommended, another supported way to build on *nix systems

				is with autoconf.

				</p>

				<p>

				The general approach is the standard:

				</p>

				<pre>

				  ./configure

				  make

				  sudo make install

				</pre>

				<p>

				But please read the <a href="autoconf.html">detailed autoconf instructions</a>

				for more details.

				  Autoconf support was removed in Mesa 19.1.0. Please use meson instead.

				</p>

				@@ -220,11 +205,11 @@ When compilation has finished, look in the top-level <code>lib/</code>

				You'll see a set of library files similar to this:

				</p>

				<pre>

				lrwxrwxrwx    1 brian    users          10 Mar 26 07:53 libGL.so -> libGL.so.1*

				lrwxrwxrwx    1 brian    users          19 Mar 26 07:53 libGL.so.1 -> libGL.so.1.5.060100*

				lrwxrwxrwx    1 brian    users          10 Mar 26 07:53 libGL.so -&gt; libGL.so.1*

				lrwxrwxrwx    1 brian    users          19 Mar 26 07:53 libGL.so.1 -&gt; libGL.so.1.5.060100*

				-rwxr-xr-x    1 brian    users     3375861 Mar 26 07:53 libGL.so.1.5.060100*

				lrwxrwxrwx    1 brian    users          14 Mar 26 07:53 libOSMesa.so -> libOSMesa.so.6*

				lrwxrwxrwx    1 brian    users          23 Mar 26 07:53 libOSMesa.so.6 -> libOSMesa.so.6.1.060100*

				lrwxrwxrwx    1 brian    users          14 Mar 26 07:53 libOSMesa.so -&gt; libOSMesa.so.6*

				lrwxrwxrwx    1 brian    users          23 Mar 26 07:53 libOSMesa.so.6 -&gt; libOSMesa.so.6.1.060100*

				-rwxr-xr-x    1 brian    users       23871 Mar 26 07:53 libOSMesa.so.6.1.060100*

				</pre>

				@@ -253,7 +238,7 @@ versions of libGL and device drivers.

				<h1 id="pkg-config">7. Building OpenGL programs with pkg-config</h1>

				<p>

				Running <code>make install</code> will install package configuration files

				Running <code>ninja install</code> will install package configuration files

				for the pkg-config utility.

				</p>

									
										10

docs/llvmpipe.html
									
												
				@@ -120,10 +120,12 @@ To build everything on Linux invoke scons as:

				  scons build=debug libgl-xlib

				</pre>

				Alternatively, you can build it with autoconf/make with:

				Alternatively, you can build it with meson with:

				<pre>

				  ./configure --enable-glx=gallium-xlib --with-gallium-drivers=swrast --disable-dri --disable-gbm --disable-egl

				  make

				  mkdir build

				  cd build

				  meson -D glx=gallium-xlib -D gallium-drivers=swrast

				  ninja

				</pre>

				but the rest of these instructions assume that scons is used.

				@@ -306,7 +308,7 @@ for later analysis, e.g.:

				      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>

				      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>

				      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>

				      <li><a href="https://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a><li>

				      <li><a href="https://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>

				    </ul>

				  </li>

				  <li>

									
										37

docs/mangling.html
									
											
				@@ -1,37 +0,0 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>GL Function Name Mangling</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>GL Function Name Mangling</h1>

				<p>

				If you want to use both Mesa and another OpenGL library in the same

				application at the same time you may find it useful to compile Mesa with

				<i>name mangling</i>.

				This results in all the Mesa functions being prefixed with

				<b>mgl</b> instead of <b>gl</b>.

				</p>

				<p>

				This option is supported only with the autoconf build. To use it add

				--enable-mangling to your configure line.

				</p>

				<pre>

				<code>./configure --enable-mangling ...</code>

				</pre>

				</div>

				</body>

				</html>

									
										45

docs/mesa.css
									
												
				@@ -3,64 +3,53 @@ body {

					background-color: #ffffff;

					font: 14px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif;

					color: black;

				 	link: #111188;

				}

				h1 {

					font: 24px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif;

					font-size: 24px;

					font-weight: bold;

					color: black;

				}

				h2 {

					font: 18px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif, bold;

					font-size: 18px;

					font-weight: bold;

					color: black;

				}

				code {

					font-family: monospace;

					font-size: 10pt;

					color: black;

				}

				pre {

					/*font-family: monospace;*/

					font-size: 10pt;

					/*color: black;*/

					background-color: #eee;

					margin-left: 2em;

					padding: .5em;

				}

				iframe {

				  width: 19em;

				  height: 80em;

				  border: none;

				  float: left;

					width: 19em;

					height: 80em;

					border: none;

					float: left;

				}

				.content {

				  position: absolute;

				  left: 20em;

				  right: 10px;

				  overflow: hidden

					position: absolute;

					left: 20em;

					right: 10px;

					overflow: hidden;

				}

				.header {

				  background: black url('gears.png') 15px no-repeat;

				  margin:0;

				  padding: 5px;

				  clear:both;

					background: url('gears.png') 15px no-repeat, black url('gears.png') right no-repeat;

					padding: 2em;

					display: flex;

					text-align: center;

				}

				.header h1 {

				  background: url('gears.png') right no-repeat;

				  color: white;

				  font: x-large sans-serif;

				  text-align: center;

				  height: 50px;

				  margin: 0;

				  padding-top: 30px;

					color: white;

					font: x-large sans-serif;

					margin: auto;

				}

									
										225

docs/meson.html
									
												
				@@ -17,63 +17,104 @@

				<h1>Compilation and Installation using Meson</h1>

				<ul>

				  <li><a href="#intro">Introduction</a></li>

				  <li><a href="#basic">Basic Usage</a></li>

				  <li><a href="#advanced">Advanced Usage</a></li>

				  <li><a href="#cross-compilation">Cross-compilation and 32-bit builds</a></li>

				</ul>

				<h2 id="basic">1. Basic Usage</h2>

				<h2 id="intro">1. Introduction</h2>

				<p><strong>The Meson build system is generally considered stable and ready

				for production</strong></p>

				<p>For general information about Meson see the

				<a href="http://mesonbuild.com/">Meson website</a>.</p>

				<p>The meson build is tested on Linux, macOS, Cygwin and Haiku, FreeBSD,

				<p><strong>Mesa's Meson build system is generally considered stable and ready

				for production.</strong></p>

				<p>The Meson build of Mesa is tested on Linux, macOS, Cygwin and Haiku, FreeBSD,

				DragonflyBSD, NetBSD, and should work on OpenBSD.</p>

				<p><strong>Mesa requires Meson >= 0.45.0 to build.</strong>

				<p>If Meson is not already installed on your system, you can typically

				install it with your package installer.  For example:</p>

				<pre>

				sudo apt-get install meson   # Ubuntu

				</pre>

				or

				<pre>

				sudo dnf install meson   # Fedora

				</pre>

				<p><strong>Mesa requires Meson &gt;= 0.45.0 to build.</strong>

				Some older versions of meson do not check that they are too old and will error

				out in odd ways.

				</p>
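The version requirement above can be checked before configuring. The following is a minimal sketch, not part of Mesa or Meson: the helper name `version_at_least` is hypothetical, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesa or Meson): succeeds when version $1 >= $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

required=0.45.0   # minimum stated in the text above

if command -v meson >/dev/null 2>&1; then
    found=$(meson --version)
    if version_at_least "$found" "$required"; then
        echo "meson $found is new enough"
    else
        echo "meson $found is older than $required" >&2
    fi
else
    echo "meson not found; install it with your package manager" >&2
fi
```

Running a check like this first avoids the confusing failures that an old Meson can produce when it does not notice that it is too old.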

				<p>You'll also need <a href="https://ninja-build.org/">Ninja</a>.

				If it's not already installed, use apt-get or dnf to install

				the <em>ninja-build</em> package.

				</p>

				<h2 id="basic">2. Basic Usage</h2>

				<p>

				The meson program is used to configure the source directory and generates

				either a ninja build file or Visual Studio® build files. The latter must

				be enabled via the <code>--backend</code> switch, as ninja is the default backend on all

				operating systems. Meson only supports out-of-tree builds, and must be passed a

				be enabled via the <code>--backend</code> switch, as ninja is the default

				backend on all

				operating systems.

				</p>

				<p>

				Meson only supports out-of-tree builds, and must be passed a

				directory to put built and generated sources into. We'll call that directory

				"build" for examples.

				"build" here.

				It's recommended to create a

				<a href="http://mesonbuild.com/Using-multiple-build-directories.html">

				separate build directory</a> for each configuration you might want to use.

				</p>

				<p>Basic configuration is done with:</p>

				<pre>

				    meson build/

				meson build/

				</pre>

				<p>

				To see a description of your options you can run <code>meson configure</code>

				along with a build directory to view the selected options for. This will show

				your meson global arguments and project arguments, along with their defaults

				and your local settings.

				This will create the build directory.

				If any dependencies are missing, you can install them, or try to remove

				the dependency with a Meson configuration option (see below).

				</p>

				<p>

				Meson does not currently support listing options before configure a build

				directory, but this feature is being discussed upstream.

				For now, the only way to see what options exist is to look at the

				<code>meson_options.txt</code> file at the root of the project.

				To review the options which Meson chose, run:

				</p>

				<pre>

				    meson configure build/

				meson configure build/

				</pre>

				<p>

				With additional arguments <code>meson configure</code> is used to change

				options on already configured build directory. All options passed to this

				command are in the form <code>-D "command"="value"</code>.

				Meson does not currently support listing configuration options before

				running "meson build/" but this feature is being discussed upstream.

				For now, we have a <code>bin/meson-options.py</code> script that prints

				the options for you.

				If that script doesn't work for some reason, you can always look in the

				<a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/meson_options.txt">

				meson_options.txt</a> file at the root of the project.

				</p>

				<p>

				With additional arguments <code>meson configure</code> can be used to change

				options for a previously configured build directory.

				All options passed to this command are in the form

				<code>-D "option"="value"</code>.

				For example:

				</p>

				<pre>

				    meson configure build/ -Dprefix=/tmp/install -Dglx=true

				meson configure build/ -Dprefix=/tmp/install -Dglx=true

				</pre>

				<p>

				@@ -86,63 +127,108 @@ and brackets to represent an empty list (<code>-D platforms=[]</code>).

				<p>

				Once you've run the initial <code>meson</code> command successfully you can use

				your configured backend to build the project. With ninja, the -C option can be

				be used to point at a directory to build.

				your configured backend to build the project in your build directory:

				</p>

				<pre>

				    ninja -C build/

				ninja -C build/

				</pre>

				<p>

				Without arguments, it will produce libGL.so and/or several other libraries

				depending on the options you have chosen. Later, if you want to rebuild for a

				different configuration, you should run <code>ninja clean</code> before

				changing the configuration, or create a new out of tree build directory for

				each configuration you want to build

				<a href="http://mesonbuild.com/Using-multiple-build-directories.html">as

				recommended in the documentation</a>

				The next step is to install the Mesa libraries, drivers, etc.

				This also finishes up some final steps of the build process (such as creating

				symbolic links for drivers).  To install:

				</p>

				<pre>

				ninja -C build/ install

				</pre>

				<p>

				Autotools automatically updates translation files as part of the build process,

				meson does not do this. Instead if you want translated drirc files you will need 

				to invoke non-default targets for ninja to update them:

				<code>ninja -C build/ xmlpool-pot xmlpool-update-po xmlpool-gmo</code>

				Note: autotools automatically updated translation files (used by the DRI

				configuration tool) as part of the build process.

				Meson does not do this.  Instead, you will need to run:

				</p>

				<pre>

				ninja -C build/ xmlpool-pot xmlpool-update-po xmlpool-gmo

				</pre>

				<h2 id="advanced">3. Advanced Usage</h2>

				<dl>

				<dt><code>Environment Variables</code></dt>

				<dd><p>Meson supports the standard CC and CXX environment variables for

				changing the default compiler. Meson does support CFLAGS, CXXFLAGS, etc. But

				their use is discouraged because of the many caveats in using them. Instead it

				is recomended to use <code>-D${lang}_args</code> and

				<code>-D${lang}_link_args</code> instead. Among the benefits of these options

				is that they are guaranteed to persist across rebuilds and reconfigurations.

				Meson does not allow changing compiler in a configured builddir, you will need

				<dt>Installation Location</dt>

				<dd>

				<p>

				Meson defaults to installing libGL.so in your system's main lib/ directory

				and DRI drivers to a dri/ subdirectory.

				</p>

				<p>

				Developers will often want to install Mesa to a testing directory rather

				than the system library directory.

				This can be done with the --prefix option.  For example:

				</p>

				<pre>

				meson --prefix="${PWD}/build/install" build/

				</pre>

				<p>

				will put the final libraries and drivers into the build/install/

				directory.

				Then you can set LD_LIBRARY_PATH and LIBGL_DRIVERS_PATH to that location

				to run/test the driver.

				</p>
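The environment setup just described might look like the following sketch. It assumes the prefix from the example above and that the libraries landed in lib/ under it. The actual library directory (lib, lib64, or a multiarch path) depends on the system, so adjust `libdir` accordingly.

```shell
#!/bin/sh
# Sketch: point the loader at a Mesa install under a private prefix.
prefix="${PWD}/build/install"   # same prefix as in the example above
libdir="${prefix}/lib"          # assumption: may be lib64 or lib/<triplet> on your system

# Prepend to LD_LIBRARY_PATH, keeping any existing entries.
if [ -n "${LD_LIBRARY_PATH:-}" ]; then
    LD_LIBRARY_PATH="${libdir}:${LD_LIBRARY_PATH}"
else
    LD_LIBRARY_PATH="${libdir}"
fi
export LD_LIBRARY_PATH

# DRI drivers are installed in a dri/ subdirectory of the library directory.
export LIBGL_DRIVERS_PATH="${libdir}/dri"

echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"
echo "LIBGL_DRIVERS_PATH=${LIBGL_DRIVERS_PATH}"
```

With these set, a program such as glxinfo started from the same shell should pick up the newly built driver. Setting LIBGL_DEBUG=verbose, as the release checklist in releasing.html does, makes it easier to confirm which driver file was loaded.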

				<p>

				Meson also honors <code>DESTDIR</code> for installs.

				</p>

				</dd>

				<dt>Compiler Options</dt>

				<dd>

				<p>Meson supports the common CFLAGS, CXXFLAGS, etc. environment

				variables but their use is discouraged because of the many caveats

				in using them.

				</p>

				<p>Instead, it is recommended to use <code>-D${lang}_args</code> and

				<code>-D${lang}_link_args</code>. Among the benefits of these options

				is that they are guaranteed to persist across rebuilds and reconfigurations.

				</p>

				<p>

				This example sets -fmax-errors for compiling C sources and -DMAGIC=123

				for C++ sources:

				</p>

				<pre>

				meson builddir/ -Dc_args=-fmax-errors=10 -Dcpp_args=-DMAGIC=123

				</pre>

				</dd>

				<dt>Compiler Specification</dt>

				<dd>

				<p>

				Meson supports the standard CC and CXX environment variables for

				changing the default compiler.  Note that Meson does not allow

				changing the compilers in a configured builddir so you will need

				to create a new build dir for a different compiler.

				</p>

				<p>

				This is an example of specifying the clang compilers and cleaning

				the build directory before reconfiguring with an extra C option:

				</p>

				<pre>

				    CC=clang CXX=clang++ meson build-clang

				    ninja -C build-clang

				    ninja -C build-clang clean

				    meson configure build -Dc_args="-Wno-typedef-redefinition"

				    ninja -C build-clang

				CC=clang CXX=clang++ meson build-clang

				ninja -C build-clang

				ninja -C build-clang clean

				meson configure build-clang -Dc_args="-Wno-typedef-redefinition"

				ninja -C build-clang

				</pre>

				<p>

				The default compilers depend on your operating system. Meson supports most of

				the popular compilers; a complete list is available

				<a href="http://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.

				</p>

				<p>Meson also honors <code>DESTDIR</code> for installs</p>

				</dd>

				<dt><code>LLVM</code></dt>

				<dt>LLVM</dt>

				<dd><p>Meson includes upstream logic to wrap llvm-config using its standard

				dependency interface.

				</p></dd>

				@@ -153,6 +239,7 @@ As of meson 0.49.0 meson also has the concept of a

				these files provide information about the native build environment (as opposed

				to a cross build environment). They are ini formatted and can override where to

				find llvm-config:

				</p>

				custom-llvm.ini

				<pre>

				@@ -165,38 +252,36 @@ Then configure meson:

				<pre>

				    meson builddir/ --native-file custom-llvm.ini

				</pre>

				</p></dd>

				</dd>

				<dd><p>

				For selecting llvm-config for cross compiling a

				<a href="https://mesonbuild.com/Cross-compilation.html#defining-the-environment">"cross file"</a>

				should be used. It uses the same format as the native file above:

				</p>

				cross-llvm.ini

				<p>cross-llvm.ini</p>

				<pre>

				    [binaries]

				    ...

				    llvm-config = '/usr/lib/llvm-config-32'

				</pre>

				Then configure meson:

				<p>Then configure meson:</p>

				<pre>

				    meson builddir/ --cross-file cross-llvm.ini

				</pre>

				See the <a href="#cross-compilation">Cross Compilation</a> section for more information.

				</dd></p>

				</dd>

				<dd><p>

				For older versions of meson, <code>$PATH</code> (or <code>%PATH%</code> on

				Windows) will be searched for llvm-config (and llvm-config$version and

				llvm-config-$version). You can override this environment variable to control

				the search: <code>PATH=/path/with/llvm-config:$PATH meson build</code>.

				</dd></p>

				</dl>

				</p></dd>

				<dl>

				<dt><code>PKG_CONFIG_PATH</code></dt>

				<dd><p>The

				<code>pkg-config</code> utility is a hard requirement for configuring and

				@@ -232,9 +317,7 @@ with debugging as some code and validation will be optimized away.

				buildtype, which causes meson to inject no additional compiler arguments, only

				those in the C/CXXFLAGS and those that mesa itself defines.</p>

				</dd>

				</dl>

				<dl>

				<dt><code>-Db_ndebug</code></dt>

				<dd><p>This option controls assertions in meson projects. When set to <code>false</code>

				(the default) assertions are enabled, when set to true they are disabled. This

				@@ -244,7 +327,7 @@ is unrelated to the <code>buildtype</code>; setting the latter to

				</dd>

				</dl>

				<h2 id="cross-compilation">2. Cross-compilation and 32-bit builds</h2>

				<h2 id="cross-compilation">4. Cross-compilation and 32-bit builds</h2>

				<p><a href="https://mesonbuild.com/Cross-compilation.html">Meson supports

				cross-compilation</a> by specifying a number of binary paths and

				@@ -262,14 +345,15 @@ will likely have to alter them for your system.</p>

				<p>

				Those running on ArchLinux can use the AUR-maintained packages for some

				of those, as they'll have the right values for your system:

				</p>

				<ul>

				  <li><a href="https://aur.archlinux.org/packages/meson-cross-x86-linux-gnu">meson-cross-x86-linux-gnu</a></li>

				  <li><a href="https://aur.archlinux.org/packages/meson-cross-aarch64-linux-gnu">meson-cross-aarch64-linux-gnu</a></li>

				</ul>

				</p>

				<p>

				32-bit build on x86 linux:

				</p>

				<pre>

				[binaries]

				c = '/usr/bin/gcc'

				@@ -291,10 +375,10 @@ cpu_family = 'x86'

				cpu = 'i686'

				endian = 'little'

				</pre>

				</p>

				<p>

				64-bit build on ARM linux:

				</p>

				<pre>

				[binaries]

				c = '/usr/bin/aarch64-linux-gnu-gcc'

				@@ -310,10 +394,10 @@ cpu_family = 'aarch64'

				cpu = 'aarch64'

				endian = 'little'

				</pre>

				</p>

				<p>

				64-bit build on x86 windows:

				</p>

				<pre>

				[binaries]

				c = '/usr/bin/x86_64-w64-mingw32-gcc'

				@@ -329,7 +413,6 @@ cpu_family = 'x86_64'

				cpu = 'i686'

				endian = 'little'

				</pre>

				</p>

				</div>

				</body>

									
										4

docs/opengles.html
									
												
				@@ -25,7 +25,7 @@ https://www.khronos.org/opengles/</a>.</p>

				<h2>Build the Libraries</h2>

				<ol>

				<li>Run <code>configure</code> with <code>--enable-gles1 --enable-gles2</code> and enable the Gallium driver for your hardware.</li>

				<li>Run <code>meson configure</code> with <code>-D gles1=true -D gles2=true</code> and enable the Gallium driver for your hardware.</li>

				<li>Build and install Mesa as usual.</li>

				</ol>

				@@ -33,7 +33,7 @@ Alternatively, if XCB-DRI2 is installed on the system, one can use

				<code>egl_dri2</code> EGL driver with OpenGL|ES-enabled DRI drivers

				<ol>

				<li>Run <code>configure</code> with <code>--enable-gles1 --enable-gles2</code>.</li>

				<li>Run <code>meson configure</code> with <code>-D gles1=true -D gles2=true</code>.</li>

				<li>Build and install Mesa as usual.</li>

				</ol>

									
										11

docs/osmesa.html
									
												
				@@ -51,8 +51,8 @@ There are several examples of OSMesa in the mesa/demos repository.

				Configure and build Mesa with something like:

				<pre>

				configure --enable-osmesa --disable-driglx-direct --disable-dri --with-gallium-drivers=swrast

				make

				meson builddir -Dosmesa=gallium -Dgallium-drivers=swrast -Ddri-drivers= -Dvulkan-drivers= -Dprefix=$PWD/builddir/install

				ninja -C builddir install

				</pre>

				<p>

				@@ -63,13 +63,12 @@ Make sure you have LLVM installed first if you want to use the llvmpipe driver.

				When the build is complete you should find:

				</p>

				<pre>

				lib/libOSMesa.so  (swrast-based OSMesa)

				lib/gallium/libOSMesa.so  (gallium-based OSMesa)

				$PWD/builddir/install/lib/libOSMesa.so  (swrast-based OSMesa)

				$PWD/builddir/install/lib/gallium/libOSMesa.so  (gallium-based OSMesa)

				</pre>

				<p>

				Set your LD_LIBRARY_PATH to point to one directory or the other to select

				the library you want to use.

				Set your LD_LIBRARY_PATH to point to the directory under $PWD/builddir/install that contains the library you want to use.

				</p>

				<p>

									
										2

docs/precompiled.html
									
												
				@@ -24,13 +24,13 @@ Some Linux distributions closely follow the latest Mesa releases. On others one

				has to use unofficial channels.

				<br>

				There are some general directions:

				</p>

				<ul>

				<li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li>

				<li>Fedora - Copr: erp and che</li>

				<li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li>

				<li>Gentoo/Archlinux - officially provided/supported</li>

				</ul>

				</p>

				</div>

				</body>

									
										63

docs/release-calendar.html
									
												
				@@ -49,78 +49,47 @@ if you'd like to nominate a patch in the next stable release.

				<th>Notes</th>

				</tr>

				<tr>

				<td rowspan="4">18.3</td>

				<td>2019-01-30</td>

				<td>18.3.3</td>

				<td>Emil Velikov</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-13</td>

				<td>18.3.4</td>

				<td>Emil Velikov</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-27</td>

				<td>18.3.5</td>

				<td>Emil Velikov</td>

				<td>

				</tr>

				<tr>

				<td>2019-03-13</td>

				<td>18.3.6</td>

				<td>Emil Velikov</td>

				<td>Last planned 18.3.x release</td>

				</tr>

				<tr>

				<td rowspan="4">19.0</td>

				<td>2019-01-29</td>

				<td>19.0.0-rc1</td>

				<td rowspan="3">19.0</td>

				<td>2019-05-07</td>

				<td>19.0.4</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-05</td>

				<td>19.0.0-rc2</td>

				<td>2019-05-21</td>

				<td>19.0.5</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-12</td>

				<td>19.0.0-rc3</td>

				<td>2019-06-04</td>

				<td>19.0.6</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-19</td>

				<td>19.0.0-rc4</td>

				<td>Dylan Baker</td>

				<td>Last planned RC/Final release</td>

				<td>Last planned 19.0.x release</td>

				</tr>

				<tr>

				<td rowspan="4">19.1</td>

				<td>2019-04-30</td>

				<td>19.1.0-rc1</td>

				<td>Andres Gomez</td>

				<td>Juan A. Suarez</td>

				<td>

				</tr>

				<tr>

				<td>2019-05-07</td>

				<td>19.1.0-rc2</td>

				<td>Andres Gomez</td>

				<td>Juan A. Suarez</td>

				<td>

				</tr>

				<tr>

				<td>2019-05-14</td>

				<td>19.1.0-rc3</td>

				<td>Andres Gomez</td>

				<td>Juan A. Suarez</td>

				<td>

				</tr>

				<tr>

				<td>2019-05-21</td>

				<td>19.1.0-rc4</td>

				<td>Andres Gomez</td>

				<td>Juan A. Suarez</td>

				<td>Last planned RC/Final release</td>

				</tr>

				<tr>

				@@ -152,25 +121,25 @@ if you'd like to nominate a patch in the next stable release.

				<td rowspan="4">19.3</td>

				<td>2019-10-15</td>

				<td>19.3.0-rc1</td>

				<td>Juan A. Suarez</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-10-22</td>

				<td>19.3.0-rc2</td>

				<td>Juan A. Suarez</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-10-29</td>

				<td>19.3.0-rc3</td>

				<td>Juan A. Suarez</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-11-05</td>

				<td>19.3.0-rc4</td>

				<td>Juan A. Suarez</td>

				<td>Dylan Baker</td>

				<td>Last planned RC/Final release</td>

				</tr>

				</table>

									
										122

docs/releasing.html
									
												
				@@ -136,7 +136,7 @@ well contained. Thus it cannot affect more than one driver/subsystem.

				<p>The following must pass:</p>

				<ul>

				<li>make distcheck, scons and scons check

				<li>meson test, scons and scons check

				<li>Testing with different versions of system components - LLVM and others is also

				performed where possible.

				<li>As a general rule, testing with various combinations of configure

				@@ -251,7 +251,7 @@ stabilisation and bugfixing.

				</p>

				<p>

				Note: Before doing a branch ensure that basic build and <code>make check</code>

				Note: Before doing a branch ensure that basic build and <code>meson test</code>

				testing is done and there are little to no issues.

				<br>

				Ideally all of those should be tackled already.

				@@ -279,7 +279,7 @@ To setup the branchpoint:

				<p>

				Now go to

				<a href="https://bugs.freedesktop.org/editversions.cgi?action=add&amp;product=Mesa" target="_parent">Bugzilla</a> and add the new Mesa version X.Y.

				<a href="https://gitlab.freedesktop.org/mesa/mesa/-/milestones" target="_parent">gitlab</a> and add the new Mesa version X.Y.

				</p>

				<p>

				@@ -468,96 +468,48 @@ So we do a quick 'touch test'

				</p>

				<ul>

				<li>make distcheck (you can omit this if you're not using --dist below)

				<li>meson dist

				<li>scons (from release tarball)

				<li>the produced binaries work

				</ul>

				<p>

				Here is one solution that I've been using.

				  Here is one solution:

				</p>

				<pre>

					# Set MAKEFLAGS if you haven't already

					git clean -fXd; git clean -nxd

					read # quick cross check any outstanding files

					export __version=`cat VERSION`

					export __mesa_root=../

					export __build_root=./foo

					chmod 755 -fR $__build_root; rm -rf $__build_root

					mkdir -p $__build_root &amp;&amp; cd $__build_root

					# For the native builds - such as distcheck, scons, sanity test, you

					# may want to specify which LLVM to use:

					# export LLVM_CONFIG=/usr/lib/llvm-3.9/bin/llvm-config

					# Do a full distcheck

					$__mesa_root/autogen.sh &amp;&amp; make distcheck

					# Build check the tarballs (scons, linux)

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					scons

					cd .. &amp;&amp; rm -rf mesa-$__version

					# Build check the tarballs (scons, windows/mingw)

					# Temporary drop LLVM_CONFIG, unless you have a Windows/mingw one.

					# save_LLVM_CONFIG=`echo $LLVM_CONFIG`; unset LLVM_CONFIG

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					scons platform=windows toolchain=crossmingw

					cd .. &amp;&amp; rm -rf mesa-$__version

					# Test the automake binaries

					# Restore LLVM_CONFIG, if applicable:

					# export LLVM_CONFIG=`echo $save_LLVM_CONFIG`; unset save_LLVM_CONFIG

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					./configure \

						--with-dri-drivers=i965,swrast \

						--with-gallium-drivers=swrast \

						--with-vulkan-drivers=intel \

						--enable-llvm-shared-libs \

						--enable-llvm \

						--enable-glx-tls \

						--enable-gbm \

						--enable-egl \

						--with-platforms=x11,drm,wayland,surfaceless

					make &amp;&amp; DESTDIR=`pwd`/test make install

					# Drop LLVM_CONFIG, if applicable:

					# unset LLVM_CONFIG

					__glxinfo_cmd='glxinfo 2&gt;&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"'

					__glxgears_cmd='glxgears 2&gt;&amp;1 | grep -v "configuration file"'

					__es2info_cmd='es2_info 2&gt;&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'

					__es2gears_cmd='es2gears_x11 2&gt;&amp;1 | grep -v "configuration file"'

					test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH"

					export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}"

					export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/

					export LIBGL_DEBUG=verbose

					eval $__glxinfo_cmd

					eval $__glxgears_cmd

					eval $__es2info_cmd

					eval $__es2gears_cmd

					export LIBGL_ALWAYS_SOFTWARE=true

					eval $__glxinfo_cmd

					eval $__glxgears_cmd

					eval $__es2info_cmd

					eval $__es2gears_cmd

					export LIBGL_ALWAYS_SOFTWARE=true

					export GALLIUM_DRIVER=softpipe

					eval $__glxinfo_cmd

					eval $__glxgears_cmd

					eval $__es2info_cmd

					eval $__es2gears_cmd

					# Smoke test DOTA2

					unset LD_LIBRARY_PATH

					test "x$__old_ld" != 'x' &amp;&amp; export LD_LIBRARY_PATH="$__old_ld" &amp;&amp; unset __old_ld

					unset LIBGL_DRIVERS_PATH

					unset LIBGL_DEBUG

					unset LIBGL_ALWAYS_SOFTWARE

					unset GALLIUM_DRIVER

					export VK_ICD_FILENAMES=`pwd`/src/intel/vulkan/dev_icd.json

					steam steam://rungameid/570  -vconsole -vulkan

					unset VK_ICD_FILENAMES

				    __glxgears_cmd='glxgears 2&gt;&amp;1 | grep -v "configuration file"'

				    __es2info_cmd='es2_info 2&gt;&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'

				    __es2gears_cmd='es2gears_x11 2&gt;&amp;1 | grep -v "configuration file"'

				    test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH"

				    export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}"

				    export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/

				    export LIBGL_DEBUG=verbose

				    eval $__glxinfo_cmd

				    eval $__glxgears_cmd

				    eval $__es2info_cmd

				    eval $__es2gears_cmd

				    export LIBGL_ALWAYS_SOFTWARE=true

				    eval $__glxinfo_cmd

				    eval $__glxgears_cmd

				    eval $__es2info_cmd

				    eval $__es2gears_cmd

				    export LIBGL_ALWAYS_SOFTWARE=true

				    export GALLIUM_DRIVER=softpipe

				    eval $__glxinfo_cmd

				    eval $__glxgears_cmd

				    eval $__es2info_cmd

				    eval $__es2gears_cmd

				    # Smoke test DOTA2

				    unset LD_LIBRARY_PATH

				    test "x$__old_ld" != 'x' &amp;&amp; export LD_LIBRARY_PATH="$__old_ld" &amp;&amp; unset __old_ld

				    unset LIBGL_DRIVERS_PATH

				    unset LIBGL_DEBUG

				    unset LIBGL_ALWAYS_SOFTWARE

				    unset GALLIUM_DRIVER

				    export VK_ICD_FILENAMES=`pwd`/src/intel/vulkan/dev_icd.json

				    steam steam://rungameid/570  -vconsole -vulkan

				    unset VK_ICD_FILENAMES

				</pre>

				<h3>Update version in file VERSION</h3>

									
										8

docs/relnotes.html
									
												
				@@ -21,6 +21,14 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/19.0.3.html">19.0.3 release notes</a>

				<li><a href="relnotes/19.0.2.html">19.0.2 release notes</a>

				<li><a href="relnotes/18.3.6.html">18.3.6 release notes</a>

				<li><a href="relnotes/19.0.1.html">19.0.1 release notes</a>

				<li><a href="relnotes/18.3.5.html">18.3.5 release notes</a>

				<li><a href="relnotes/19.0.0.html">19.0.0 release notes</a>

				<li><a href="relnotes/18.3.4.html">18.3.4 release notes</a>

				<li><a href="relnotes/18.3.3.html">18.3.3 release notes</a>

				<li><a href="relnotes/18.3.2.html">18.3.2 release notes</a>

				<li><a href="relnotes/18.2.8.html">18.2.8 release notes</a>

				<li><a href="relnotes/18.2.7.html">18.2.7 release notes</a>

									
										6

docs/relnotes/10.2.html
									
												
				@@ -69,14 +69,15 @@ TBD.

				<h2>Changes</h2>

				<ul>

				<li>Renamed <i>--with-llvm-shared-libs</i> to <i>--enable-llvm-shared-libs</i></li>

				<li>Renamed <i>--with-llvm-shared-libs</i> to <i>--enable-llvm-shared-libs</i>

				<p>

				The option is used to control how mesa is linked against LLVM, and now

				defaults to enabled (shared linking).

				</p>

				</li>

				<li>Split <i>libxatracker.so</i> into a standalone library which can be used

				with any gallium driver.</li>

				with any gallium driver.

				<p>

				Previously the library was linked statically against vmware's virtual gpu

				driver(svga), whereas now it loads a shared pipe_*.so driver. Provide the

				@@ -88,6 +89,7 @@ following options during configure, if you would like support for svga driver

				Note: The files are installed in $(libdir)/gallium-pipe/ and the interface

				between them and libxatracker.so is <strong>not</strong> stable.

				</p>

				</li>

				<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>

				</ul>

									
										2

docs/relnotes/10.3.html
									
												
				@@ -327,7 +327,7 @@ DRM drivers that don't have a full-fledged GEM (such as qxl or simpledrm)</li>

				<li>Removed support for the GL_ATI_envmap_bumpmap extension</li>

				<li>The hacky --enable-32/64-bit is no longer available in configure. To build

				32/64 bit mesa refer to the default method recommended by your distribution</li>

				</li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>

				<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>

				</ul>

				</div>

									
										2

docs/relnotes/11.0.0.html
									
												
				@@ -252,7 +252,9 @@ Note: some of the new features are only available with certain drivers.

				<h2>Changes</h2>

				<ul>

				<li>Removed the EGL loader from the Linux SCons build.</li>

				</ul>

				</div>

				</body>

									
										4

docs/relnotes/11.1.0.html
									
												
				@@ -72,7 +72,7 @@ Note: some of the new features are only available with certain drivers.

				<li>GL_EXT_blend_func_extended on all drivers that support the ARB version</li>

				<li>GL_EXT_buffer_storage implemented for when ES 3.1 support is gained</li>

				<li>GL_EXT_draw_elements_base_vertex on all drivers</li>

				<li>GL_EXT_texture_compression_rgtc / latc on freedreno (a3xx & a4xx)</li>

				<li>GL_EXT_texture_compression_rgtc / latc on freedreno (a3xx &amp; a4xx)</li>

				<li>GL_KHR_debug (GLES)</li>

				<li>GL_NV_conditional_render on freedreno</li>

				<li>GL_OES_draw_elements_base_vertex on all drivers</li>

				@@ -274,7 +274,9 @@ Note: some of the new features are only available with certain drivers.

				<h2>Changes</h2>

				<ul>

				<li>MPEG4 decoding has been disabled by default in the VAAPI driver</li>

				</ul>

				</div>

				</body>

									
										5

docs/relnotes/17.3.5.html
									
												
				@@ -42,10 +42,7 @@ eb9228fc8aaa71e0205c1481c5b157752ebaec9b646b030d27478e25a6d7936a  mesa-17.3.5.ta

				<h2>Bug fixes</h2>

				<ul>

				</ul>

				<p>None</p>

				<h2>Changes</h2>

									
										2

docs/relnotes/18.1.1.html
									
												
				@@ -42,7 +42,7 @@ d3312a2ede5aac14a47476b208b8e3a401367838330197c4588ab8ad420d7781  mesa-18.1.1.ta

				<h2>Bug fixes</h2>

				<p>None<p>

				<p>None</p>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

									
										2

docs/relnotes/18.1.2.html
									
												
				@@ -42,7 +42,7 @@ a644df23937f4078a2bd9a54349f6315c1955f5e3a4ac272832da51dea4d3c11  mesa-18.1.1.ta

				<h2>Bug fixes</h2>

				<p>None<p>

				<p>None</p>

				<h2>Changes</h2>

									
										4

docs/relnotes/18.3.0.html
									
												
				@@ -58,7 +58,7 @@ Note: some of the new features are only available with certain drivers.

				<li>GL_AMD_multi_draw_indirect on all GL 4.x drivers.</li>

				<li>GL_AMD_query_buffer_object on i965, nvc0, r600, radeonsi.</li>

				<li>GL_EXT_disjoint_timer_query on radeonsi and most other Gallium drivers (ES extension)</li>

				<li>GL_EXT_texture_compression_s3tc on all drivers (ES extension)<li>

				<li>GL_EXT_texture_compression_s3tc on all drivers (ES extension)</li>

				<li>GL_EXT_vertex_attrib_64bit on i965, nvc0, radeonsi.</li>

				<li>GL_EXT_window_rectangles on radeonsi.</li>

				<li>GL_KHR_texture_compression_astc_sliced_3d on radeonsi.</li>

				@@ -272,6 +272,8 @@ Note: some of the new features are only available with certain drivers.

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108914">Bug 108914</a> - blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this</li>

				</ul>

				<h2>Changes</h2>

				<ul>

									
										208

docs/relnotes/18.3.3.html
									
										Normal file
									
												
				@@ -0,0 +1,208 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.3.3 Release Notes / January 31, 2019</h1>

				<p>

				Mesa 18.3.3 is a bug fix release which fixes bugs found since the 18.3.2 release.

				</p>

				<p>

				Mesa 18.3.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				6b9893942fe8011c7736d51448deb6ef80ece2257e0fac27b02e997a6605d5e4  mesa-18.3.3.tar.gz

				2ab6886a6966c532ccbcc3b240925e681464b658244f0cbed752615af3936299  mesa-18.3.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108877">Bug 108877</a> - OpenGL CTS gl43 test cases were interrupted due to segment fault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109023">Bug 109023</a> - error: inlining failed in call to always_inline ‘__m512 _mm512_and_ps(__m512, __m512)’: target specific option mismatch</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109129">Bug 109129</a> - format_types.h:1220: undefined reference to `_mm256_cvtps_ph'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109229">Bug 109229</a> - glLinkProgram locks up for ~30 seconds</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109242">Bug 109242</a> - [RADV] The Witcher 3 system freeze</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109488">Bug 109488</a> - Mesa 18.3.2 crash on a specific fragment shader (assert triggered) / already fixed on the master branch.</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (2):</p>

				<ul>

				  <li>bin/get-pick-list.sh: fix the oneline printing</li>

				  <li>bin/get-pick-list.sh: fix redirection in sh</li>

				</ul>

				<p>Axel Davy (1):</p>

				<ul>

				  <li>st/nine: Immediately upload user provided textures</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Only use 32 KiB per threadgroup on Stoney.</li>

				  <li>radv: Set partial_vs_wave for pipelines with just GS, not tess.</li>

				  <li>nir: Account for atomics in copy propagation.</li>

				</ul>

				<p>Bruce Cherniak (1):</p>

				<ul>

				  <li>gallium/swr: Fix multi-context sync fence deadlock.</li>

				</ul>

				<p>Carsten Haitzler (Rasterman) (2):</p>

				<ul>

				  <li>vc4: Use named parameters for the NEON inline asm.</li>

				  <li>vc4: Declare the cpu pointers as being modified in NEON asm.</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>glsl: Fix copying function's out to temp if dereferenced by array</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>dri_interface: add put shm image2 (v2)</li>

				  <li>glx: add support for putimageshm2 path (v2)</li>

				  <li>gallium: use put image shm2 path (v2)</li>

				</ul>

				<p>Dylan Baker (4):</p>

				<ul>

				  <li>meson: allow building dri driver without window system if osmesa is classic</li>

				  <li>meson: fix swr KNL build</li>

				  <li>meson: Fix compiler checks for SWR with ICC</li>

				  <li>meson: Add warnings and errors when using ICC</li>

				</ul>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.3.2</li>

				  <li>cherry-ignore: radv: Fix multiview depth clears</li>

				  <li>cherry-ignore: spirv: Handle arbitrary bit sizes for deref array indices</li>

				  <li>cherry-ignore: WARNING: Commit XXX lists invalid sha</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>vc4: Don't leak the GPU fd for renderonly usage.</li>

				  <li>vc4: Enable NEON asm on meson cross-builds.</li>

				</ul>

				<p>Eric Engestrom (2):</p>

				<ul>

				  <li>configure: EGL requirements only apply if EGL is built</li>

				  <li>meson/vdpau: add missing soversion</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>anv/device: fix maximum number of images supported</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>anv/nir: Rework arguments to apply_pipeline_layout</li>

				  <li>anv: Only parse pImmutableSamplers if the descriptor has samplers</li>

				  <li>nir/xfb: Fix offset accounting for dvec3/4</li>

				</ul>

				<p>Karol Herbst (2):</p>

				<ul>

				  <li>nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise instructions</li>

				  <li>glsl/lower_output_reads: set invariant and precise flags on temporaries</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: fix invalid binding table index computation</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>radeonsi: also apply the GS hang workaround to draws without tessellation</li>

				  <li>radeonsi: fix a u_blitter crash after a shader with FBFETCH</li>

				  <li>radeonsi: fix rendering to tiny viewports where the viewport center is &gt; 8K</li>

				  <li>st/mesa: purge framebuffers when unbinding a context</li>

				</ul>

				<p>Niklas Haas (1):</p>

				<ul>

				  <li>radv: correctly use vulkan 1.0 by default</li>

				</ul>

				<p>Pierre Moreau (1):</p>

				<ul>

				  <li>meson: Fix with_gallium_icd to with_opencl_icd</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>loader: fix the no-modifiers case</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: clean up setting partial_es_wave for distributed tess on VI</li>

				</ul>

				<p>Timothy Arceri (5):</p>

				<ul>

				  <li>ac/nir_to_llvm: fix interpolateAt* for arrays</li>

				  <li>ac/nir_to_llvm: fix clamp shadow reference for more hardware</li>

				  <li>radv/ac: fix some fp16 handling</li>

				  <li>glsl: use remap location when serialising uniform program resource data</li>

				  <li>glsl: Copy function out to temp if we don't directly ref a variable</li>

				</ul>

				<p>Tomeu Vizoso (1):</p>

				<ul>

				  <li>etnaviv: Consolidate buffer references from framebuffers</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>meson: Fix typo.</li>

				</ul>

				</div>

				</body>

				</html>
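The SHA256 checksums listed in the 18.3.3 release notes above can be checked against a downloaded tarball. The following is a minimal sketch, not part of the Mesa tree: the helper names `sha256_of_file` and `matches_published` and the local file path are assumptions made for illustration, and only the Python standard library is used.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare the file's digest with a published hex digest (hypothetical helper)."""
    # Normalise whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, assuming the tarball was saved locally as `mesa-18.3.3.tar.xz`, calling `matches_published("mesa-18.3.3.tar.xz", "2ab6886a6966c532ccbcc3b240925e681464b658244f0cbed752615af3936299")` should return `True` only if the download matches the value published in the notes.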

									
										180

docs/relnotes/18.3.4.html
									
										Normal file
									
												
				@@ -0,0 +1,180 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.3.4 Release Notes / February 18, 2019</h1>

				<p>

				Mesa 18.3.4 is a bug fix release which fixes bugs found since the 18.3.3 release.

				</p>

				<p>

				Mesa 18.3.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e22e6fe4c3aca80fe872a0a7285b6c5523e0cfc0bfb57ffcc3b3d66d292593e4  mesa-18.3.4.tar.gz

				32314da4365d37f80d84f599bd9625b00161c273c39600ba63b45002d500bb07  mesa-18.3.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109107">Bug 109107</a> - gallium/st/va: change va max_profiles when using Radeon VCN Hardware</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109401">Bug 109401</a> - [DXVK] Project Cars rendering problems</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109543">Bug 109543</a> - After upgrade mesa to 19.0.0~rc1 all vulkan based application stop working [&quot;vulkan-cube&quot; received SIGSEGV in radv_pipeline_init_blend_state at ../src/amd/vulkan/radv_pipeline.c:699]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109603">Bug 109603</a> - nir_instr_as_deref: Assertion `parent &amp;&amp; parent-&gt;type == nir_instr_type_deref' failed.</li>

				</ul>

				<h2>Changes</h2>

				<p>Bart Oldeman (1):</p>

				<ul>

				  <li>gallium-xlib: query MIT-SHM before using it.</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>radv: Only look at pImmutableSamples if the descriptor has a sampler.</li>

				  <li>amd/common: Use correct writemask for shared memory stores.</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>get-pick-list: Add --pretty=medium to the arguments for Cc patches</li>

				  <li>meson: Add dependency on genxml to anvil</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.3.3</li>

				  <li>cherry-ignore: nv50,nvc0: add explicit settings for recent caps</li>

				  <li>cherry-ignore: add more 19.0 only nominations from Ilia</li>

				  <li>cherry-ignore: radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8</li>

				  <li>Update version to 18.3.4</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>vc4: Fix copy-and-paste fail in backport of NEON asm fixes.</li>

				</ul>

				<p>Eric Engestrom (2):</p>

				<ul>

				  <li>xvmc: fix string comparison</li>

				  <li>xvmc: fix string comparison</li>

				</ul>

				<p>Ernestas Kulik (2):</p>

				<ul>

				  <li>vc4: Fix leak in HW queries error path</li>

				  <li>v3d: Fix leak in resource setup error path</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nvc0: we have 16k-sized framebuffers, fix default scissors</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()</li>

				  <li>intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode</li>

				  <li>nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>anv/cmd_buffer: check for NULL framebuffer</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048</li>

				</ul>

				<p>Kristian H. Kristensen (1):</p>

				<ul>

				  <li>freedreno/a6xx: Emit blitter dst with OUT_RELOCW</li>

				</ul>

				<p>Leo Liu (2):</p>

				<ul>

				  <li>st/va: fix the incorrect max profiles report</li>

				  <li>st/va/vp9: set max reference as default of VP9 reference number</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>meson: drop the xcb-xrandr version requirement</li>

				  <li>gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets &gt; 0</li>

				  <li>radeonsi: fix EXPLICIT_FLUSH for flush offsets &gt; 0</li>

				  <li>winsys/amdgpu: don't drop manually added fence dependencies</li>

				</ul>

				<p>Mario Kleiner (2):</p>

				<ul>

				  <li>egl/wayland: Allow client-&gt;server format conversion for PRIME offload. (v2)</li>

				  <li>egl/wayland-drm: Only announce formats via wl_drm which the driver supports.</li>

				</ul>

				<p>Oscar Blumberg (1):</p>

				<ul>

				  <li>radeonsi: Fix guardband computation for large render targets</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno: stop frob'ing pipe_resource::nr_samples</li>

				</ul>

				<p>Rodrigo Vivi (1):</p>

				<ul>

				  <li>intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: fix compiler issues with GCC 9</li>

				  <li>radv: always export gl_SampleMask when the fragment shader uses it</li>

				</ul>

				</div>

				</body>

				</html>

									
										271

docs/relnotes/18.3.5.html
									
										Normal file
									
												
				@@ -0,0 +1,271 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.3.5 Release Notes / March 18, 2019</h1>

				<p>

				Mesa 18.3.5 is a bug fix release which fixes bugs found since the 18.3.4 release.

				</p>

				<p>

				Mesa 18.3.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				5f40a336cb2af9b1d66fa243bb03c2c8a3f9b3f067aab6aaaad4316d1bc0e58b  mesa-18.3.5.tar.gz

				4027aea82cc63240b3fcf60eec9eea882955f098c989b29357b01d1695747953  mesa-18.3.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104297">Bug 104297</a> - [i965] Downward causes GPU hangs and misrendering on Haswell</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104602">Bug 104602</a> - [apitrace] Graphical artifacts in Civilization VI on RX Vega</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107052">Bug 107052</a> - [Regression][bisected]. Crookz - The Big Heist Demo can't be launched despite the &quot;true&quot; flag in &quot;drirc&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107563">Bug 107563</a> - [RADV] Broken rendering in Unity demos</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108457">Bug 108457</a> - [OpenGL CTS] KHR-GL46.tessellation_shader.single.xfb_captures_data_from_correct_stage fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108999">Bug 108999</a> - Calculating the scissors fields when the y is flipped (0 on top) can generate negative numbers that will cause assertion failure later on.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109328">Bug 109328</a> - [BSW BXT GLK] dEQP-VK.subgroups.arithmetic.subgroup regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109443">Bug 109443</a> - Build failure with MSVC when using Scons &gt;= 3.0.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109451">Bug 109451</a> - [IVB,SNB] LINE_STRIPs following a TRIANGLE_FAN fail to use primitive restart</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109594">Bug 109594</a> - totem assert failure: totem: src/intel/genxml/gen9_pack.h:72: __gen_uint: Assertion `v &lt;= max' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109597">Bug 109597</a> - wreckfest issues with transparent objects &amp; skybox</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109601">Bug 109601</a> - [Regression] RuneLite GPU rendering broken on 18.3.x</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109698">Bug 109698</a> - dri.pc contents invalid when built with meson</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109735">Bug 109735</a> - [Regression] broken font with mesa_vulkan_overlay</li>

				</ul>

				<h2>Changes</h2>

				<p>Alok Hota (1):</p>

				<ul>

				  <li>swr/rast: bypass size limit for non-sampled textures</li>

				</ul>

				<p>Andrii Simiklit (1):</p>

				<ul>

				  <li>i965: re-emit index buffer state on a reset option change.</li>

				</ul>

				<p>Axel Davy (2):</p>

				<ul>

				  <li>st/nine: Ignore window size if error</li>

				  <li>st/nine: Ignore multisample quality level if no ms</li>

				</ul>

				<p>Bas Nieuwenhuizen (4):</p>

				<ul>

				  <li>radv: Sync ETC2 whitelisted devices.</li>

				  <li>radv: Fix float16 interpolation set up.</li>

				  <li>radv: Allow interpolation on non-float types.</li>

				  <li>radv: Interpolate less aggressively.</li>

				</ul>

				<p>Carlos Garnacho (1):</p>

				<ul>

				  <li>wayland/egl: Ensure EGL surface is resized on DRI update_buffers()</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>glsl/linker: Fix unmatched TCS outputs being reduced to local variable</li>

				</ul>

				<p>David Shao (1):</p>

				<ul>

				  <li>meson: ensure that xmlpool_options.h is generated for gallium targets that need it</li>

				</ul>

				<p>Eleni Maria Stea (1):</p>

				<ul>

				  <li>i965: fixed clamping in set_scissor_bits when the y is flipped</li>

				</ul>

				<p>Emil Velikov (7):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.3.4</li>

				  <li>meson: egl: correctly manage loader/xmlconfig</li>

				  <li>cherry-ignore: add 19.0 only anv/push buffer nominations</li>

				  <li>cherry-ignore: add gitlab-ci fixup commit</li>

				  <li>cherry-ignore: ignore glsl_types memory cleanup patch</li>

				  <li>cherry-ignore: add explicit 19.0 performance optimisations</li>

				  <li>Update version to 18.3.5</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>egl: fix libdrm-less builds</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>intel/fs: Implement extended strides greater than 4 for IR source regions.</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>intel/fs: nir_op_extract_i8 extracts a byte, not a word</li>

				  <li>intel/fs: Fix extract_u8 of an odd byte from a 64-bit integer</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>glsl: fix recording of variables for XFB in TCS shaders</li>

				</ul>

				<p>Jason Ekstrand (10):</p>

				<ul>

				  <li>intel/fs: Bail in optimize_extract_to_float if we have modifiers</li>

				  <li>compiler/types: Add a contains_64bit helper</li>

				  <li>nir/xfb: Properly align 64-bit values</li>

				  <li>nir/xfb: Work in terms of components rather than slots</li>

				  <li>nir/xfb: Handle compact arrays in gather_xfb_info</li>

				  <li>anv: Count surfaces for non-YCbCr images in GetDescriptorSetLayoutSupport</li>

				  <li>spirv: OpImageQueryLod requires a sampler</li>

				  <li>spirv: Pull offset/stride from the pointer for OpArrayLength</li>

				  <li>glsl/list: Add a list variant of insert_after</li>

				  <li>glsl/lower_vector_derefs: Don't use a temporary for TCS outputs</li>

				</ul>

				<p>Jose Maria Casanova Crespo (1):</p>

				<ul>

				  <li>glsl: TCS outputs can not be transform feedback candidates on GLES</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>scons: Workaround failures with MSVC when using SCons 3.0.[2-4].</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>genxml: add missing field values for 3DSTATE_SF</li>

				  <li>anv: advertise 8 subpixel precision bits</li>

				  <li>anv: destroy descriptor sets when pool gets reset</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>intel/fs: Fix opt_peephole_csel to not throw away saturates.</li>

				</ul>

				<p>Kevin Strasser (1):</p>

				<ul>

				  <li>egl/dri: Avoid out of bounds array access</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>intel: fix urb size for CFL GT1</li>

				</ul>

				<p>Marek Olšák (5):</p>

				<ul>

				  <li>radeonsi: add driconf option radeonsi_enable_nir</li>

				  <li>radeonsi: always enable NIR for Civilization 6 to fix corruption</li>

				  <li>driconf: add Civ6Sub executable for Civilization 6</li>

				  <li>tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics</li>

				  <li>radeonsi: compile clear and copy buffer compute shaders on demand</li>

				</ul>

				<p>Mauro Rossi (2):</p>

				<ul>

				  <li>android: anv: fix generated files depedencies (v2)</li>

				  <li>android: anv: fix libexpat shared dependency</li>

				</ul>

				<p>Ray Zhang (1):</p>

				<ul>

				  <li>glx: fix shared memory leak in X11</li>

				</ul>

				<p>Rhys Perry (2):</p>

				<ul>

				  <li>radv: bitcast 16-bit outputs to integers</li>

				  <li>radv: ensure export arguments are always float</li>

				</ul>

				<p>Samuel Pitoiset (8):</p>

				<ul>

				  <li>radv: write the alpha channel of MRT0 when alpha coverage is enabled</li>

				  <li>radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled</li>

				  <li>radv: fix clearing attachments in secondary command buffers</li>

				  <li>radv: fix out-of-bounds access when copying descriptors BO list</li>

				  <li>radv: don't copy buffer descriptors list for samplers</li>

				  <li>radv: properly align the fence and EOP bug VA on GFX9</li>

				  <li>radv: fix pointSizeRange limits</li>

				  <li>radv: always initialize HTILE when the src layout is UNDEFINED</li>

				</ul>

				<p>Sergii Romantsov (2):</p>

				<ul>

				  <li>dri: meson: do not prefix user provided dri-drivers-path</li>

				  <li>d3d: meson: do not prefix user provided d3d-drivers-path</li>

				</ul>

				<p>Tapani Pälli (3):</p>

				<ul>

				  <li>nir: initialize value in copy_prop_vars_block</li>

				  <li>anv: retain the is_array state in create_plane_tex_instr_implicit</li>

				  <li>anv: destroy descriptor sets when pool gets destroyed</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>glsl: fix shader cache for packed param list</li>

				</ul>

				<p>Yevhenii Kolesnikov (1):</p>

				<ul>

				  <li>i965: Fix allow_higher_compat_version workaround limited by OpenGL 3.0</li>

				</ul>

				<p>pal1000 (1):</p>

				<ul>

				  <li>scons: Compatibility with Scons development version string</li>

				</ul>

				</div>

				</body>

				</html>

									
										169

docs/relnotes/18.3.6.html
									
										Normal file
									
												
				@@ -0,0 +1,169 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.3.6 Release Notes / April 5, 2019</h1>

				<p>

				Mesa 18.3.6 is a bug fix release which fixes bugs found since the 18.3.5 release.

				</p>

				<p>

				Mesa 18.3.6 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				4619d92afadf7072f7956599a2ccd0934fc45b4ddbc2eb865bdcb50ddf963f87  mesa-18.3.6.tar.gz

				aaf17638dcf5a90b93b6389e152fdc9ef147768b09598f24d2c5cf482fcfc705  mesa-18.3.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100316">Bug 100316</a> - Linking GLSL 1.30 shaders with invariant and deprecated variables triggers an 'mismatching invariant qualifiers' error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108766">Bug 108766</a> - Mesa built with meson has RPATH entries</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109648">Bug 109648</a> - AMD Raven hang during va-api decoding</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109980">Bug 109980</a> - [i915 CI][HSW] spec&#64;arb_fragment_shader_interlock&#64;arb_fragment_shader_interlock-image-load-store - fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110211">Bug 110211</a> - If DESTDIR is set to an empty string, the dri drivers are not installed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110221">Bug 110221</a> - build error with meson</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110259">Bug 110259</a> - radv: Sampling depth-stencil image in GENERAL layout returns nothing but zero (regression, bisected)</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (4):</p>

				<ul>

				  <li>glsl: correctly validate component layout qualifier for dvec{3,4}</li>

				  <li>glsl/linker: don't fail non static used inputs without matching outputs</li>

				  <li>glsl/linker: simplify xfb_offset vs xfb_stride overflow check</li>

				  <li>Revert "glsl: relax input-&gt;output validation for SSO programs"</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>radv: Use correct image view comparison for fast clears.</li>

				  <li>ac/nir: Return frag_coord as integer.</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>glsl: Cross validate variable's invariance by explicit invariance only</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>softpipe: fix texture view crashes</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>bin/install_megadrivers.py: Correctly handle DESTDIR=''</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.3.5</li>

				  <li>Update version to 18.3.6</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>dri3: Return the current swap interval from glXGetSwapIntervalMESA().</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>meson: strip rpath from megadrivers</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>anv/pass: Flag the need for a RT flush for resolve attachments</li>

				  <li>Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir"</li>

				</ul>

				<p>Józef Kucia (2):</p>

				<ul>

				  <li>mesa: Fix GL_NUM_DEVICE_UUIDS_EXT</li>

				  <li>radv: Fix driverUUID</li>

				</ul>

				<p>Leo Liu (2):</p>

				<ul>

				  <li>radeon/vcn: add H.264 constrained baseline support</li>

				  <li>radeon/vcn/vp9: search the render target from the whole list</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: fix assertion failure by using the correct type</li>

				</ul>

				<p>Mark Janes (1):</p>

				<ul>

				  <li>mesa: properly report the length of truncated log messages</li>

				</ul>

				<p>Plamena Manolova (1):</p>

				<ul>

				  <li>i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: fix binding transform feedback buffers</li>

				  <li>radv: do not always initialize HTILE in compressed state</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>anv/radv: release memory allocated by glsl types during spirv_to_nir</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>st/glsl_to_nir: fix incorrect arrary access</li>

				</ul>

				<p>Tobias Klausmann (1):</p>

				<ul>

				  <li>vulkan/util: meson build - add wayland client include</li>

				</ul>

				</div>

				</body>

				</html>

2403

docs/relnotes/19.0.0.html


File diff suppressed because it is too large

									
										159

docs/relnotes/19.0.1.html
									
										Normal file
									
												
				@@ -0,0 +1,159 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.0.1 Release Notes / March 27, 2019</h1>

				<p>

				Mesa 19.0.1 is a bug fix release which fixes bugs found since the 19.0.0 release.

				</p>

				<p>

				Mesa 19.0.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				f1dd1980ed628edea3935eed7974fbc5d8353e9578c562728b880d63ac613dbd  mesa-19.0.1.tar.gz

				6884163c0ea9e4c98378ab8fecd72fe7b5f437713a14471beda378df247999d4  mesa-19.0.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100316">Bug 100316</a> - Linking GLSL 1.30 shaders with invariant and deprecated variables triggers an 'mismatching invariant qualifiers' error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107563">Bug 107563</a> - [RADV] Broken rendering in Unity demos</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109698">Bug 109698</a> - dri.pc contents invalid when built with meson</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109980">Bug 109980</a> - [i915 CI][HSW] spec&#64;arb_fragment_shader_interlock&#64;arb_fragment_shader_interlock-image-load-store - fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110211">Bug 110211</a> - If DESTDIR is set to an empty string, the dri drivers are not installed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110221">Bug 110221</a> - build error with meson</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (4):</p>

				<ul>

				  <li>glsl: correctly validate component layout qualifier for dvec{3,4}</li>

				  <li>glsl/linker: don't fail non static used inputs without matching outputs</li>

				  <li>glsl/linker: simplify xfb_offset vs xfb_stride overflow check</li>

				  <li>Revert "glsl: relax input-&gt;output validation for SSO programs"</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>radv: Use correct image view comparison for fast clears.</li>

				  <li>ac/nir: Return frag_coord as integer.</li>

				</ul>

				<p>Danylo Piliaiev (2):</p>

				<ul>

				  <li>anv: Treat zero size XFB buffer as disabled</li>

				  <li>glsl: Cross validate variable's invariance by explicit invariance only</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>softpipe: fix texture view crashes</li>

				</ul>

				<p>Dylan Baker (5):</p>

				<ul>

				  <li>docs: Add SHA256 sums for 19.0.0</li>

				  <li>cherry-ignore: Add commit that doesn't apply</li>

				  <li>bin/install_megadrivers.py: Correctly handle DESTDIR=''</li>

				  <li>bin/install_megadrivers.py: Fix regression for set DESTDIR</li>

				  <li>bump version for 19.0.1</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>v3d: Fix leak of the renderonly struct on screen destruction.</li>

				</ul>

				<p>Jason Ekstrand (6):</p>

				<ul>

				  <li>glsl/lower_vector_derefs: Don't use a temporary for TCS outputs</li>

				  <li>glsl/list: Add a list variant of insert_after</li>

				  <li>anv/pass: Flag the need for a RT flush for resolve attachments</li>

				  <li>nir/builder: Add a vector extract helper</li>

				  <li>nir: Add a new pass to lower array dereferences on vectors</li>

				  <li>intel/nir: Lower array-deref-of-vector UBO and SSBO loads</li>

				</ul>

				<p>Józef Kucia (2):</p>

				<ul>

				  <li>radv: Fix driverUUID</li>

				  <li>mesa: Fix GL_NUM_DEVICE_UUIDS_EXT</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>intel/fs: Fix opt_peephole_csel to not throw away saturates.</li>

				</ul>

				<p>Kevin Strasser (1):</p>

				<ul>

				  <li>egl/dri: Avoid out of bounds array access</li>

				</ul>

				<p>Mark Janes (1):</p>

				<ul>

				  <li>mesa: properly report the length of truncated log messages</li>

				</ul>

				<p>Plamena Manolova (1):</p>

				<ul>

				  <li>i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9</li>

				</ul>

				<p>Samuel Pitoiset (3):</p>

				<ul>

				  <li>radv: set the maximum number of IBs per submit to 192</li>

				  <li>radv: always initialize HTILE when the src layout is UNDEFINED</li>

				  <li>radv: fix binding transform feedback buffers</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>d3d: meson: do not prefix user provided d3d-drivers-path</li>

				</ul>

				<p>Tapani Pälli (2):</p>

				<ul>

				  <li>isl: fix automake build when sse41 is not supported</li>

				  <li>anv/radv: release memory allocated by glsl types during spirv_to_nir</li>

				</ul>

				</div>

				</body>

				</html>

									
										122

docs/relnotes/19.0.2.html
									
										Normal file
									
												
				@@ -0,0 +1,122 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.0.2 Release Notes / April 10, 2019</h1>

				<p>

				Mesa 19.0.2 is a bug fix release which fixes bugs found since the 19.0.1 release.

				</p>

				<p>

				Mesa 19.0.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				SHA256: eb972fc11d4e1261d34ec0b91a701f158d4870c0428fb108353ae7eab64b1118  mesa-19.0.2.tar.gz

				SHA256: 1a2edc3ce56906a676c91e6851298db45903df1f5cb9827395a922c1452db802  mesa-19.0.2.tar.xz

				</pre>

				<h2>New features</h2>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108766">Bug 108766</a> - Mesa built with meson has RPATH entries</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109648">Bug 109648</a> - AMD Raven hang during va-api decoding</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110257">Bug 110257</a> - Major artifacts in mpeg2 vaapi hw decoding</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110259">Bug 110259</a> - radv: Sampling depth-stencil image in GENERAL layout returns nothing but zero (regression, bisected)</li>

				</ul>

				<h2>Changes</h2>

				<p>Boyuan Zhang (1):</p>

				<ul>

				  <li>st/va: reverse qt matrix back to its original order</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>nir: Take if_uses into account when repairing SSA</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>docs: Add SHA256 sums for mesa 19.0.1</li>

				  <li>VERSION: bump version for 19.0.2</li>

				</ul>

				<p>Eric Anholt (3):</p>

				<ul>

				  <li>dri3: Return the current swap interval from glXGetSwapIntervalMESA().</li>

				  <li>v3d: Bump the maximum texture size to 4k for V3D 4.x.</li>

				  <li>v3d: Don't try to use the TFU blit path if a scissor is enabled.</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>meson: strip rpath from megadrivers</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir"</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nir/print: fix printing the image_array intrinsic index</li>

				</ul>

				<p>Leo Liu (2):</p>

				<ul>

				  <li>radeon/vcn: add H.264 constrained baseline support</li>

				  <li>radeon/vcn/vp9: search the render target from the whole list</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>intel: add dependency on genxml generated files</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: fix assertion failure by using the correct type</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: skip updating depth/color metadata for conditional rendering</li>

				  <li>radv: do not always initialize HTILE in compressed state</li>

				</ul>

				</div>

				</body>

				</html>

									
										148

docs/relnotes/19.0.3.html
									
										Normal file
									
												
				@@ -0,0 +1,148 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.0.3 Release Notes / April 24, 2019</h1>

				<p>

				Mesa 19.0.3 is a bug fix release which fixes bugs found since the 19.0.2 release.

				</p>

				<p>

				Mesa 19.0.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				59543ec3c9f8c72990e77887f13d1678cb6739e5d5f56abc21ebf9e772389c5e  mesa-19.0.3.tar.gz

				f027244e38dc309a4c12db45ef79be81ab62c797a50a88d566e4edb6159fc4d5  mesa-19.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>N/A</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108879">Bug 108879</a> - [CIK] [regression] All opencl apps hangs indefinitely in si_create_context</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110201">Bug 110201</a> - [ivb] mesa 19.0.0 breaks rendering in kitty</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110356">Bug 110356</a> - install_megadrivers.py creates new dangling symlink [bisected]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110441">Bug 110441</a> - [llvmpipe] complex-loop-analysis-bug regression</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (1):</p>

				<ul>

				  <li>glsl/linker: location aliasing requires types to have the same width</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>ac: Move has_local_buffers disable to radeonsi.</li>

				</ul>

				<p>Chia-I Wu (1):</p>

				<ul>

				  <li>virgl: fix fence fd version check</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>intel/compiler: Do not reswizzle dst if instruction writes to flag register</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for 19.0.2</li>

				  <li>Bump version for 19.0.3</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>nir: Fix deref offset calculation for structs.</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>meson: remove meson-created megadrivers symlinks</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>anv/pipeline: Fix MEDIA_VFE_STATE::PerThreadScratchSpace on gen7</li>

				  <li>anv: Add a #define for the max binding table size</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>meson: Add dependency on genxml to anvil genfiles</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Set location on structure-split sampler uniform variables</li>

				  <li>Revert "glsl: Set location on structure-split sampler uniform variables"</li>

				</ul>

				<p>Lionel Landwerlin (2):</p>

				<ul>

				  <li>anv: fix uninitialized pthread cond clock domain</li>

				  <li>intel/devinfo: fix missing num_thread_per_eu on ICL</li>

				</ul>

				<p>Lubomir Rintel (2):</p>

				<ul>

				  <li>gallivm: guess CPU features also on ARM</li>

				  <li>gallivm: disable NEON instructions if they are not supported</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: use CP DMA for the null const buffer clear on CIK</li>

				</ul>

				<p>Rhys Perry (1):</p>

				<ul>

				  <li>nir,ac/nir: fix cube_face_coord</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>gallivm: fix bogus assert in get_indirect_index</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+</li>

				  <li>radv: do not load vertex attributes that are not provided by the pipeline</li>

				</ul>

				</div>

				</body>

				</html>

4610

docs/relnotes/19.1.0.html Normal file


File diff suppressed because it is too large

									
										154

docs/relnotes/19.1.1.html
									
										Normal file
									
												
				@@ -0,0 +1,154 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.1 Release Notes / June 25, 2019</h1>

				<p>

				Mesa 19.1.1 is a bug fix release which fixes bugs found since the 19.1.0 release.

				</p>

				<p>

				Mesa 19.1.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				72114b16b4a84373b2acda060fe2bb1d45ea2598efab3ef2d44bdeda74f15581  mesa-19.1.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110709">Bug 110709</a> - g_glxglvnddispatchfuncs.c and glxglvnd.c fail to build with clang 8.0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110901">Bug 110901</a> - mesa-19.1.0/src/util/futex.h:82: use of out of scope variable ?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110902">Bug 110902</a> - mesa-19.1.0/src/broadcom/compiler/vir_opt_redundant_flags.c:104]: (style) Same expression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110921">Bug 110921</a> - virgl on OpenGL 3.3 host regressed to OpenGL 2.1</li>

				</ul>

				<h2>Changes</h2>

				<p>Alejandro Piñeiro (1):</p>

				<ul>

				  <li>v3d: fix checking twice auf flag</li>

				</ul>

				<p>Bas Nieuwenhuizen (5):</p>

				<ul>

				  <li>radv: Skip transitions coming from external queue.</li>

				  <li>radv: Decompress DCC when the image format is not allowed for buffers.</li>

				  <li>radv: Fix vulkan build in meson.</li>

				  <li>anv: Fix vulkan build in meson.</li>

				  <li>meson: Allow building radeonsi with just the android platform.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>nouveau: fix frees in unsupported IR error paths.</li>

				</ul>

				<p>Eduardo Lima Mitev (1):</p>

				<ul>

				  <li>freedreno/a5xx: Fix indirect draw max_indices calculation</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>util/futex: fix dangling pointer use</li>

				  <li>glx: fix glvnd pointer types</li>

				  <li>util/os_file: resize buffer to what was actually needed</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>virgl: Assume sRGB write control for older guest kernels or virglrenderer hosts</li>

				</ul>

				<p>Haihao Xiang (1):</p>

				<ul>

				  <li>i965: support UYVY for external import only</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>anv: Set STATE_BASE_ADDRESS upper bounds on gen7</li>

				</ul>

				<p>Juan A. Suarez Romero (2):</p>

				<ul>

				  <li>docs: Add SHA256 sums for 19.1.0</li>

				  <li>Update version to 19.1.1</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Fix out of bounds read in shader_cache_read_program_metadata</li>

				  <li>iris: Fix iris_flush_and_dirty_history to actually dirty history.</li>

				</ul>

				<p>Kevin Strasser (2):</p>

				<ul>

				  <li>gallium/winsys/kms: Fix dumb buffer bpp</li>

				  <li>st/mesa: Add rgbx handling for fp formats</li>

				</ul>

				<p>Lionel Landwerlin (2):</p>

				<ul>

				  <li>anv: do not parse genxml data without INTEL_DEBUG=bat</li>

				  <li>intel/dump: fix segfault when the app hasn't accessed the device</li>

				</ul>

				<p>Mathias Fröhlich (1):</p>

				<ul>

				  <li>egl: Don't add hardware device if there is no render node v2.</li>

				</ul>

				<p>Richard Thier (1):</p>

				<ul>

				  <li>r300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno/a6xx: un-swap X24S8_UINT</li>

				</ul>

				<p>Samuel Pitoiset (4):</p>

				<ul>

				  <li>radv: fix occlusion queries on VegaM</li>

				  <li>radv: fix VK_EXT_memory_budget if one heap isn't available</li>

				  <li>radv: fix FMASK expand with SRGB formats</li>

				  <li>radv: disable viewport clamping even if FS doesn't write Z</li>

				</ul>

				</div>

				</body>

				</html>

									
										194

docs/relnotes/19.1.2.html
									
										Normal file
									
												
				@@ -0,0 +1,194 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.2 Release Notes / July 9, 2019</h1>

				<p>

				Mesa 19.1.2 is a bug fix release which fixes bugs found since the 19.1.1 release.

				</p>

				<p>

				Mesa 19.1.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				813a144ea8ebefb7b48b6733f3f603855b0f61268d86cc1cc26a6b4be908fcfd  mesa-19.1.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110702">Bug 110702</a> - segfault in radeonsi HEVC hardware decoding with yuv420p10le</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110783">Bug 110783</a> - Mesa 19.1 rc crashing MPV with VAAPI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110944">Bug 110944</a> - [Bisected] Blender 2.8 crashes when closing certain windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110953">Bug 110953</a> - Adding a redundant single-iteration do-while loop causes different image to be rendered</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110999">Bug 110999</a> - 19.1.0: assert in vkAllocateDescriptorSets using immutable samplers on Ivy Bridge</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111019">Bug 111019</a> - radv doesn't handle variable descriptor count properly</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (3):</p>

				<ul>

				  <li>Revert "i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch"</li>

				  <li>Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch"</li>

				  <li>Revert "iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch"</li>

				</ul>

				<p>Arfrever Frehtes Taifersar Arahesis (1):</p>

				<ul>

				  <li>meson: Improve detection of Python when using Meson &gt;=0.50.</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>radv: Only allocate supplied number of descriptors when variable.</li>

				  <li>radv: Fix interactions between variable descriptor count and inline uniform blocks.</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>meson: Add support for using cmake for finding LLVM</li>

				  <li>Revert "meson: Add support for using cmake for finding LLVM"</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>freedreno: Fix UBO load range detection on booleans.</li>

				  <li>freedreno: Fix up end range of unaligned UBO loads.</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>meson: bump required libdrm version to 2.4.81</li>

				</ul>

				<p>Gert Wollny (2):</p>

				<ul>

				  <li>gallium: Add CAP for opcode DIV</li>

				  <li>vl: Use CS composite shader only if TEX_LZ and DIV are supported</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>glsl: Don't increase the iteration count when there are no terminators</li>

				</ul>

				<p>James Clarke (1):</p>

				<ul>

				  <li>meson: GNU/kFreeBSD has DRM/KMS and requires -D_GNU_SOURCE</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>anv/descriptor_set: Only write texture swizzles if we have an image view</li>

				  <li>iris: Use a uint16_t for key sizes</li>

				</ul>

				<p>Jory Pratt (2):</p>

				<ul>

				  <li>util: Heap-allocate 256K zlib buffer</li>

				  <li>meson: Search for execinfo.h</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 19.1.1</li>

				  <li>intel: fix wrong format usage</li>

				  <li>Update version to 19.1.2</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS</li>

				  <li>gallium: Make util_copy_image_view handle shader_access</li>

				</ul>

				<p>Lionel Landwerlin (2):</p>

				<ul>

				  <li>intel/compiler: fix derivative on y axis implementation</li>

				  <li>intel/compiler: don't use byte operands for src1 on ICL</li>

				</ul>

				<p>Nanley Chery (2):</p>

				<ul>

				  <li>intel: Add and use helpers for level0 extent</li>

				  <li>isl: Don't align phys_level0_sa by block dimension</li>

				</ul>

				<p>Nataraj Deshpande (1):</p>

				<ul>

				  <li>anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format</li>

				</ul>

				<p>Pierre-Eric Pelloux-Prayer (2):</p>

				<ul>

				  <li>mesa: delete framebuffer texture attachment sampler views</li>

				  <li>radeon/uvd: fix calc_ctx_size_h265_main10</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno/a5xx: fix batch leak in fd5 blitter path</li>

				</ul>

				<p>Sagar Ghuge (1):</p>

				<ul>

				  <li>glsl: Fix round64 conversion function</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>i965: leaking of upload-BO with push constants</li>

				</ul>

				<p>Ville Syrjälä (1):</p>

				<ul>

				  <li>anv/cmd_buffer: Reuse gen8 Cmd{Set, Reset}Event on gen7</li>

				</ul>

				</div>

				</body>

				</html>

									
										191

docs/relnotes/19.1.3.html
									
										Normal file
									
												
				@@ -0,0 +1,191 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.3 Release Notes / July 23, 2019</h1>

				<p>

				Mesa 19.1.3 is a bug fix release which fixes bugs found since the 19.1.2 release.

				</p>

				<p>

				Mesa 19.1.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				845460b2225d15c15d4a9743dec798ff0b7396b533011d43e774e67f7825b7e0  mesa-19.1.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109203">Bug 109203</a> - [cfl dxvk] GPU Crash Launching Monopoly Plus (Iris Plus 655 / Wine + DXVK)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109524">Bug 109524</a> - &quot;Invalid glsl version in shading_language_version()&quot; when trying to run directX games using wine</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110309">Bug 110309</a> - [icl][bisected] regression on piglit arb_gpu_shader_int 64.execution.fs-ishl-then-* tests</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110663">Bug 110663</a> - threads_posix.h:96: undefined reference to `pthread_once'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110955">Bug 110955</a> - Mesa 18.2.8 implementation error: Invalid GLSL version in shading_language_version()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111010">Bug 111010</a> - Cemu Shader Cache Corruption Displaying Solid Color After commit 11e16ca7ce0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111071">Bug 111071</a> - SPIR-V shader processing fails with message about &quot;extra dangling SSA sources&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111075">Bug 111075</a> - Processing of SPIR-V shader causes device hang, sometimes leading to system reboot</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111097">Bug 111097</a> - Can not detect VK_ERROR_OUT_OF_DATE_KHR or VK_SUBOPTIMAL_KHR when window resizing</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Handle cmask being disallowed by addrlib.</li>

				  <li>anv: Add android dependencies on android.</li>

				  <li>radv: Only save the descriptor set if we have one.</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (2):</p>

				<ul>

				  <li>anv: Fix pool allocator when first alloc needs to grow</li>

				  <li>spirv: Fix stride calculation when lowering Workgroup to offsets</li>

				</ul>

				<p>Chia-I Wu (2):</p>

				<ul>

				  <li>anv: fix VkExternalBufferProperties for unsupported handles</li>

				  <li>anv: fix VkExternalBufferProperties for host allocation</li>

				</ul>

				<p>Connor Abbott (1):</p>

				<ul>

				  <li>nir: Add a helper to determine if an intrinsic can be reordered</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>radv: fix crash in shader tracing.</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>freedreno: Fix assertion failures in context setup in shader-db mode.</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>softpipe: Remove unused static function</li>

				</ul>

				<p>Ian Romanick (4):</p>

				<ul>

				  <li>intel/vec4: Reswizzle VF immediates too</li>

				  <li>nir: Add unit tests for nir_opt_comparison_pre</li>

				  <li>nir: Use nir_src_bit_size instead of alu1-&gt;dest.dest.ssa.bit_size</li>

				  <li>mesa: Set minimum possible GLSL version</li>

				</ul>

				<p>Jason Ekstrand (13):</p>

				<ul>

				  <li>nir/instr_set: Expose nir_instrs_equal()</li>

				  <li>nir/loop_analyze: Fix phi-of-identical-alu detection</li>

				  <li>nir: Add more helpers for working with const values</li>

				  <li>nir/loop_analyze: Handle bit sizes correctly in calculate_iterations</li>

				  <li>nir/loop_analyze: Bail if we encounter swizzles</li>

				  <li>anv: Set Stateless Data Port Access MOCS</li>

				  <li>nir/opt_if: Clean up single-src phis in opt_if_loop_terminator</li>

				  <li>nir,intel: Add support for lowering 64-bit nir_opt_extract_*</li>

				  <li>anv: Account for dynamic stencil write disables in the PMA fix</li>

				  <li>nir/regs_to_ssa: Handle regs in phi sources properly</li>

				  <li>nir/loop_analyze: Refactor detection of limit vars</li>

				  <li>nir: Add some helpers for chasing SSA values properly</li>

				  <li>nir/loop_analyze: Properly handle swizzles in loop conditions</li>

				</ul>

				<p>Juan A. Suarez Romero (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 19.1.2</li>

				  <li>Update version to 19.1.3</li>

				</ul>

				<p>Lepton Wu (1):</p>

				<ul>

				  <li>virgl: Set meta data for textures from handle.</li>

				</ul>

				<p>Lionel Landwerlin (6):</p>

				<ul>

				  <li>vulkan/overlay: fix command buffer stats</li>

				  <li>vulkan/overlay: fix crash on freeing NULL command buffer</li>

				  <li>anv: fix crash in vkCmdClearAttachments with unused attachment</li>

				  <li>vulkan/wsi: update swapchain status on vkQueuePresent</li>

				  <li>anv: report timestampComputeAndGraphics true</li>

				  <li>anv: fix format mapping for depth/stencil formats</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: don't set READ_ONLY for const_uploader to fix bindless texture hangs</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>anv: fix alphaToCoverage when there is no color attachment</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix VGT_GS_MODE if VS uses the primitive ID</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>meta: memory leak of CopyPixels usage</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: save/restore SSO flag when using ARB_get_program_binary</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>meson: Add dep_thread dependency.</li>

				</ul>

				<p>Yevhenii Kolesnikov (1):</p>

				<ul>

				  <li>meta: leaking of BO with DrawPixels</li>

				</ul>

				</div>

				</body>

				</html>

									
										227

docs/relnotes/19.1.4.html
									
										Normal file
									
												
				@@ -0,0 +1,227 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.4 Release Notes / August 7, 2019</h1>

				<p>

				Mesa 19.1.4 is a bug fix release which fixes bugs found since the 19.1.3 release.

				</p>

				<p>

				Mesa 19.1.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				a6d268a7d9edcfd92b6da80f2e34e6e0a7baaa442efbeba2fc66c404943c6bfb  mesa-19.1.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109203">Bug 109203</a> - [cfl dxvk] GPU Crash Launching Monopoly Plus (Iris Plus 655 / Wine + DXVK)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109524">Bug 109524</a> - &quot;Invalid glsl version in shading_language_version()&quot; when trying to run directX games using wine</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110309">Bug 110309</a> - [icl][bisected] regression on piglit arb_gpu_shader_int 64.execution.fs-ishl-then-* tests</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110663">Bug 110663</a> - threads_posix.h:96: undefined reference to `pthread_once'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110955">Bug 110955</a> - Mesa 18.2.8 implementation error: Invalid GLSL version in shading_language_version()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111010">Bug 111010</a> - Cemu Shader Cache Corruption Displaying Solid Color After commit 11e16ca7ce0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111071">Bug 111071</a> - SPIR-V shader processing fails with message about &quot;extra dangling SSA sources&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111075">Bug 111075</a> - Processing of SPIR-V shader causes device hang, sometimes leading to system reboot</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111097">Bug 111097</a> - Can not detect VK_ERROR_OUT_OF_DATE_KHR or VK_SUBOPTIMAL_KHR when window resizing</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Rodriguez (1):</p>

				<ul>

				  <li>radv: fix queries with WAIT_BIT returning VK_NOT_READY</li>

				</ul>

				<p>Andrii Simiklit (2):</p>

				<ul>

				  <li>intel/compiler: don't use a keyword struct for a class fs_reg</li>

				  <li>meson: add a warning for meson &lt; 0.46.0</li>

				</ul>

				<p>Arcady Goldmints-Orlov (1):</p>

				<ul>

				  <li>anv: report HOST_ALLOCATION as supported for images</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Set correct metadata size for GFX9+.</li>

				  <li>radv: Take variable descriptor counts into account for buffer entries.</li>

				  <li>radv: Fix descriptor set allocation failure.</li>

				</ul>

				<p>Boyuan Zhang (4):</p>

				<ul>

				  <li>radeon/uvd: fix poc for hevc encode</li>

				  <li>radeon/vcn: fix poc for hevc encode</li>

				  <li>radeon/uvd: enable rate control for hevc encoding</li>

				  <li>radeon/vcn: enable rate control for hevc encoding</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>anv: Remove special allocation for anv_push_constants</li>

				</ul>

				<p>Connor Abbott (1):</p>

				<ul>

				  <li>nir: Allow qualifiers on copy_deref and image instructions</li>

				</ul>

				<p>Daniel Schürmann (1):</p>

				<ul>

				  <li>spirv: Fix order of barriers in SpvOpControlBarrier</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>st/nir: fix arb fragment stage conversion</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>meson: allow building all glx without any drivers</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>egl/drm: ensure the backing gbm is set before using it</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>freedreno: Fix data races with allocating/freeing struct ir3.</li>

				</ul>

				<p>Eric Engestrom (5):</p>

				<ul>

				  <li>nir: don't return void</li>

				  <li>util: fix no-op macro (bad number of arguments)</li>

				  <li>gallium+mesa: fix tgsi_semantic array type</li>

				  <li>scons+meson: suppress spammy build warning on MacOS</li>

				  <li>nir: remove explicit nir_intrinsic_index_flag values</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>intel/ir: Fix CFG corruption in opt_predicated_break().</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>gallium/vl: fix compute tgsi shaders to not process undefined components</li>

				  <li>nv50,nvc0: update sampler/view bind functions to accept NULL array</li>

				  <li>nvc0: allow a non-user buffer to be bound at position 0</li>

				  <li>nv50/ir: handle insn not being there for definition of CVT arg</li>

				</ul>

				<p>Jason Ekstrand (6):</p>

				<ul>

				  <li>intel/fs: Stop stack allocating large arrays</li>

				  <li>anv: Disable transform feedback on gen7</li>

				  <li>isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW</li>

				  <li>anv: Don't claim support for 24 and 48-bit formats on IVB</li>

				  <li>intel/fs: Use ALIGN16 instructions for all derivatives on gen &lt;= 7</li>

				  <li>intel/fs: Implement quad_swap_horizontal with a swizzle on gen7</li>

				</ul>

				<p>Juan A. Suarez Romero (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 19.1.3</li>

				  <li>Update version to 19.1.4</li>

				</ul>

				<p>Kenneth Graunke (4):</p>

				<ul>

				  <li>mesa: Fix ReadBuffers with pbuffers</li>

				  <li>egl: Quiet warning about front buffer rendering for pixmaps/pbuffers</li>

				  <li>egl: Make the 565 pbuffer-only config single buffered.</li>

				  <li>egl: Only expose 565 pbuffer configs if X can export them as DRI3 images</li>

				</ul>

				<p>Lionel Landwerlin (5):</p>

				<ul>

				  <li>anv: fix use of comma operator</li>

				  <li>nir: add access to image_deref intrinsics</li>

				  <li>spirv: wrap push ssa/pointer values</li>

				  <li>spirv: propagate access qualifiers through ssa &amp; pointer</li>

				  <li>spirv: don't discard access set by vtn_pointer_dereference</li>

				</ul>

				<p>Mark Menzynski (1):</p>

				<ul>

				  <li>nvc0/ir: Fix assert accessing null pointer</li>

				</ul>

				<p>Nataraj Deshpande (1):</p>

				<ul>

				  <li>egl/android: Update color_buffers querying for buffer age</li>

				</ul>

				<p>Nicolas Dufresne (1):</p>

				<ul>

				  <li>egl: Also query modifiers when exporting DMABuf</li>

				</ul>

				<p>Rhys Perry (1):</p>

				<ul>

				  <li>ac/nir: fix txf_ms with an offset</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix crash in vkCmdClearAttachments with unused attachment</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>mesa: add glsl_type ref to one_time_init and decref to atexit</li>

				</ul>

				<p>Yevhenii Kolesnikov (1):</p>

				<ul>

				  <li>main: Fix memleaks in mesa_use_program</li>

				</ul>

				</div>

				</body>

				</html>

									
										119

docs/relnotes/19.1.5.html
									
										Normal file
									
												
				@@ -0,0 +1,119 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.5 Release Notes / August 23, 2019</h1>

				<p>

				Mesa 19.1.5 is a bug fix release which fixes bugs found since the 19.1.4 release.

				</p>

				<p>

				Mesa 19.1.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				7b54e14e35c7251b171b4cf9d84cbc1d760eafe00132117db193454999cd6eb4  mesa-19.1.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109630">Bug 109630</a> - vkQuake flickering geometry under Intel</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110395">Bug 110395</a> - Shadows are flickering in SuperTuxKart</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111113">Bug 111113</a> - ANGLE BlitFramebufferTest.MultisampleDepthClear/ES3_OpenGL fails on Intel Ubuntu19.04</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111267">Bug 111267</a> - [CM246] Flickering with multiple draw calls within the same graphics pipeline if a compute pipeline is present</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (4):</p>

				<ul>

				  <li>radv: Do non-uniform lowering before bool lowering.</li>

				  <li>ac/nir: Use correct cast for readfirstlane and ptrs.</li>

				  <li>radv: Avoid binning RAVEN hangs.</li>

				  <li>radv: Avoid VEGA/RAVEN scissor bug in binning.</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>util: fix mem leak of program path</li>

				</ul>

				<p>Erik Faye-Lund (2):</p>

				<ul>

				  <li>gallium/dump: add missing query-type to short-list</li>

				  <li>gallium/dump: add missing query-type to short-list</li>

				</ul>

				<p>Greg V (2):</p>

				<ul>

				  <li>anv: remove unused Linux-specific include</li>

				  <li>intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 19.1.4</li>

				  <li>cherry-ignore: panfrost: Make ctx-&gt;job useful</li>

				  <li>Update version to 19.1.5</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>radeonsi: disable SDMA image copies on dGPUs to fix corruption in games</li>

				  <li>radeonsi: fix an assertion failure: assert(!res-&gt;b.is_shared)</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>meson: Test for program_invocation_name</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>i965/clear: clear_value better precision</li>

				</ul>

				</div>

				</body>

				</html>

									
										132

docs/relnotes/19.1.6.html
									
										Normal file
									
												
				@@ -0,0 +1,132 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.6 Release Notes / September 3, 2019</h1>

				<p>

				Mesa 19.1.6 is a bug fix release which fixes bugs found since the 19.1.5 release.

				</p>

				<p>

				Mesa 19.1.6 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				2a369b7b48545c6486e7e44913ad022daca097c8bd937bf30dcf3f17a94d3496  mesa-19.1.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104395">Bug 104395</a> - [CTS] GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels tests fail on 32bit Mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111213">Bug 111213</a> - VA-API nouveau SIGSEGV and asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111241">Bug 111241</a> - Shadertoy shader causing hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111411">Bug 111411</a> - SPIR-V shader leads to GPU hang, sometimes making machine unstable</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Rodriguez (1):</p>

				<ul>

				  <li>radv: additional query fixes</li>

				</ul>

				<p>Daniel Schürmann (1):</p>

				<ul>

				  <li>nir/lcssa: handle deref instructions properly</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled</li>

				  <li>intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>gallium/vl: use compute preference for all multimedia, not just blit</li>

				</ul>

				<p>Jonas Ådahl (1):</p>

				<ul>

				  <li>wayland/egl: Ensure correct buffer size when allocating</li>

				</ul>

				<p>Juan A. Suarez Romero (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 19.1.5</li>

				  <li>cherry-ignore: add explicit 19.2 only nominations</li>

				  <li>cherry-ignore: iris: Replace devinfo-&gt;gen with GEN_GEN</li>

				  <li>cherry-ignore: iris: Update fast clear colors on Gen9 with direct immediate writes.</li>

				  <li>cherry-ignore: iris: Avoid unnecessary resolves on transfer maps</li>

				  <li>Update version to 19.1.6</li>

				</ul>

				<p>Kenneth Graunke (6):</p>

				<ul>

				  <li>iris: Fix broken aux.possible/sampler_usages bitmask handling</li>

				  <li>iris: Drop copy format hacks from copy region based transfer path.</li>

				  <li>iris: Fix large timeout handling in rel2abs()</li>

				  <li>util: Add a _mesa_i64roundevenf() helper.</li>

				  <li>mesa: Fix _mesa_float_to_unorm() on 32-bit systems.</li>

				  <li>intel/compiler: Fix src0/desc setter ordering</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: fix scratch buffer WAVESIZE setting leading to corruption</li>

				</ul>

				<p>Paulo Zanoni (1):</p>

				<ul>

				  <li>intel/fs: grab fail_msg from v32 instead of v16 when v32-&gt;run_cs fails</li>

				</ul>

				<p>Pierre-Eric Pelloux-Prayer (1):</p>

				<ul>

				  <li>glsl: replace 'x + (-x)' with constant 0</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>egl: reset blob cache set/get functions on terminate</li>

				</ul>

				</div>

				</body>

				</html>

									
										157

docs/relnotes/19.1.7.html
									
										Normal file
									
												
				@@ -0,0 +1,157 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.7 Release Notes / September 17, 2019</h1>

				<p>

				Mesa 19.1.7 is a bug fix release which fixes bugs found since the 19.1.6 release.

				</p>

				<p>

				Mesa 19.1.7 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<p>

				Mesa 19.1.7 implements the Vulkan 1.1 API, but the version reported by

				the apiVersion property of the VkPhysicalDeviceProperties struct

				depends on the particular driver being used.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e287920fdb38712a9fed448dc90b3ca95048c7face5db52e58361f8b6e0f3cd5  mesa-19.1.7.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110814">Bug 110814</a> - KWin compositor crashes on launch</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111069">Bug 111069</a> - Assertion fails in nir_opt_remove_phis.c during compilation of SPIR-V shader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111271">Bug 111271</a> - Crash in eglMakeCurrent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111401">Bug 111401</a> - Vulkan overlay layer - async compute not supported, making overlay disappear in Doom</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111405">Bug 111405</a> - Some infinite 'do{}while' loops lead mesa to an infinite compilation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111467">Bug 111467</a> - WOLF RPG Editor + Gallium Nine Standalone: Rendering issue when using Iris driver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111552">Bug 111552</a> - Geekbench 5.0 Vulkan compute benchmark fails on Anvil</li>

				</ul>

				<h2>Changes</h2>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>glsl/nir: Avoid overflow when setting max_uniform_location</li>

				</ul>

				<p>Connor Abbott (1):</p>

				<ul>

				  <li>radv: Call nir_propagate_invariant()</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE</li>

				</ul>

				<p>Eric Engestrom (10):</p>

				<ul>

				  <li>ttn: fix 64-bit shift on 32-bit `1`</li>

				  <li>egl: fix deadlock in malloc error path</li>

				  <li>util/os_file: fix double-close()</li>

				  <li>anv: fix format string in error message</li>

				  <li>nir: fix memleak in error path</li>

				  <li>anv: add support for driconf</li>

				  <li>wsi: add minImageCount override</li>

				  <li>anv: add support for vk_x11_override_min_image_count</li>

				  <li>amd: move adaptive sync to performance section, as it is defined in xmlpool</li>

				  <li>radv: add support for vk_x11_override_min_image_count</li>

				</ul>

				<p>Erik Faye-Lund (2):</p>

				<ul>

				  <li>gallium/auxiliary/indices: consistently apply start only to input</li>

				  <li>util: fix SSE-version needed for double opcodes</li>

				</ul>

				<p>Hal Gentz (1):</p>

				<ul>

				  <li>glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX.</li>

				</ul>

				<p>Jason Ekstrand (7):</p>

				<ul>

				  <li>Revert "intel/fs: Move the scalar-region conversion to the generator."</li>

				  <li>anv: Bump maxComputeWorkgroupSize</li>

				  <li>nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block</li>

				  <li>nir: Add a block_is_unreachable helper</li>

				  <li>nir/repair_ssa: Repair dominance for unreachable blocks</li>

				  <li>nir/repair_ssa: Insert deref casts when needed</li>

				  <li>nir/dead_cf: Repair SSA if the pass makes progress</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 19.1.6</li>

				  <li>cherry-ignore: add explicit 19.2 only nominations</li>

				  <li>Update version to 19.1.7</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>gallium: Fix util_format_get_depth_only</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>vulkan/overlay: bounce image back to present layout</li>

				</ul>

				<p>Mauro Rossi (3):</p>

				<ul>

				  <li>android: radv: fix necessary dependecies</li>

				  <li>android: amd/common: fix missing include path</li>

				  <li>android: anv: libmesa_vulkan_common: add libmesa_util static dependency</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix allocating number of user sgprs if streamout is used</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>intel/dri: finish proper glthread</li>

				</ul>

				</div>

				</body>

				</html>

									
										267

docs/relnotes/19.1.8.html
									
										Normal file
									
												
				@@ -0,0 +1,267 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.1.8 Release Notes / October 21, 2019</h1>

				<p>

				Mesa 19.1.8 is a bug fix release which fixes bugs found since the 19.1.7 release.

				</p>

				<p>

				Mesa 19.1.8 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<p>

				Mesa 19.1.8 implements the Vulkan 1.1 API, but the version reported by

				the apiVersion property of the VkPhysicalDeviceProperties struct

				depends on the particular driver being used.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				f0fe8289b7d147943bf2fc2147833254881577e8f9ed3d94ddb39e430e711725  mesa-19.1.8.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111236">Bug 111236</a> - VA-API radeonsi SIGSEGV __memmove_avx_unaligned</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111664">Bug 111664</a> - [Bisected] Segmentation fault on FS shader compilation (mat4x3 * mat4x3)</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/121">Issue #121</a> - Shared Memeory leakage in XCreateDrawable</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/795">Issue #795</a> - Xorg does not render with mesa 19.1.7</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/939">Issue #939</a> - Meson can't find 32-bit libXvMCW in non-standard path</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/944">Issue #944</a> - Mesa doesn't build with current Scons version (3.1.0)</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1838">Issue #1838</a> - Mesa installs gl.pc and egl.pc even with libglvnd &gt;= 1.2.0</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1844">Issue #1844</a> - libXvMC-1.0.12 breaks mesa build</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1869">Issue #1869</a> - X server does not start with Mesa 19.2.0</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1872">Issue #1872</a> - [bisected] piglit spec.arb_texture_view.bug-layers-image causes gpu hangs on IVB</li>

				<li><a href="https://gitlab.freedesktop.org/mesa/mesa/issues/1878">Issue #1878</a> - meson.build:1447:6: ERROR: Problem encountered: libdrm required for gallium video statetrackers when using x11</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (1):</p>

				<ul>

				  <li>docs: Update bug report URLs for the gitlab migration</li>

				</ul>

				<p>Alan Coopersmith (5):</p>

				<ul>

				  <li>c99_compat.h: Don't try to use 'restrict' in C++ code</li>

				  <li>util: Make Solaris implemention of p_atomic_add work with gcc</li>

				  <li>util: Workaround lack of flock on Solaris</li>

				  <li>meson: recognize "sunos" as the system name for Solaris</li>

				  <li>intel/common: include unistd.h for ioctl() prototype on Solaris</li>

				</ul>

				<p>Andreas Gottschling (1):</p>

				<ul>

				  <li>drisw: Fix shared memory leak on drawable resize</li>

				</ul>

				<p>Andres Gomez (3):</p>

				<ul>

				  <li>docs: Add the maximum implemented Vulkan API version in 19.1 rel notes</li>

				  <li>docs/features: Update VK_KHR_display_swapchain status</li>

				  <li>egl: Remove the 565 pbuffer-only EGL config under X11.</li>

				</ul>

				<p>Andrii Simiklit (1):</p>

				<ul>

				  <li>glsl: disallow incompatible matrices multiplication</li>

				</ul>

				<p>Arcady Goldmints-Orlov (1):</p>

				<ul>

				  <li>anv: fix descriptor limits on gen8</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>tu: Set up glsl types.</li>

				  <li>radv: Add workaround for hang in The Surge 2.</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader</li>

				</ul>

				<p>Dylan Baker (5):</p>

				<ul>

				  <li>meson: fix logic for generating .pc files with old glvnd</li>

				  <li>meson: Try finding libxvmcw via pkg-config before using find_library</li>

				  <li>meson: Link xvmc with libxv</li>

				  <li>meson: gallium media state trackers require libdrm with x11</li>

				  <li>meson: Only error building gallium video without libdrm when the platform is drm</li>

				</ul>

				<p>Eric Engestrom (4):</p>

				<ul>

				  <li>gl: drop incorrect pkg-config file for glvnd</li>

				  <li>meson: re-add incorrect pkg-config files with GLVND for backward compatibility</li>

				  <li>util/anon_file: add missing #include</li>

				  <li>util/anon_file: const string param</li>

				</ul>

				<p>Erik Faye-Lund (1):</p>

				<ul>

				  <li>glsl: correct bitcast-helpers</li>

				</ul>

				<p>Greg V (1):</p>

				<ul>

				  <li>util: add anon_file.h for all memfd/temp file usage</li>

				</ul>

				<p>Haihao Xiang (1):</p>

				<ul>

				  <li>i965: support AYUV/XYUV for external import only</li>

				</ul>

				<p>Hal Gentz (1):</p>

				<ul>

				  <li>gallium/osmesa: Fix the inability to set no context as current.</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>nir/repair_ssa: Replace the unreachable check with the phi builder</li>

				  <li>intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates</li>

				</ul>

				<p>Juan A. Suarez Romero (11):</p>

				<ul>

				  <li>docs: add sha256 checksums for 19.1.7</li>

				  <li>cherry-ignore: add explicit 19.2 only nominations</li>

				  <li>cherry-ignore: add explicit 19.3 only nominations</li>

				  <li>Revert "Revert "intel/fs: Move the scalar-region conversion to the generator.""</li>

				  <li>cherry-ignore: Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"</li>

				  <li>bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars</li>

				  <li>cherry-ignore: nir/opt_large_constants: Handle store writemasks</li>

				  <li>cherry-ignore: util: added missing headers in anon-file</li>

				  <li>cherry-ignore: radv: Fix condition for skipping the continue CS.</li>

				  <li>cherry-ignore: Revert "radv: disable viewport clamping even if FS doesn't write Z"</li>

				  <li>Update version to 19.1.8</li>

				</ul>

				<p>Ken Mays (1):</p>

				<ul>

				  <li>haiku: fix Mesa build</li>

				</ul>

				<p>Kenneth Graunke (4):</p>

				<ul>

				  <li>iris: Initialize ice-&gt;state.prim_mode to an invalid value</li>

				  <li>intel: Increase Gen11 compute shader scratch IDs to 64.</li>

				  <li>iris: Disable CCS_E for 32-bit floating point textures.</li>

				  <li>iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.</li>

				</ul>

				<p>Lionel Landwerlin (5):</p>

				<ul>

				  <li>anv: gem-stubs: return a valid fd got anv_gem_userptr()</li>

				  <li>intel: use proper label for Comet Lake skus</li>

				  <li>mesa: don't forget to clear _Layer field on texture unit</li>

				  <li>intel: fix subslice computation from topology data</li>

				  <li>intel/isl: Set null surface format to R32_UINT</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>util: Drop preprocessor guards for glibc-2.12</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>radeonsi: fix VAAPI segfault due to various bugs</li>

				</ul>

				<p>Michel Zou (2):</p>

				<ul>

				  <li>scons: add py3 support</li>

				  <li>scons: For MinGW use -posix flag.</li>

				</ul>

				<p>Paulo Zanoni (1):</p>

				<ul>

				  <li>intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32</li>

				</ul>

				<p>Prodea Alexandru-Liviu (1):</p>

				<ul>

				  <li>scons/MSYS2-MinGW-W64: Fix build options defaults Signed-off-by: Prodea Alexandru-Liviu &lt;liviuprodea@yahoo.com&gt; Reviewed-by: Jose Fonseca &lt;jfonseca@vmware.com&gt; Cc: &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				</ul>

				<p>Rhys Perry (2):</p>

				<ul>

				  <li>radv: always emit a position export in gs copy shaders</li>

				  <li>nir/opt_remove_phis: handle phis with no sources</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>intel/nir: do not apply the fsin and fcos trig workarounds for consts</li>

				</ul>

				<p>Stephen Barber (1):</p>

				<ul>

				  <li>nouveau: add idep_nir_headers as dep for libnouveau</li>

				</ul>

				<p>Tapani Pälli (3):</p>

				<ul>

				  <li>iris: close screen fd on iris_destroy_screen</li>

				  <li>egl: check for NULL value like eglGetSyncAttribKHR does</li>

				  <li>util: fix os_create_anonymous_file on android</li>

				</ul>

				<p>pal1000 (2):</p>

				<ul>

				  <li>scons/windows: Support build with LLVM 9.</li>

				  <li>scons: Fix MSYS2 Mingw-w64 build.</li>

				</ul>

				</div>

				</body>

				</html>

									
										3

docs/shading.html
									
												
				@@ -59,6 +59,7 @@ execution.  These are generally used for debugging.

				<li><b>nopfrag</b> - force fragment shader to be a simple shader that passes

				    through the color attribute.

				<li><b>useprog</b> - log glUseProgram calls to stderr

				<li><b>errors</b> - GLSL compilation and link errors will be reported to stderr.

				</ul>

				<p>

				Example:  export MESA_GLSL=dump,nopt

				@@ -70,6 +71,7 @@ Shaders can be dumped and replaced on runtime for debugging purposes. This

				feature is not currently supported by SCons build.

				This is controlled via following environment variables:

				</p>

				<ul>

				<li><b>MESA_SHADER_DUMP_PATH</b> - path where shader sources are dumped

				<li><b>MESA_SHADER_READ_PATH</b> - path where replacement shaders are read

				@@ -78,7 +80,6 @@ Note, path set must exist before running for dumping or replacing to work.

				When both are set, these paths should be different so the dumped shaders do 

				not clobber the replacement shaders. Also, the filenames of the replacement shaders

				should match the filenames of the corresponding dumped shaders.

				</p>

				<h3 id="capture">Capturing Shaders</h3>

									
										2

docs/sourcetree.html
									
												
				@@ -158,7 +158,7 @@ each directory.

				  <ul>

				  <li><b>glx</b> - The GLX library code for building libGL using DRI drivers.

				  </ul>

				<li><b>lib</b> - hardlinks to most binaries as produced by <strong>make</strong>.

				<li><b>lib</b> - hardlinks to most binaries as produced by the build system.

				        These (shortcuts) are used for development purposes in conjunction with

				        LD_LIBRARY_PATH and/or LIBGL_DRIVERS_PATH.

				</ul>

									
										49

docs/submittingpatches.html
									
												
				@@ -141,7 +141,7 @@ do whatever testing is prudent.

				<p>

				You should always run the Mesa test suite before submitting patches.

				The test suite can be run using the 'make check' command. All tests

				The test suite can be run using the 'meson test' command. All tests

				must pass before patches will be accepted, this may mean you have

				to update the tests themselves.

				</p>

				@@ -160,10 +160,10 @@ to run your tests on each commit. Assuming your branch is based off

				<code>origin/master</code>, you can run:

				</p>

				<pre>

				$ git rebase --interactive --exec "make check" origin/master

				$ git rebase --interactive --exec "meson test -C build/" origin/master

				</pre>

				<p>

				replacing <code>"make check"</code> with whatever other test you want to

				replacing <code>"meson test"</code> with whatever other test you want to

				run.

				</p>

				@@ -228,14 +228,19 @@ your email administrator for this.)

				</p>

				<p>

				  Add labels to your MR to help reviewers find it. For example:

				  <ul>

				    <li>Mesa changes affecting all drivers: mesa

				    <li>Hardware vendor specific code: amd, intel, nvidia, ...

				    <li>Driver specific code: anvil, freedreno, i965, iris, radeonsi,

				      radv, vc4, ...

				    <li>Other tag examples: gallium, util

				  </ul>

				</p>

				<ul>

				  <li>Mesa changes affecting all drivers: mesa

				  <li>Hardware vendor specific code: amd, intel, nvidia, ...

				  <li>Driver specific code: anvil, freedreno, i965, iris, radeonsi,

				    radv, vc4, ...

				  <li>Other tag examples: gallium, util

				</ul>

				<p>

				  Tick the following when creating the MR. It allows developers to

				  rebase your work on top of master.

				</p>

				<pre>Allow commits from members who can merge to the target branch</pre>

				<p>

				  If you revise your patches based on code review and push an update

				  to your branch, you should maintain a <strong>clean</strong> history

				@@ -250,18 +255,18 @@ your email administrator for this.)

				</p>

				<p>

				  Some other notes:

				  <ul>

				    <li>Make changes and update your branch based on feedback

				    <li>Old, stale MR may be closed, but you can reopen it if you

				      still want to pursue the changes

				    <li>You should periodically check to see if your MR needs to be

				      rebased

				    <li>Make sure your MR is closed if your patches get pushed outside

				      of GitLab

				    <li>Please send MRs from a personal fork rather than from the main

				      Mesa repository, as it clutters it unnecessarily.

				  </ul>

				</p>

				<ul>

				  <li>Make changes and update your branch based on feedback

				  <li>Old, stale MR may be closed, but you can reopen it if you

				    still want to pursue the changes

				  <li>You should periodically check to see if your MR needs to be

				    rebased

				  <li>Make sure your MR is closed if your patches get pushed outside

				    of GitLab

				  <li>Please send MRs from a personal fork rather than from the main

				    Mesa repository, as it clutters it unnecessarily.

				</ul>

				<h2 id="reviewing">Reviewing Patches</h2>

				@@ -461,7 +466,7 @@ within the commit summary.

				</pre>

				<li>Test for build breakage between patches e.g last 8 commits.

				<pre>

				    git rebase -i --exec="make -j4" HEAD~8

				    git rebase -i --exec="ninja -C build/" HEAD~8

				</pre>

				<li>Sets the default mailing address for your repo.

				<pre>

									
										6

docs/versions.html
									
												
				@@ -14,15 +14,13 @@

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Mesa Version History</h1>

				<b>

				NOTE: Changes for Mesa 6.4 and later are documented in the corresponding

				<a href="relnotes.html">release notes</a> file.

				</b>

				<h1>Mesa Version History</h1>

				<h2>1.0 beta   February 1995</h2>

				<ul>

				<li>Initial release

									
										16

docs/vmware-guest.html
									
												
				@@ -30,6 +30,7 @@ MacOS are all supported.

				With the August 2015 Workstation 12 / Fusion 8 releases, OpenGL 3.3

				is supported in the guest.

				This requires:

				</p>

				<ul>

				<li>The VM is configured for virtual hardware version 12.

				<li>The host OS, GPU and graphics driver supports DX11 (Windows) or

				@@ -37,7 +38,6 @@ This requires:

				<li>On Linux, the vmwgfx kernel module must be version 2.9.0 or later.

				<li>A recent version of Mesa with the updated svga gallium driver.

				</ul>

				</p>

				<p>

				Otherwise, OpenGL 2.1 is supported.

				@@ -191,9 +191,9 @@ For 64-bit Fedora systems:

				<li>Build libdrm:

				  <pre>

				  cd $TOP/drm

				  ./autogen.sh --prefix=/usr --libdir=${LIBDIR}

				  make

				  sudo make install

				  meson builddir --prefix=/usr --libdir=${LIBDIR}

				  ninja -C builddir

				  sudo ninja -C builddir install

				  </pre>

				<li>Build Mesa and the vmwgfx_dri.so driver, the vmwgfx_drv.so xorg driver, the X acceleration library libxatracker.

				The vmwgfx_dri.so is used by the OpenGL libraries during direct rendering,

				@@ -204,9 +204,9 @@ copy and video acceleration:

				The following configure options doesn't build the EGL system.

				  <pre>

				  cd $TOP/mesa

				  ./autogen.sh --prefix=/usr --libdir=${LIBDIR} --with-gallium-drivers=svga --with-dri-drivers=swrast --enable-xa --disable-dri3 --enable-glx-tls

				  make

				  sudo make install

				  meson builddir --prefix=/usr --libdir=${LIBDIR} -Dgallium-drivers=svga -Ddri-drivers=swrast -Dgallium-xa=true -Ddri3=false

				  ninja -C builddir

				  sudo ninja -C builddir install

				  </pre>

				Note that you may have to install other packages that Mesa depends upon

				@@ -311,7 +311,7 @@ If OpenGL 3.3 is not working (you only get OpenGL 2.1):

				<li>Make sure the vmwgfx kernel module is version 2.9.0 or later.

				<li>Check the vmware.log file for errors.

				<li>Run 'dmesg | grep vmwgfx' and look for "DX: yes".

				</ul>

				</div>

				</body>

1772

include/CL/cl.h

File diff suppressed because it is too large

11745

include/CL/cl.hpp

File diff suppressed because it is too large

9690

include/CL/cl2.hpp Normal file

File diff suppressed because it is too large

									
										7

include/CL/cl_d3d10.h
									
												
				@@ -1,5 +1,5 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 * Copyright (c) 2008-2015 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

									
										7

include/CL/cl_d3d11.h
									
												
				@@ -1,5 +1,5 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 * Copyright (c) 2008-2015 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

									
										9

include/CL/cl_dx9_media_sharing.h
									
												
				@@ -1,5 +1,5 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 * Copyright (c) 2008-2015 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				@@ -33,7 +38,7 @@

				extern "C" {

				#endif

				/******************************************************************************

				/******************************************************************************/

				/* cl_khr_dx9_media_sharing                                                   */

				#define cl_khr_dx9_media_sharing 1

									
										182

include/CL/cl_dx9_media_sharing_intel.h
									
										Normal file
									
												
				@@ -0,0 +1,182 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2019 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 **********************************************************************************/

				/*****************************************************************************\

				Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.

				THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS

				"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT

				LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR

				A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS

				CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,

				EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,

				PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR

				PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY

				OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING

				NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE

				MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

				File Name: cl_dx9_media_sharing_intel.h

				Abstract:

				Notes:

				\*****************************************************************************/

				#ifndef __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H

				#define __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H

				#include <CL/cl.h>

				#include <CL/cl_platform.h>

				#include <d3d9.h>

				#include <dxvahd.h>

				#include <wtypes.h>

				#include <d3d9types.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/***************************************

				* cl_intel_dx9_media_sharing extension *

				****************************************/

				#define cl_intel_dx9_media_sharing 1

				typedef cl_uint cl_dx9_device_source_intel;

				typedef cl_uint cl_dx9_device_set_intel;

				/* error codes */

				#define CL_INVALID_DX9_DEVICE_INTEL                   -1010

				#define CL_INVALID_DX9_RESOURCE_INTEL                 -1011

				#define CL_DX9_RESOURCE_ALREADY_ACQUIRED_INTEL        -1012

				#define CL_DX9_RESOURCE_NOT_ACQUIRED_INTEL            -1013

				/* cl_dx9_device_source_intel */

				#define CL_D3D9_DEVICE_INTEL                          0x4022

				#define CL_D3D9EX_DEVICE_INTEL                        0x4070

				#define CL_DXVA_DEVICE_INTEL                          0x4071

				/* cl_dx9_device_set_intel */

				#define CL_PREFERRED_DEVICES_FOR_DX9_INTEL            0x4024

				#define CL_ALL_DEVICES_FOR_DX9_INTEL                  0x4025

				/* cl_context_info */

				#define CL_CONTEXT_D3D9_DEVICE_INTEL                  0x4026

				#define CL_CONTEXT_D3D9EX_DEVICE_INTEL                0x4072

				#define CL_CONTEXT_DXVA_DEVICE_INTEL                  0x4073

				/* cl_mem_info */

				#define CL_MEM_DX9_RESOURCE_INTEL                     0x4027

				#define CL_MEM_DX9_SHARED_HANDLE_INTEL                0x4074

				/* cl_image_info */

				#define CL_IMAGE_DX9_PLANE_INTEL                      0x4075

				/* cl_command_type */

				#define CL_COMMAND_ACQUIRE_DX9_OBJECTS_INTEL          0x402A

				#define CL_COMMAND_RELEASE_DX9_OBJECTS_INTEL          0x402B

				/******************************************************************************/

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetDeviceIDsFromDX9INTEL(

				    cl_platform_id              platform,

				    cl_dx9_device_source_intel  dx9_device_source,

				    void*                       dx9_object,

				    cl_dx9_device_set_intel     dx9_device_set,

				    cl_uint                     num_entries,

				    cl_device_id*               devices,

				    cl_uint*                    num_devices) CL_EXT_SUFFIX__VERSION_1_1;

				typedef CL_API_ENTRY cl_int (CL_API_CALL* clGetDeviceIDsFromDX9INTEL_fn)(

				    cl_platform_id              platform,

				    cl_dx9_device_source_intel  dx9_device_source,

				    void*                       dx9_object,

				    cl_dx9_device_set_intel     dx9_device_set,

				    cl_uint                     num_entries,

				    cl_device_id*               devices,

				    cl_uint*                    num_devices) CL_EXT_SUFFIX__VERSION_1_1;

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateFromDX9MediaSurfaceINTEL(

				    cl_context                  context,

				    cl_mem_flags                flags,

				    IDirect3DSurface9*          resource,

				    HANDLE                      sharedHandle,

				    UINT                        plane,

				    cl_int*                     errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromDX9MediaSurfaceINTEL_fn)(

				    cl_context                  context,

				    cl_mem_flags                flags,

				    IDirect3DSurface9*          resource,

				    HANDLE                      sharedHandle,

				    UINT                        plane,

				    cl_int*                     errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueAcquireDX9ObjectsINTEL(

				    cl_command_queue            command_queue,

				    cl_uint                     num_objects,

				    const cl_mem*               mem_objects,

				    cl_uint                     num_events_in_wait_list,

				    const cl_event*             event_wait_list,

				    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireDX9ObjectsINTEL_fn)(

				    cl_command_queue            command_queue,

				    cl_uint                     num_objects,

				    const cl_mem*               mem_objects,

				    cl_uint                     num_events_in_wait_list,

				    const cl_event*             event_wait_list,

				    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReleaseDX9ObjectsINTEL(

				    cl_command_queue            command_queue,

				    cl_uint                     num_objects,

				    cl_mem*                     mem_objects,

				    cl_uint                     num_events_in_wait_list,

				    const cl_event*             event_wait_list,

				    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseDX9ObjectsINTEL_fn)(

				    cl_command_queue            command_queue,

				    cl_uint                     num_objects,

				    cl_mem*                     mem_objects,

				    cl_uint                     num_events_in_wait_list,

				    const cl_event*             event_wait_list,

				    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;

				#ifdef __cplusplus

				}

				#endif

				#endif  /* __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H */

									
										101

include/CL/cl_egl.h
									
												
				@@ -1,5 +1,5 @@

				/*******************************************************************************

				 * Copyright (c) 2008-2010 The Khronos Group Inc.

				 * Copyright (c) 2008-2019 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				@@ -24,13 +29,7 @@

				#ifndef __OPENCL_CL_EGL_H

				#define __OPENCL_CL_EGL_H

				#ifdef __APPLE__

				#else

				#include <CL/cl.h>

				#include <EGL/egl.h>

				#include <EGL/eglext.h>

				#endif  

				#ifdef __cplusplus

				extern "C" {

				@@ -62,69 +61,69 @@ typedef intptr_t cl_egl_image_properties_khr;

				#define cl_khr_egl_image 1

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateFromEGLImageKHR(cl_context                  /* context */,

				                        CLeglDisplayKHR             /* egldisplay */,

				                        CLeglImageKHR               /* eglimage */,

				                        cl_mem_flags                /* flags */,

				                        const cl_egl_image_properties_khr * /* properties */,

				                        cl_int *                    /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				clCreateFromEGLImageKHR(cl_context                  context,

				                        CLeglDisplayKHR             egldisplay,

				                        CLeglImageKHR               eglimage,

				                        cl_mem_flags                flags,

				                        const cl_egl_image_properties_khr * properties,

				                        cl_int *                    errcode_ret) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromEGLImageKHR_fn)(

					cl_context                  context,

					CLeglDisplayKHR             egldisplay,

					CLeglImageKHR               eglimage,

					cl_mem_flags                flags,

					const cl_egl_image_properties_khr * properties,

					cl_int *                    errcode_ret);

				    cl_context                  context,

				    CLeglDisplayKHR             egldisplay,

				    CLeglImageKHR               eglimage,

				    cl_mem_flags                flags,

				    const cl_egl_image_properties_khr * properties,

				    cl_int *                    errcode_ret);

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueAcquireEGLObjectsKHR(cl_command_queue /* command_queue */,

				                              cl_uint          /* num_objects */,

				                              const cl_mem *   /* mem_objects */,

				                              cl_uint          /* num_events_in_wait_list */,

				                              const cl_event * /* event_wait_list */,

				                              cl_event *       /* event */) CL_API_SUFFIX__VERSION_1_0;

				clEnqueueAcquireEGLObjectsKHR(cl_command_queue command_queue,

				                              cl_uint          num_objects,

				                              const cl_mem *   mem_objects,

				                              cl_uint          num_events_in_wait_list,

				                              const cl_event * event_wait_list,

				                              cl_event *       event) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireEGLObjectsKHR_fn)(

					cl_command_queue command_queue,

					cl_uint          num_objects,

					const cl_mem *   mem_objects,

					cl_uint          num_events_in_wait_list,

					const cl_event * event_wait_list,

					cl_event *       event);

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event);

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReleaseEGLObjectsKHR(cl_command_queue /* command_queue */,

				                              cl_uint          /* num_objects */,

				                              const cl_mem *   /* mem_objects */,

				                              cl_uint          /* num_events_in_wait_list */,

				                              const cl_event * /* event_wait_list */,

				                              cl_event *       /* event */) CL_API_SUFFIX__VERSION_1_0;

				clEnqueueReleaseEGLObjectsKHR(cl_command_queue command_queue,

				                              cl_uint          num_objects,

				                              const cl_mem *   mem_objects,

				                              cl_uint          num_events_in_wait_list,

				                              const cl_event * event_wait_list,

				                              cl_event *       event) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseEGLObjectsKHR_fn)(

					cl_command_queue command_queue,

					cl_uint          num_objects,

					const cl_mem *   mem_objects,

					cl_uint          num_events_in_wait_list,

					const cl_event * event_wait_list,

					cl_event *       event);

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event);

				#define cl_khr_egl_event 1

				extern CL_API_ENTRY cl_event CL_API_CALL

				clCreateEventFromEGLSyncKHR(cl_context      /* context */,

				                            CLeglSyncKHR    /* sync */,

				                            CLeglDisplayKHR /* display */,

				                            cl_int *        /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				clCreateEventFromEGLSyncKHR(cl_context      context,

				                            CLeglSyncKHR    sync,

				                            CLeglDisplayKHR display,

				                            cl_int *        errcode_ret) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_event (CL_API_CALL *clCreateEventFromEGLSyncKHR_fn)(

					cl_context      context,

					CLeglSyncKHR    sync,

					CLeglDisplayKHR display,

					cl_int *        errcode_ret);

				    cl_context      context,

				    CLeglSyncKHR    sync,

				    CLeglDisplayKHR display,

				    cl_int *        errcode_ret);

				#ifdef __cplusplus

				}

									
										664

include/CL/cl_ext.h
									
												
				@@ -1,5 +1,5 @@

				/*******************************************************************************

				 * Copyright (c) 2008-2013 The Khronos Group Inc.

				 * Copyright (c) 2008-2019 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				@@ -21,8 +26,6 @@

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 ******************************************************************************/

				/* $Revision: 11928 $ on $Date: 2010-07-13 09:04:56 -0700 (Tue, 13 Jul 2010) $ */

				/* cl_ext.h contains OpenCL extensions which don't have external */

				/* (OpenGL, D3D) dependencies.                                   */

				@@ -33,11 +36,13 @@

				extern "C" {

				#endif

				#ifdef __APPLE__

				        #include <OpenCL/cl.h>

				    #include <AvailabilityMacros.h>

				#else

				        #include <CL/cl.h>

				#include <CL/cl.h>

				/* cl_khr_fp64 extension - no extension #define since it has no functions  */

				/* CL_DEVICE_DOUBLE_FP_CONFIG is defined in CL.h for OpenCL >= 120 */

				#if CL_TARGET_OPENCL_VERSION <= 110

				#define CL_DEVICE_DOUBLE_FP_CONFIG                       0x1032

				#endif

				/* cl_khr_fp16 extension - no extension #define since it has no functions  */

				@@ -47,12 +52,12 @@ extern "C" {

				 *

				 * Apple extension for use to manage externally allocated buffers used with cl_mem objects with CL_MEM_USE_HOST_PTR

				 *

				 * Registers a user callback function that will be called when the memory object is deleted and its resources 

				 * freed. Each call to clSetMemObjectCallbackFn registers the specified user callback function on a callback 

				 * stack associated with memobj. The registered user callback functions are called in the reverse order in 

				 * which they were registered. The user callback functions are called and then the memory object is deleted 

				 * and its resources freed. This provides a mechanism for the application (and libraries) using memobj to be 

				 * notified when the memory referenced by host_ptr, specified when the memory object is created and used as 

				 * Registers a user callback function that will be called when the memory object is deleted and its resources

				 * freed. Each call to clSetMemObjectCallbackFn registers the specified user callback function on a callback

				 * stack associated with memobj. The registered user callback functions are called in the reverse order in

				 * which they were registered. The user callback functions are called and then the memory object is deleted

				 * and its resources freed. This provides a mechanism for the application (and libraries) using memobj to be

				 * notified when the memory referenced by host_ptr, specified when the memory object is created and used as

				 * the storage bits for the memory object, can be reused or freed.

				 *

				 * The application may not call CL APIs with the cl_mem object passed to the pfn_notify.

				@@ -61,9 +66,9 @@ extern "C" {

				 * before using.

				 */

				#define cl_APPLE_SetMemObjectDestructor 1

				cl_int  CL_API_ENTRY clSetMemObjectDestructorAPPLE(  cl_mem /* memobj */, 

				                                        void (* /*pfn_notify*/)( cl_mem /* memobj */, void* /*user_data*/), 

				                                        void * /*user_data */ )             CL_EXT_SUFFIX__VERSION_1_0;  

				cl_int  CL_API_ENTRY clSetMemObjectDestructorAPPLE(  cl_mem memobj,

				                                        void (* pfn_notify)(cl_mem memobj, void * user_data),

				                                        void * user_data)             CL_EXT_SUFFIX__VERSION_1_0;

				/* Context Logging Functions

				@@ -72,29 +77,29 @@ cl_int  CL_API_ENTRY clSetMemObjectDestructorAPPLE(  cl_mem /* memobj */,

				 * Please check for the "cl_APPLE_ContextLoggingFunctions" extension using clGetDeviceInfo(CL_DEVICE_EXTENSIONS)

				 * before using.

				 *

				 * clLogMessagesToSystemLog fowards on all log messages to the Apple System Logger 

				 * clLogMessagesToSystemLog forwards on all log messages to the Apple System Logger

				 */

				#define cl_APPLE_ContextLoggingFunctions 1

				extern void CL_API_ENTRY clLogMessagesToSystemLogAPPLE(  const char * /* errstr */, 

				                                            const void * /* private_info */, 

				                                            size_t       /* cb */, 

				                                            void *       /* user_data */ )  CL_EXT_SUFFIX__VERSION_1_0;

				extern void CL_API_ENTRY clLogMessagesToSystemLogAPPLE(  const char * errstr,

				                                            const void * private_info,

				                                            size_t       cb,

				                                            void *       user_data)  CL_EXT_SUFFIX__VERSION_1_0;

				/* clLogMessagesToStdout sends all log messages to the file descriptor stdout */

				extern void CL_API_ENTRY clLogMessagesToStdoutAPPLE(   const char * /* errstr */, 

				                                          const void * /* private_info */, 

				                                          size_t       /* cb */, 

				                                          void *       /* user_data */ )    CL_EXT_SUFFIX__VERSION_1_0;

				extern void CL_API_ENTRY clLogMessagesToStdoutAPPLE(   const char * errstr,

				                                          const void * private_info,

				                                          size_t       cb,

				                                          void *       user_data)    CL_EXT_SUFFIX__VERSION_1_0;

				/* clLogMessagesToStderr sends all log messages to the file descriptor stderr */

				extern void CL_API_ENTRY clLogMessagesToStderrAPPLE(   const char * /* errstr */, 

				                                          const void * /* private_info */, 

				                                          size_t       /* cb */, 

				                                          void *       /* user_data */ )    CL_EXT_SUFFIX__VERSION_1_0;

				extern void CL_API_ENTRY clLogMessagesToStderrAPPLE(   const char * errstr,

				                                          const void * private_info,

				                                          size_t       cb,

				                                          void *       user_data)    CL_EXT_SUFFIX__VERSION_1_0;

				/************************ 

				* cl_khr_icd extension *                                                  

				/************************

				* cl_khr_icd extension *

				************************/

				#define cl_khr_icd 1

				@@ -105,16 +110,43 @@ extern void CL_API_ENTRY clLogMessagesToStderrAPPLE(   const char * /* errstr */

				#define CL_PLATFORM_NOT_FOUND_KHR                   -1001

				extern CL_API_ENTRY cl_int CL_API_CALL

				clIcdGetPlatformIDsKHR(cl_uint          /* num_entries */,

				                       cl_platform_id * /* platforms */,

				                       cl_uint *        /* num_platforms */);

				clIcdGetPlatformIDsKHR(cl_uint          num_entries,

				                       cl_platform_id * platforms,

				                       cl_uint *        num_platforms);

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(

				    cl_uint          /* num_entries */,

				    cl_platform_id * /* platforms */,

				    cl_uint *        /* num_platforms */);

				typedef CL_API_ENTRY cl_int

				(CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(cl_uint          num_entries,

				                                         cl_platform_id * platforms,

				                                         cl_uint *        num_platforms);

				/*******************************

				 * cl_khr_il_program extension *

				 *******************************/

				#define cl_khr_il_program 1

				/* New property to clGetDeviceInfo for retrieving supported intermediate

				 * languages

				 */

				#define CL_DEVICE_IL_VERSION_KHR                    0x105B

				/* New property to clGetProgramInfo for retrieving the IL of a

				 * program

				 */

				#define CL_PROGRAM_IL_KHR                           0x1169

				extern CL_API_ENTRY cl_program CL_API_CALL

				clCreateProgramWithILKHR(cl_context   context,

				                         const void * il,

				                         size_t       length,

				                         cl_int *     errcode_ret);

				typedef CL_API_ENTRY cl_program

				(CL_API_CALL *clCreateProgramWithILKHR_fn)(cl_context   context,

				                                           const void * il,

				                                           size_t       length,

				                                           cl_int *     errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;

				/* Extension: cl_khr_image2D_buffer

				 *

				 * This extension allows a 2D image to be created from a cl_mem buffer without a copy.

				@@ -129,31 +161,33 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(

				 * The pitch specified must be a multiple of CL_DEVICE_IMAGE_PITCH_ALIGNMENT pixels.

				 * The base address of the buffer must be aligned to CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT pixels.

				 */

				/*************************************

				 * cl_khr_initalize_memory extension *

				 *************************************/

				#define CL_CONTEXT_MEMORY_INITIALIZE_KHR            0x200E

				/**************************************

				 * cl_khr_initialize_memory extension *

				 **************************************/

				#define CL_CONTEXT_MEMORY_INITIALIZE_KHR            0x2030

				/**************************************

				 * cl_khr_terminate_context extension *

				 **************************************/

				#define CL_DEVICE_TERMINATE_CAPABILITY_KHR          0x200F

				#define CL_CONTEXT_TERMINATE_KHR                    0x2010

				#define CL_DEVICE_TERMINATE_CAPABILITY_KHR          0x2031

				#define CL_CONTEXT_TERMINATE_KHR                    0x2032

				#define cl_khr_terminate_context 1

				extern CL_API_ENTRY cl_int CL_API_CALL clTerminateContextKHR(cl_context /* context */) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clTerminateContextKHR(cl_context context) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int

				(CL_API_CALL *clTerminateContextKHR_fn)(cl_context context) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clTerminateContextKHR_fn)(cl_context /* context */) CL_EXT_SUFFIX__VERSION_1_2;

				/*

				 * Extension: cl_khr_spir

				 *

				 * This extension adds support to create an OpenCL program object from a 

				 * This extension adds support to create an OpenCL program object from a

				 * Standard Portable Intermediate Representation (SPIR) instance

				 */

				@@ -161,9 +195,30 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clTerminateContextKHR_fn)(cl_context /

				#define CL_PROGRAM_BINARY_TYPE_INTERMEDIATE         0x40E1

				/*****************************************

				 * cl_khr_create_command_queue extension *

				 *****************************************/

				#define cl_khr_create_command_queue 1

				typedef cl_bitfield cl_queue_properties_khr;

				extern CL_API_ENTRY cl_command_queue CL_API_CALL

				clCreateCommandQueueWithPropertiesKHR(cl_context context,

				                                      cl_device_id device,

				                                      const cl_queue_properties_khr* properties,

				                                      cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_command_queue

				(CL_API_CALL *clCreateCommandQueueWithPropertiesKHR_fn)(cl_context context,

				                                                        cl_device_id device,

				                                                        const cl_queue_properties_khr* properties,

				                                                        cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;

				/******************************************

				* cl_nv_device_attribute_query extension *

				******************************************/

				/* cl_nv_device_attribute_query extension - no extension #define since it has no functions */

				#define CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV       0x4000

				#define CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV       0x4001

				@@ -173,88 +228,124 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clTerminateContextKHR_fn)(cl_context /

				#define CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV            0x4005

				#define CL_DEVICE_INTEGRATED_MEMORY_NV              0x4006

				/*********************************

				* cl_amd_device_attribute_query *

				*********************************/

				#define CL_DEVICE_PROFILING_TIMER_OFFSET_AMD        0x4036

				/*********************************

				* cl_arm_printf extension

				*********************************/

				#define CL_PRINTF_CALLBACK_ARM                      0x40B0

				#define CL_PRINTF_BUFFERSIZE_ARM                    0x40B1

				#ifdef CL_VERSION_1_1

				   /***********************************

				    * cl_ext_device_fission extension *

				    ***********************************/

				    #define cl_ext_device_fission   1

				    extern CL_API_ENTRY cl_int CL_API_CALL

				    clReleaseDeviceEXT( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1; 

				    typedef CL_API_ENTRY cl_int 

				    (CL_API_CALL *clReleaseDeviceEXT_fn)( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1;

				    extern CL_API_ENTRY cl_int CL_API_CALL

				    clRetainDeviceEXT( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1; 

				    typedef CL_API_ENTRY cl_int 

				    (CL_API_CALL *clRetainDeviceEXT_fn)( cl_device_id /*device*/ ) CL_EXT_SUFFIX__VERSION_1_1;

				/***********************************

				* cl_ext_device_fission extension

				***********************************/

				#define cl_ext_device_fission   1

				    typedef cl_ulong  cl_device_partition_property_ext;

				    extern CL_API_ENTRY cl_int CL_API_CALL

				    clCreateSubDevicesEXT(  cl_device_id /*in_device*/,

				                            const cl_device_partition_property_ext * /* properties */,

				                            cl_uint /*num_entries*/,

				                            cl_device_id * /*out_devices*/,

				                            cl_uint * /*num_devices*/ ) CL_EXT_SUFFIX__VERSION_1_1;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clReleaseDeviceEXT(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;

				    typedef CL_API_ENTRY cl_int 

				    ( CL_API_CALL * clCreateSubDevicesEXT_fn)(  cl_device_id /*in_device*/,

				                                                const cl_device_partition_property_ext * /* properties */,

				                                                cl_uint /*num_entries*/,

				                                                cl_device_id * /*out_devices*/,

				                                                cl_uint * /*num_devices*/ ) CL_EXT_SUFFIX__VERSION_1_1;

				typedef CL_API_ENTRY cl_int

				(CL_API_CALL *clReleaseDeviceEXT_fn)(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clRetainDeviceEXT(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;

				typedef CL_API_ENTRY cl_int

				(CL_API_CALL *clRetainDeviceEXT_fn)(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;

				typedef cl_ulong  cl_device_partition_property_ext;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clCreateSubDevicesEXT(cl_device_id   in_device,

				                      const cl_device_partition_property_ext * properties,

				                      cl_uint        num_entries,

				                      cl_device_id * out_devices,

				                      cl_uint *      num_devices) CL_EXT_SUFFIX__VERSION_1_1;

				typedef CL_API_ENTRY cl_int

				(CL_API_CALL * clCreateSubDevicesEXT_fn)(cl_device_id   in_device,

				                                         const cl_device_partition_property_ext * properties,

				                                         cl_uint        num_entries,

				                                         cl_device_id * out_devices,

				                                         cl_uint *      num_devices) CL_EXT_SUFFIX__VERSION_1_1;

				/* cl_device_partition_property_ext */

				#define CL_DEVICE_PARTITION_EQUALLY_EXT             0x4050

				#define CL_DEVICE_PARTITION_BY_COUNTS_EXT           0x4051

				#define CL_DEVICE_PARTITION_BY_NAMES_EXT            0x4052

				#define CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN_EXT  0x4053

				/* clDeviceGetInfo selectors */

				#define CL_DEVICE_PARENT_DEVICE_EXT                 0x4054

				#define CL_DEVICE_PARTITION_TYPES_EXT               0x4055

				#define CL_DEVICE_AFFINITY_DOMAINS_EXT              0x4056

				#define CL_DEVICE_REFERENCE_COUNT_EXT               0x4057

				#define CL_DEVICE_PARTITION_STYLE_EXT               0x4058

				/* error codes */

				#define CL_DEVICE_PARTITION_FAILED_EXT              -1057

				#define CL_INVALID_PARTITION_COUNT_EXT              -1058

				#define CL_INVALID_PARTITION_NAME_EXT               -1059

				/* CL_AFFINITY_DOMAINs */

				#define CL_AFFINITY_DOMAIN_L1_CACHE_EXT             0x1

				#define CL_AFFINITY_DOMAIN_L2_CACHE_EXT             0x2

				#define CL_AFFINITY_DOMAIN_L3_CACHE_EXT             0x3

				#define CL_AFFINITY_DOMAIN_L4_CACHE_EXT             0x4

				#define CL_AFFINITY_DOMAIN_NUMA_EXT                 0x10

				#define CL_AFFINITY_DOMAIN_NEXT_FISSIONABLE_EXT     0x100

				/* cl_device_partition_property_ext list terminators */

				#define CL_PROPERTIES_LIST_END_EXT                  ((cl_device_partition_property_ext) 0)

				#define CL_PARTITION_BY_COUNTS_LIST_END_EXT         ((cl_device_partition_property_ext) 0)

				#define CL_PARTITION_BY_NAMES_LIST_END_EXT          ((cl_device_partition_property_ext) 0 - 1)

				/***********************************

				 * cl_ext_migrate_memobject extension definitions

				 ***********************************/

				#define cl_ext_migrate_memobject 1

				typedef cl_bitfield cl_mem_migration_flags_ext;

				#define CL_MIGRATE_MEM_OBJECT_HOST_EXT              0x1

				#define CL_COMMAND_MIGRATE_MEM_OBJECT_EXT           0x4040

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueMigrateMemObjectEXT(cl_command_queue command_queue,

				                             cl_uint          num_mem_objects,

				                             const cl_mem *   mem_objects,

				                             cl_mem_migration_flags_ext flags,

				                             cl_uint          num_events_in_wait_list,

				                             const cl_event * event_wait_list,

				                             cl_event *       event);

				typedef CL_API_ENTRY cl_int

				(CL_API_CALL *clEnqueueMigrateMemObjectEXT_fn)(cl_command_queue command_queue,

				                                               cl_uint          num_mem_objects,

				                                               const cl_mem *   mem_objects,

				                                               cl_mem_migration_flags_ext flags,

				                                               cl_uint          num_events_in_wait_list,

				                                               const cl_event * event_wait_list,

				                                               cl_event *       event);

				    /* cl_device_partition_property_ext */

				    #define CL_DEVICE_PARTITION_EQUALLY_EXT             0x4050

				    #define CL_DEVICE_PARTITION_BY_COUNTS_EXT           0x4051

				    #define CL_DEVICE_PARTITION_BY_NAMES_EXT            0x4052

				    #define CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN_EXT  0x4053

				    /* clDeviceGetInfo selectors */

				    #define CL_DEVICE_PARENT_DEVICE_EXT                 0x4054

				    #define CL_DEVICE_PARTITION_TYPES_EXT               0x4055

				    #define CL_DEVICE_AFFINITY_DOMAINS_EXT              0x4056

				    #define CL_DEVICE_REFERENCE_COUNT_EXT               0x4057

				    #define CL_DEVICE_PARTITION_STYLE_EXT               0x4058

				    /* error codes */

				    #define CL_DEVICE_PARTITION_FAILED_EXT              -1057

				    #define CL_INVALID_PARTITION_COUNT_EXT              -1058

				    #define CL_INVALID_PARTITION_NAME_EXT               -1059

				    /* CL_AFFINITY_DOMAINs */

				    #define CL_AFFINITY_DOMAIN_L1_CACHE_EXT             0x1

				    #define CL_AFFINITY_DOMAIN_L2_CACHE_EXT             0x2

				    #define CL_AFFINITY_DOMAIN_L3_CACHE_EXT             0x3

				    #define CL_AFFINITY_DOMAIN_L4_CACHE_EXT             0x4

				    #define CL_AFFINITY_DOMAIN_NUMA_EXT                 0x10

				    #define CL_AFFINITY_DOMAIN_NEXT_FISSIONABLE_EXT     0x100

				    /* cl_device_partition_property_ext list terminators */

				    #define CL_PROPERTIES_LIST_END_EXT                  ((cl_device_partition_property_ext) 0)

				    #define CL_PARTITION_BY_COUNTS_LIST_END_EXT         ((cl_device_partition_property_ext) 0)

				    #define CL_PARTITION_BY_NAMES_LIST_END_EXT          ((cl_device_partition_property_ext) 0 - 1)

				/*********************************

				* cl_qcom_ext_host_ptr extension

				*********************************/

				#define cl_qcom_ext_host_ptr 1

				#define CL_MEM_EXT_HOST_PTR_QCOM                  (1 << 29)

				#define CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM   0x40A0      

				#define CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM   0x40A0

				#define CL_DEVICE_PAGE_SIZE_QCOM                  0x40A1

				#define CL_IMAGE_ROW_ALIGNMENT_QCOM               0x40A2

				#define CL_IMAGE_SLICE_ALIGNMENT_QCOM             0x40A3

				@@ -280,12 +371,21 @@ typedef struct _cl_mem_ext_host_ptr

				    /* Type of external memory allocation. */

				    /* Legal values will be defined in layered extensions. */

				    cl_uint  allocation_type;

					/* Host cache policy for this external memory allocation. */

				    /* Host cache policy for this external memory allocation. */

				    cl_uint  host_cache_policy;

				} cl_mem_ext_host_ptr;

				/*******************************************

				* cl_qcom_ext_host_ptr_iocoherent extension

				********************************************/

				/* Cache policy specifying io-coherence */

				#define CL_MEM_HOST_IOCOHERENT_QCOM               0x40A9

				/*********************************

				* cl_qcom_ion_host_ptr extension

				*********************************/

				@@ -300,13 +400,339 @@ typedef struct _cl_mem_ion_host_ptr

				    /* ION file descriptor */

				    int                  ion_filedesc;

				    /* Host pointer to the ION allocated memory */

				    void*                ion_hostptr;

				} cl_mem_ion_host_ptr;

				#endif /* CL_VERSION_1_1 */

				/*********************************

				* cl_qcom_android_native_buffer_host_ptr extension

				*********************************/

				#define CL_MEM_ANDROID_NATIVE_BUFFER_HOST_PTR_QCOM                  0x40C6

				typedef struct _cl_mem_android_native_buffer_host_ptr

				{

				    /* Type of external memory allocation. */

				    /* Must be CL_MEM_ANDROID_NATIVE_BUFFER_HOST_PTR_QCOM for Android native buffers. */

				    cl_mem_ext_host_ptr  ext_host_ptr;

				    /* Virtual pointer to the android native buffer */

				    void*                anb_ptr;

				} cl_mem_android_native_buffer_host_ptr;

				/******************************************

				 * cl_img_yuv_image extension *

				 ******************************************/

				/* Image formats used in clCreateImage */

				#define CL_NV21_IMG                                 0x40D0

				#define CL_YV12_IMG                                 0x40D1

				/******************************************

				 * cl_img_cached_allocations extension *

				 ******************************************/

				/* Flag values used by clCreateBuffer */

				#define CL_MEM_USE_UNCACHED_CPU_MEMORY_IMG          (1 << 26)

				#define CL_MEM_USE_CACHED_CPU_MEMORY_IMG            (1 << 27)

				/******************************************

				 * cl_img_use_gralloc_ptr extension *

				 ******************************************/

				#define cl_img_use_gralloc_ptr 1

				/* Flag values used by clCreateBuffer */

				#define CL_MEM_USE_GRALLOC_PTR_IMG                  (1 << 28)

				/* To be used by clGetEventInfo: */

				#define CL_COMMAND_ACQUIRE_GRALLOC_OBJECTS_IMG      0x40D2

				#define CL_COMMAND_RELEASE_GRALLOC_OBJECTS_IMG      0x40D3

				/* Error code from clEnqueueReleaseGrallocObjectsIMG */

				#define CL_GRALLOC_RESOURCE_NOT_ACQUIRED_IMG        0x40D4

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueAcquireGrallocObjectsIMG(cl_command_queue      command_queue,

				                                  cl_uint               num_objects,

				                                  const cl_mem *        mem_objects,

				                                  cl_uint               num_events_in_wait_list,

				                                  const cl_event *      event_wait_list,

				                                  cl_event *            event) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReleaseGrallocObjectsIMG(cl_command_queue      command_queue,

				                                  cl_uint               num_objects,

				                                  const cl_mem *        mem_objects,

				                                  cl_uint               num_events_in_wait_list,

				                                  const cl_event *      event_wait_list,

				                                  cl_event *            event) CL_EXT_SUFFIX__VERSION_1_2;

				/*********************************

				* cl_khr_subgroups extension

				*********************************/

				#define cl_khr_subgroups 1

				#if !defined(CL_VERSION_2_1)

				/* For OpenCL 2.1 and newer, cl_kernel_sub_group_info is declared in CL.h.

				   In hindsight, there should have been a khr suffix on this type for

				   the extension, but keeping it un-suffixed to maintain backwards

				   compatibility. */

				typedef cl_uint             cl_kernel_sub_group_info;

				#endif

				/* cl_kernel_sub_group_info */

				#define CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE_KHR    0x2033

				#define CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE_KHR       0x2034

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetKernelSubGroupInfoKHR(cl_kernel    in_kernel,

				                           cl_device_id in_device,

				                           cl_kernel_sub_group_info param_name,

				                           size_t       input_value_size,

				                           const void * input_value,

				                           size_t       param_value_size,

				                           void *       param_value,

				                           size_t *     param_value_size_ret) CL_EXT_SUFFIX__VERSION_2_0_DEPRECATED;

				typedef CL_API_ENTRY cl_int

				(CL_API_CALL * clGetKernelSubGroupInfoKHR_fn)(cl_kernel    in_kernel,

				                                              cl_device_id in_device,

				                                              cl_kernel_sub_group_info param_name,

				                                              size_t       input_value_size,

				                                              const void * input_value,

				                                              size_t       param_value_size,

				                                              void *       param_value,

				                                              size_t *     param_value_size_ret) CL_EXT_SUFFIX__VERSION_2_0_DEPRECATED;

				/*********************************

				* cl_khr_mipmap_image extension

				*********************************/

				/* cl_sampler_properties */

				#define CL_SAMPLER_MIP_FILTER_MODE_KHR              0x1155

				#define CL_SAMPLER_LOD_MIN_KHR                      0x1156

				#define CL_SAMPLER_LOD_MAX_KHR                      0x1157

				/*********************************

				* cl_khr_priority_hints extension

				*********************************/

				/* This extension define is for backwards compatibility.

				   It shouldn't be required since this extension has no new functions. */

				#define cl_khr_priority_hints 1

				typedef cl_uint  cl_queue_priority_khr;

				/* cl_command_queue_properties */

				#define CL_QUEUE_PRIORITY_KHR 0x1096

				/* cl_queue_priority_khr */

				#define CL_QUEUE_PRIORITY_HIGH_KHR (1<<0)

				#define CL_QUEUE_PRIORITY_MED_KHR (1<<1)

				#define CL_QUEUE_PRIORITY_LOW_KHR (1<<2)

				/*********************************

				* cl_khr_throttle_hints extension

				*********************************/

				/* This extension define is for backwards compatibility.

				   It shouldn't be required since this extension has no new functions. */

				#define cl_khr_throttle_hints 1

				typedef cl_uint  cl_queue_throttle_khr;

				/* cl_command_queue_properties */

				#define CL_QUEUE_THROTTLE_KHR 0x1097

				/* cl_queue_throttle_khr */

				#define CL_QUEUE_THROTTLE_HIGH_KHR (1<<0)

				#define CL_QUEUE_THROTTLE_MED_KHR (1<<1)

				#define CL_QUEUE_THROTTLE_LOW_KHR (1<<2)

				/*********************************

				* cl_khr_subgroup_named_barrier

				*********************************/

				/* This extension define is for backwards compatibility.

				   It shouldn't be required since this extension has no new functions. */

				#define cl_khr_subgroup_named_barrier 1

				/* cl_device_info */

				#define CL_DEVICE_MAX_NAMED_BARRIER_COUNT_KHR       0x2035

				/**********************************

				 * cl_arm_import_memory extension *

				 **********************************/

				#define cl_arm_import_memory 1

				typedef intptr_t cl_import_properties_arm;

				/* Default and valid property names for cl_arm_import_memory */

				#define CL_IMPORT_TYPE_ARM                        0x40B2

				/* Host process memory type default value for CL_IMPORT_TYPE_ARM property */

				#define CL_IMPORT_TYPE_HOST_ARM                   0x40B3

				/* DMA BUF memory type value for CL_IMPORT_TYPE_ARM property */

				#define CL_IMPORT_TYPE_DMA_BUF_ARM                0x40B4

				/* Protected DMA BUF memory type value for CL_IMPORT_TYPE_ARM property */

				#define CL_IMPORT_TYPE_PROTECTED_ARM              0x40B5

				/* This extension adds a new function that allows for direct memory import into

				 * OpenCL via the clImportMemoryARM function.

				 *

				 * Memory imported through this interface will be mapped into the device's page

				 * tables directly, providing zero copy access. It will never fall back to copy

				 * operations and aliased buffers.

				 *

				 * Types of memory supported for import are specified as additional extension

				 * strings.

				 *

				 * This extension produces cl_mem allocations which are compatible with all other

				 * users of cl_mem in the standard API.

				 *

				 * This extension maps pages with the same properties as the normal buffer creation

				 * function clCreateBuffer.

				 */

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clImportMemoryARM( cl_context context,

				                   cl_mem_flags flags,

				                   const cl_import_properties_arm *properties,

				                   void *memory,

				                   size_t size,

				                   cl_int *errcode_ret) CL_EXT_SUFFIX__VERSION_1_0;

				/******************************************

				 * cl_arm_shared_virtual_memory extension *

				 ******************************************/

				#define cl_arm_shared_virtual_memory 1

				/* Used by clGetDeviceInfo */

				#define CL_DEVICE_SVM_CAPABILITIES_ARM                  0x40B6

				/* Used by clGetMemObjectInfo */

				#define CL_MEM_USES_SVM_POINTER_ARM                     0x40B7

				/* Used by clSetKernelExecInfoARM: */

				#define CL_KERNEL_EXEC_INFO_SVM_PTRS_ARM                0x40B8

				#define CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM_ARM   0x40B9

				/* To be used by clGetEventInfo: */

				#define CL_COMMAND_SVM_FREE_ARM                         0x40BA

				#define CL_COMMAND_SVM_MEMCPY_ARM                       0x40BB

				#define CL_COMMAND_SVM_MEMFILL_ARM                      0x40BC

				#define CL_COMMAND_SVM_MAP_ARM                          0x40BD

				#define CL_COMMAND_SVM_UNMAP_ARM                        0x40BE

				/* Flag values returned by clGetDeviceInfo with CL_DEVICE_SVM_CAPABILITIES_ARM as the param_name. */

				#define CL_DEVICE_SVM_COARSE_GRAIN_BUFFER_ARM           (1 << 0)

				#define CL_DEVICE_SVM_FINE_GRAIN_BUFFER_ARM             (1 << 1)

				#define CL_DEVICE_SVM_FINE_GRAIN_SYSTEM_ARM             (1 << 2)

				#define CL_DEVICE_SVM_ATOMICS_ARM                       (1 << 3)

				/* Flag values used by clSVMAllocARM: */

				#define CL_MEM_SVM_FINE_GRAIN_BUFFER_ARM                (1 << 10)

				#define CL_MEM_SVM_ATOMICS_ARM                          (1 << 11)

				typedef cl_bitfield cl_svm_mem_flags_arm;

				typedef cl_uint     cl_kernel_exec_info_arm;

				typedef cl_bitfield cl_device_svm_capabilities_arm;

				extern CL_API_ENTRY void * CL_API_CALL

				clSVMAllocARM(cl_context       context,

				              cl_svm_mem_flags_arm flags,

				              size_t           size,

				              cl_uint          alignment) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY void CL_API_CALL

				clSVMFreeARM(cl_context        context,

				             void *            svm_pointer) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueSVMFreeARM(cl_command_queue  command_queue,

				                    cl_uint           num_svm_pointers,

				                    void *            svm_pointers[],

				                    void (CL_CALLBACK * pfn_free_func)(cl_command_queue queue,

				                                                       cl_uint          num_svm_pointers,

				                                                       void *           svm_pointers[],

				                                                       void *           user_data),

				                    void *            user_data,

				                    cl_uint           num_events_in_wait_list,

				                    const cl_event *  event_wait_list,

				                    cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueSVMMemcpyARM(cl_command_queue  command_queue,

				                      cl_bool           blocking_copy,

				                      void *            dst_ptr,

				                      const void *      src_ptr,

				                      size_t            size,

				                      cl_uint           num_events_in_wait_list,

				                      const cl_event *  event_wait_list,

				                      cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueSVMMemFillARM(cl_command_queue  command_queue,

				                       void *            svm_ptr,

				                       const void *      pattern,

				                       size_t            pattern_size,

				                       size_t            size,

				                       cl_uint           num_events_in_wait_list,

				                       const cl_event *  event_wait_list,

				                       cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueSVMMapARM(cl_command_queue  command_queue,

				                   cl_bool           blocking_map,

				                   cl_map_flags      flags,

				                   void *            svm_ptr,

				                   size_t            size,

				                   cl_uint           num_events_in_wait_list,

				                   const cl_event *  event_wait_list,

				                   cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueSVMUnmapARM(cl_command_queue  command_queue,

				                     void *            svm_ptr,

				                     cl_uint           num_events_in_wait_list,

				                     const cl_event *  event_wait_list,

				                     cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clSetKernelArgSVMPointerARM(cl_kernel    kernel,

				                            cl_uint      arg_index,

				                            const void * arg_value) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clSetKernelExecInfoARM(cl_kernel            kernel,

				                       cl_kernel_exec_info_arm  param_name,

				                       size_t               param_value_size,

				                       const void *         param_value) CL_EXT_SUFFIX__VERSION_1_2;

				/********************************

				 * cl_arm_get_core_id extension *

				 ********************************/

				#ifdef CL_VERSION_1_2

				#define cl_arm_get_core_id 1

				/* Device info property for bitfield of cores present */

				#define CL_DEVICE_COMPUTE_UNITS_BITFIELD_ARM      0x40BF

				#endif  /* CL_VERSION_1_2 */

				#ifdef __cplusplus

				}
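
A note on the `CL_DEVICE_SVM_*_ARM` values in the hunk above: they are single-bit flags, so a capability word returned for `CL_DEVICE_SVM_CAPABILITIES_ARM` can be tested with a plain mask comparison. The following is a minimal sketch, not part of the commit. It redefines the constants locally so it builds without the OpenCL headers, and both `svm_caps_t` (a stand-in for `cl_device_svm_capabilities_arm`) and `svm_caps_has` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the cl_ext.h hunk above. */
#define CL_DEVICE_SVM_COARSE_GRAIN_BUFFER_ARM (1 << 0)
#define CL_DEVICE_SVM_FINE_GRAIN_BUFFER_ARM   (1 << 1)
#define CL_DEVICE_SVM_FINE_GRAIN_SYSTEM_ARM   (1 << 2)
#define CL_DEVICE_SVM_ATOMICS_ARM             (1 << 3)

/* Assumption: stand-in for cl_device_svm_capabilities_arm, which is a
 * cl_bitfield (a 64-bit unsigned integer in the Khronos headers). */
typedef uint64_t svm_caps_t;

/* Hypothetical helper: true only if every bit in `required` is set in `caps`. */
static bool svm_caps_has(svm_caps_t caps, svm_caps_t required)
{
    return (caps & required) == required;
}
```

In real code the `caps` argument would come from `clGetDeviceInfo` with `CL_DEVICE_SVM_CAPABILITIES_ARM` as the `param_name`, as the header comment describes.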

									
										423

include/CL/cl_ext_intel.h
									
										Normal file
									
				@@ -0,0 +1,423 @@

				/*******************************************************************************

				 * Copyright (c) 2008-2019 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 ******************************************************************************/

				/*****************************************************************************\

				Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.

				THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS

				"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT

				LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR

				A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS

				CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,

				EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,

				PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR

				PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY

				OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING

				NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE

				MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

				File Name: cl_ext_intel.h

				Abstract:

				Notes:

				\*****************************************************************************/

				#ifndef __CL_EXT_INTEL_H

				#define __CL_EXT_INTEL_H

				#include <CL/cl.h>

				#include <CL/cl_platform.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/***************************************

				* cl_intel_thread_local_exec extension *

				****************************************/

				#define cl_intel_thread_local_exec 1

				#define CL_QUEUE_THREAD_LOCAL_EXEC_ENABLE_INTEL      (((cl_bitfield)1) << 31)

				/***********************************************

				* cl_intel_device_partition_by_names extension *

				************************************************/

				#define cl_intel_device_partition_by_names 1

				#define CL_DEVICE_PARTITION_BY_NAMES_INTEL          0x4052

				#define CL_PARTITION_BY_NAMES_LIST_END_INTEL        -1

				/************************************************

				* cl_intel_accelerator extension                *

				* cl_intel_motion_estimation extension          *

				* cl_intel_advanced_motion_estimation extension *

				*************************************************/

				#define cl_intel_accelerator 1

				#define cl_intel_motion_estimation 1

				#define cl_intel_advanced_motion_estimation 1

				typedef struct _cl_accelerator_intel* cl_accelerator_intel;

				typedef cl_uint cl_accelerator_type_intel;

				typedef cl_uint cl_accelerator_info_intel;

				typedef struct _cl_motion_estimation_desc_intel {

				    cl_uint mb_block_type;

				    cl_uint subpixel_mode;

				    cl_uint sad_adjust_mode;

				    cl_uint search_path_type;

				} cl_motion_estimation_desc_intel;

				/* error codes */

				#define CL_INVALID_ACCELERATOR_INTEL                              -1094

				#define CL_INVALID_ACCELERATOR_TYPE_INTEL                         -1095

				#define CL_INVALID_ACCELERATOR_DESCRIPTOR_INTEL                   -1096

				#define CL_ACCELERATOR_TYPE_NOT_SUPPORTED_INTEL                   -1097

				/* cl_accelerator_type_intel */

				#define CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL               0x0

				/* cl_accelerator_info_intel */

				#define CL_ACCELERATOR_DESCRIPTOR_INTEL                           0x4090

				#define CL_ACCELERATOR_REFERENCE_COUNT_INTEL                      0x4091

				#define CL_ACCELERATOR_CONTEXT_INTEL                              0x4092

				#define CL_ACCELERATOR_TYPE_INTEL                                 0x4093

				/* cl_motion_detect_desc_intel flags */

				#define CL_ME_MB_TYPE_16x16_INTEL                                 0x0

				#define CL_ME_MB_TYPE_8x8_INTEL                                   0x1

				#define CL_ME_MB_TYPE_4x4_INTEL                                   0x2

				#define CL_ME_SUBPIXEL_MODE_INTEGER_INTEL                         0x0

				#define CL_ME_SUBPIXEL_MODE_HPEL_INTEL                            0x1

				#define CL_ME_SUBPIXEL_MODE_QPEL_INTEL                            0x2

				#define CL_ME_SAD_ADJUST_MODE_NONE_INTEL                          0x0

				#define CL_ME_SAD_ADJUST_MODE_HAAR_INTEL                          0x1

				#define CL_ME_SEARCH_PATH_RADIUS_2_2_INTEL                        0x0

				#define CL_ME_SEARCH_PATH_RADIUS_4_4_INTEL                        0x1

				#define CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL                      0x5

				#define CL_ME_SKIP_BLOCK_TYPE_16x16_INTEL                         0x0

				#define CL_ME_CHROMA_INTRA_PREDICT_ENABLED_INTEL                  0x1

				#define CL_ME_LUMA_INTRA_PREDICT_ENABLED_INTEL                    0x2

				#define CL_ME_SKIP_BLOCK_TYPE_8x8_INTEL                           0x4

				#define CL_ME_FORWARD_INPUT_MODE_INTEL                            0x1

				#define CL_ME_BACKWARD_INPUT_MODE_INTEL                           0x2

				#define CL_ME_BIDIRECTION_INPUT_MODE_INTEL                        0x3

				#define CL_ME_BIDIR_WEIGHT_QUARTER_INTEL                          16

				#define CL_ME_BIDIR_WEIGHT_THIRD_INTEL                            21

				#define CL_ME_BIDIR_WEIGHT_HALF_INTEL                             32

				#define CL_ME_BIDIR_WEIGHT_TWO_THIRD_INTEL                        43

				#define CL_ME_BIDIR_WEIGHT_THREE_QUARTER_INTEL                    48

				#define CL_ME_COST_PENALTY_NONE_INTEL                             0x0

				#define CL_ME_COST_PENALTY_LOW_INTEL                              0x1

				#define CL_ME_COST_PENALTY_NORMAL_INTEL                           0x2

				#define CL_ME_COST_PENALTY_HIGH_INTEL                             0x3

				#define CL_ME_COST_PRECISION_QPEL_INTEL                           0x0

				#define CL_ME_COST_PRECISION_HPEL_INTEL                           0x1

				#define CL_ME_COST_PRECISION_PEL_INTEL                            0x2

				#define CL_ME_COST_PRECISION_DPEL_INTEL                           0x3

				#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_INTEL                  0x0

				#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_INTEL                0x1

				#define CL_ME_LUMA_PREDICTOR_MODE_DC_INTEL                        0x2

				#define CL_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_LEFT_INTEL        0x3

				#define CL_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_RIGHT_INTEL       0x4

				#define CL_ME_LUMA_PREDICTOR_MODE_PLANE_INTEL                     0x4

				#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_RIGHT_INTEL            0x5

				#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_DOWN_INTEL           0x6

				#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_LEFT_INTEL             0x7

				#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_UP_INTEL             0x8

				#define CL_ME_CHROMA_PREDICTOR_MODE_DC_INTEL                      0x0

				#define CL_ME_CHROMA_PREDICTOR_MODE_HORIZONTAL_INTEL              0x1

				#define CL_ME_CHROMA_PREDICTOR_MODE_VERTICAL_INTEL                0x2

				#define CL_ME_CHROMA_PREDICTOR_MODE_PLANE_INTEL                   0x3

				/* cl_device_info */

				#define CL_DEVICE_ME_VERSION_INTEL                                0x407E

				#define CL_ME_VERSION_LEGACY_INTEL                                0x0

				#define CL_ME_VERSION_ADVANCED_VER_1_INTEL                        0x1

				#define CL_ME_VERSION_ADVANCED_VER_2_INTEL                        0x2

				extern CL_API_ENTRY cl_accelerator_intel CL_API_CALL

				clCreateAcceleratorINTEL(

				    cl_context                   context,

				    cl_accelerator_type_intel    accelerator_type,

				    size_t                       descriptor_size,

				    const void*                  descriptor,

				    cl_int*                      errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_accelerator_intel (CL_API_CALL *clCreateAcceleratorINTEL_fn)(

				    cl_context                   context,

				    cl_accelerator_type_intel    accelerator_type,

				    size_t                       descriptor_size,

				    const void*                  descriptor,

				    cl_int*                      errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetAcceleratorInfoINTEL(

				    cl_accelerator_intel         accelerator,

				    cl_accelerator_info_intel    param_name,

				    size_t                       param_value_size,

				    void*                        param_value,

				    size_t*                      param_value_size_ret) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetAcceleratorInfoINTEL_fn)(

				    cl_accelerator_intel         accelerator,

				    cl_accelerator_info_intel    param_name,

				    size_t                       param_value_size,

				    void*                        param_value,

				    size_t*                      param_value_size_ret) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clRetainAcceleratorINTEL(

				    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clRetainAcceleratorINTEL_fn)(

				    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clReleaseAcceleratorINTEL(

				    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clReleaseAcceleratorINTEL_fn)(

				    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;

				/******************************************

				* cl_intel_simultaneous_sharing extension *

				*******************************************/

				#define cl_intel_simultaneous_sharing 1

				#define CL_DEVICE_SIMULTANEOUS_INTEROPS_INTEL            0x4104

				#define CL_DEVICE_NUM_SIMULTANEOUS_INTEROPS_INTEL        0x4105

				/***********************************

				* cl_intel_egl_image_yuv extension *

				************************************/

				#define cl_intel_egl_image_yuv 1

				#define CL_EGL_YUV_PLANE_INTEL                           0x4107

				/********************************

				* cl_intel_packed_yuv extension *

				*********************************/

				#define cl_intel_packed_yuv 1

				#define CL_YUYV_INTEL                                    0x4076

				#define CL_UYVY_INTEL                                    0x4077

				#define CL_YVYU_INTEL                                    0x4078

				#define CL_VYUY_INTEL                                    0x4079

				/********************************************

				* cl_intel_required_subgroup_size extension *

				*********************************************/

				#define cl_intel_required_subgroup_size 1

				#define CL_DEVICE_SUB_GROUP_SIZES_INTEL                  0x4108

				#define CL_KERNEL_SPILL_MEM_SIZE_INTEL                   0x4109

				#define CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL           0x410A

				/****************************************

				* cl_intel_driver_diagnostics extension *

				*****************************************/

				#define cl_intel_driver_diagnostics 1

				typedef cl_uint cl_diagnostics_verbose_level;

				#define CL_CONTEXT_SHOW_DIAGNOSTICS_INTEL                0x4106

				#define CL_CONTEXT_DIAGNOSTICS_LEVEL_ALL_INTEL           ( 0xff )

				#define CL_CONTEXT_DIAGNOSTICS_LEVEL_GOOD_INTEL          ( 1 )

				#define CL_CONTEXT_DIAGNOSTICS_LEVEL_BAD_INTEL           ( 1 << 1 )

				#define CL_CONTEXT_DIAGNOSTICS_LEVEL_NEUTRAL_INTEL       ( 1 << 2 )

				/********************************

				* cl_intel_planar_yuv extension *

				*********************************/

				#define CL_NV12_INTEL                                       0x410E

				#define CL_MEM_NO_ACCESS_INTEL                              ( 1 << 24 )

				#define CL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTEL              ( 1 << 25 )

				#define CL_DEVICE_PLANAR_YUV_MAX_WIDTH_INTEL                0x417E

				#define CL_DEVICE_PLANAR_YUV_MAX_HEIGHT_INTEL               0x417F

				/*******************************************************

				* cl_intel_device_side_avc_motion_estimation extension *

				********************************************************/

				#define CL_DEVICE_AVC_ME_VERSION_INTEL                      0x410B

				#define CL_DEVICE_AVC_ME_SUPPORTS_TEXTURE_SAMPLER_USE_INTEL 0x410C

				#define CL_DEVICE_AVC_ME_SUPPORTS_PREEMPTION_INTEL          0x410D

				#define CL_AVC_ME_VERSION_0_INTEL                           0x0  // No support.

				#define CL_AVC_ME_VERSION_1_INTEL                           0x1  // First supported version.

				#define CL_AVC_ME_MAJOR_16x16_INTEL                         0x0

				#define CL_AVC_ME_MAJOR_16x8_INTEL                          0x1

				#define CL_AVC_ME_MAJOR_8x16_INTEL                          0x2

				#define CL_AVC_ME_MAJOR_8x8_INTEL                           0x3

				#define CL_AVC_ME_MINOR_8x8_INTEL                           0x0

				#define CL_AVC_ME_MINOR_8x4_INTEL                           0x1

				#define CL_AVC_ME_MINOR_4x8_INTEL                           0x2

				#define CL_AVC_ME_MINOR_4x4_INTEL                           0x3

				#define CL_AVC_ME_MAJOR_FORWARD_INTEL                       0x0

				#define CL_AVC_ME_MAJOR_BACKWARD_INTEL                      0x1

				#define CL_AVC_ME_MAJOR_BIDIRECTIONAL_INTEL                 0x2

				#define CL_AVC_ME_PARTITION_MASK_ALL_INTEL                  0x0

				#define CL_AVC_ME_PARTITION_MASK_16x16_INTEL                0x7E

				#define CL_AVC_ME_PARTITION_MASK_16x8_INTEL                 0x7D

				#define CL_AVC_ME_PARTITION_MASK_8x16_INTEL                 0x7B

				#define CL_AVC_ME_PARTITION_MASK_8x8_INTEL                  0x77

				#define CL_AVC_ME_PARTITION_MASK_8x4_INTEL                  0x6F

				#define CL_AVC_ME_PARTITION_MASK_4x8_INTEL                  0x5F

				#define CL_AVC_ME_PARTITION_MASK_4x4_INTEL                  0x3F

				#define CL_AVC_ME_SEARCH_WINDOW_EXHAUSTIVE_INTEL            0x0

				#define CL_AVC_ME_SEARCH_WINDOW_SMALL_INTEL                 0x1

				#define CL_AVC_ME_SEARCH_WINDOW_TINY_INTEL                  0x2

				#define CL_AVC_ME_SEARCH_WINDOW_EXTRA_TINY_INTEL            0x3

				#define CL_AVC_ME_SEARCH_WINDOW_DIAMOND_INTEL               0x4

				#define CL_AVC_ME_SEARCH_WINDOW_LARGE_DIAMOND_INTEL         0x5

				#define CL_AVC_ME_SEARCH_WINDOW_RESERVED0_INTEL             0x6

				#define CL_AVC_ME_SEARCH_WINDOW_RESERVED1_INTEL             0x7

				#define CL_AVC_ME_SEARCH_WINDOW_CUSTOM_INTEL                0x8

				#define CL_AVC_ME_SEARCH_WINDOW_16x12_RADIUS_INTEL          0x9

				#define CL_AVC_ME_SEARCH_WINDOW_4x4_RADIUS_INTEL            0x2

				#define CL_AVC_ME_SEARCH_WINDOW_2x2_RADIUS_INTEL            0xa

				#define CL_AVC_ME_SAD_ADJUST_MODE_NONE_INTEL                0x0

				#define CL_AVC_ME_SAD_ADJUST_MODE_HAAR_INTEL                0x2

				#define CL_AVC_ME_SUBPIXEL_MODE_INTEGER_INTEL               0x0

				#define CL_AVC_ME_SUBPIXEL_MODE_HPEL_INTEL                  0x1

				#define CL_AVC_ME_SUBPIXEL_MODE_QPEL_INTEL                  0x3

				#define CL_AVC_ME_COST_PRECISION_QPEL_INTEL                 0x0

				#define CL_AVC_ME_COST_PRECISION_HPEL_INTEL                 0x1

				#define CL_AVC_ME_COST_PRECISION_PEL_INTEL                  0x2

				#define CL_AVC_ME_COST_PRECISION_DPEL_INTEL                 0x3

				#define CL_AVC_ME_BIDIR_WEIGHT_QUARTER_INTEL                0x10

				#define CL_AVC_ME_BIDIR_WEIGHT_THIRD_INTEL                  0x15

				#define CL_AVC_ME_BIDIR_WEIGHT_HALF_INTEL                   0x20

				#define CL_AVC_ME_BIDIR_WEIGHT_TWO_THIRD_INTEL              0x2B

				#define CL_AVC_ME_BIDIR_WEIGHT_THREE_QUARTER_INTEL          0x30

				#define CL_AVC_ME_BORDER_REACHED_LEFT_INTEL                 0x0

				#define CL_AVC_ME_BORDER_REACHED_RIGHT_INTEL                0x2

				#define CL_AVC_ME_BORDER_REACHED_TOP_INTEL                  0x4

				#define CL_AVC_ME_BORDER_REACHED_BOTTOM_INTEL               0x8

				#define CL_AVC_ME_SKIP_BLOCK_PARTITION_16x16_INTEL          0x0

				#define CL_AVC_ME_SKIP_BLOCK_PARTITION_8x8_INTEL            0x4000

				#define CL_AVC_ME_SKIP_BLOCK_16x16_FORWARD_ENABLE_INTEL     ( 0x1 << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_16x16_BACKWARD_ENABLE_INTEL    ( 0x2 << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_16x16_DUAL_ENABLE_INTEL        ( 0x3 << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_FORWARD_ENABLE_INTEL       ( 0x55 << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_BACKWARD_ENABLE_INTEL      ( 0xAA << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_DUAL_ENABLE_INTEL          ( 0xFF << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_0_FORWARD_ENABLE_INTEL     ( 0x1 << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_0_BACKWARD_ENABLE_INTEL    ( 0x2 << 24 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_1_FORWARD_ENABLE_INTEL     ( 0x1 << 26 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_1_BACKWARD_ENABLE_INTEL    ( 0x2 << 26 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_2_FORWARD_ENABLE_INTEL     ( 0x1 << 28 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_2_BACKWARD_ENABLE_INTEL    ( 0x2 << 28 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_3_FORWARD_ENABLE_INTEL     ( 0x1 << 30 )

				#define CL_AVC_ME_SKIP_BLOCK_8x8_3_BACKWARD_ENABLE_INTEL    ( 0x2 << 30 )

				#define CL_AVC_ME_BLOCK_BASED_SKIP_4x4_INTEL                0x00

				#define CL_AVC_ME_BLOCK_BASED_SKIP_8x8_INTEL                0x80

				#define CL_AVC_ME_INTRA_16x16_INTEL                         0x0

				#define CL_AVC_ME_INTRA_8x8_INTEL                           0x1

				#define CL_AVC_ME_INTRA_4x4_INTEL                           0x2

				#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_16x16_INTEL     0x6

				#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_8x8_INTEL       0x5

				#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_4x4_INTEL       0x3

				#define CL_AVC_ME_INTRA_NEIGHBOR_LEFT_MASK_ENABLE_INTEL         0x60

				#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_MASK_ENABLE_INTEL        0x10

				#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_RIGHT_MASK_ENABLE_INTEL  0x8

				#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_LEFT_MASK_ENABLE_INTEL   0x4

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_INTEL            0x0

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_INTEL          0x1

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DC_INTEL                  0x2

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_LEFT_INTEL  0x3

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_RIGHT_INTEL 0x4

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_PLANE_INTEL               0x4

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_RIGHT_INTEL      0x5

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_DOWN_INTEL     0x6

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_LEFT_INTEL       0x7

				#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_UP_INTEL       0x8

				#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_DC_INTEL                0x0

				#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_HORIZONTAL_INTEL        0x1

				#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_VERTICAL_INTEL          0x2

				#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_PLANE_INTEL             0x3

				#define CL_AVC_ME_FRAME_FORWARD_INTEL                       0x1

				#define CL_AVC_ME_FRAME_BACKWARD_INTEL                      0x2

				#define CL_AVC_ME_FRAME_DUAL_INTEL                          0x3

				#define CL_AVC_ME_SLICE_TYPE_PRED_INTEL                     0x0

				#define CL_AVC_ME_SLICE_TYPE_BPRED_INTEL                    0x1

				#define CL_AVC_ME_SLICE_TYPE_INTRA_INTEL                    0x2

				#define CL_AVC_ME_INTERLACED_SCAN_TOP_FIELD_INTEL           0x0

				#define CL_AVC_ME_INTERLACED_SCAN_BOTTOM_FIELD_INTEL        0x1

				#ifdef __cplusplus

				}

				#endif

				#endif /* __CL_EXT_INTEL_H */
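
One pattern in the `CL_AVC_ME_PARTITION_MASK_*_INTEL` values above is worth spelling out: each shape-specific constant clears exactly one bit of `0x7F`, and `CL_AVC_ME_PARTITION_MASK_ALL_INTEL` is `0x0`. The sketch below is not part of the commit. It assumes, which the header itself does not state, that a cleared bit means the corresponding partition shape is enabled, so that AND-ing two masks enables both shapes. The helper names are hypothetical and the constants are redefined locally so the block builds without the OpenCL headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the cl_ext_intel.h hunk above. */
#define CL_AVC_ME_PARTITION_MASK_ALL_INTEL   0x0
#define CL_AVC_ME_PARTITION_MASK_16x16_INTEL 0x7E
#define CL_AVC_ME_PARTITION_MASK_8x8_INTEL   0x77
#define CL_AVC_ME_PARTITION_MASK_4x4_INTEL   0x3F

/* Hypothetical helper: combine two masks so that the shapes enabled by
 * either one stay enabled (assumption: cleared bit == shape enabled). */
static uint8_t partition_mask_combine(uint8_t a, uint8_t b)
{
    return (uint8_t)(a & b);
}

/* Hypothetical helper: true if exactly one of the low seven bits is clear,
 * i.e. the mask matches the pattern of the shape-specific constants. */
static bool partition_mask_is_single_shape(uint8_t mask)
{
    uint8_t cleared = (uint8_t)(~mask & 0x7F);
    return cleared != 0 && (cleared & (cleared - 1)) == 0;
}
```

Under that assumption, combining the 16x16 and 8x8 masks gives `0x76`, a value that no longer passes the single-shape check.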

									
										145

include/CL/cl_gl.h
									
				@@ -1,5 +1,5 @@

				/**********************************************************************************

				 * Copyright (c) 2008 - 2012 The Khronos Group Inc.

				 * Copyright (c) 2008-2019 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				@@ -24,11 +29,7 @@

				#ifndef __OPENCL_CL_GL_H

				#define __OPENCL_CL_GL_H

				#ifdef __APPLE__

				#include <OpenCL/cl.h>

				#else

				#include <CL/cl.h>

				#endif	

				#ifdef __cplusplus

				extern "C" {

				@@ -44,110 +45,118 @@ typedef struct __GLsync *cl_GLsync;

				#define CL_GL_OBJECT_TEXTURE2D                  0x2001

				#define CL_GL_OBJECT_TEXTURE3D                  0x2002

				#define CL_GL_OBJECT_RENDERBUFFER               0x2003

				#ifdef CL_VERSION_1_2

				#define CL_GL_OBJECT_TEXTURE2D_ARRAY            0x200E

				#define CL_GL_OBJECT_TEXTURE1D                  0x200F

				#define CL_GL_OBJECT_TEXTURE1D_ARRAY            0x2010

				#define CL_GL_OBJECT_TEXTURE_BUFFER             0x2011

				#endif

				/* cl_gl_texture_info           */

				#define CL_GL_TEXTURE_TARGET                    0x2004

				#define CL_GL_MIPMAP_LEVEL                      0x2005

				#ifdef CL_VERSION_1_2

				#define CL_GL_NUM_SAMPLES                       0x2012

				#endif

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateFromGLBuffer(cl_context     /* context */,

				                     cl_mem_flags   /* flags */,

				                     cl_GLuint      /* bufobj */,

				                     int *          /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				clCreateFromGLBuffer(cl_context     context,

				                     cl_mem_flags   flags,

				                     cl_GLuint      bufobj,

				                     cl_int *       errcode_ret) CL_API_SUFFIX__VERSION_1_0;

				#ifdef CL_VERSION_1_2

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateFromGLTexture(cl_context      /* context */,

				                      cl_mem_flags    /* flags */,

				                      cl_GLenum       /* target */,

				                      cl_GLint        /* miplevel */,

				                      cl_GLuint       /* texture */,

				                      cl_int *        /* errcode_ret */) CL_API_SUFFIX__VERSION_1_2;

				clCreateFromGLTexture(cl_context      context,

				                      cl_mem_flags    flags,

				                      cl_GLenum       target,

				                      cl_GLint        miplevel,

				                      cl_GLuint       texture,

				                      cl_int *        errcode_ret) CL_API_SUFFIX__VERSION_1_2;

				#endif

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateFromGLRenderbuffer(cl_context   /* context */,

				                           cl_mem_flags /* flags */,

				                           cl_GLuint    /* renderbuffer */,

				                           cl_int *     /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				clCreateFromGLRenderbuffer(cl_context   context,

				                           cl_mem_flags flags,

				                           cl_GLuint    renderbuffer,

				                           cl_int *     errcode_ret) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetGLObjectInfo(cl_mem                /* memobj */,

				                  cl_gl_object_type *   /* gl_object_type */,

				                  cl_GLuint *           /* gl_object_name */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetGLTextureInfo(cl_mem               /* memobj */,

				                   cl_gl_texture_info   /* param_name */,

				                   size_t               /* param_value_size */,

				                   void *               /* param_value */,

				                   size_t *             /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;

				clGetGLObjectInfo(cl_mem                memobj,

				                  cl_gl_object_type *   gl_object_type,

				                  cl_GLuint *           gl_object_name) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueAcquireGLObjects(cl_command_queue      /* command_queue */,

				                          cl_uint               /* num_objects */,

				                          const cl_mem *        /* mem_objects */,

				                          cl_uint               /* num_events_in_wait_list */,

				                          const cl_event *      /* event_wait_list */,

				                          cl_event *            /* event */) CL_API_SUFFIX__VERSION_1_0;

				clGetGLTextureInfo(cl_mem               memobj,

				                   cl_gl_texture_info   param_name,

				                   size_t               param_value_size,

				                   void *               param_value,

				                   size_t *             param_value_size_ret) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReleaseGLObjects(cl_command_queue      /* command_queue */,

				                          cl_uint               /* num_objects */,

				                          const cl_mem *        /* mem_objects */,

				                          cl_uint               /* num_events_in_wait_list */,

				                          const cl_event *      /* event_wait_list */,

				                          cl_event *            /* event */) CL_API_SUFFIX__VERSION_1_0;

				clEnqueueAcquireGLObjects(cl_command_queue      command_queue,

				                          cl_uint               num_objects,

				                          const cl_mem *        mem_objects,

				                          cl_uint               num_events_in_wait_list,

				                          const cl_event *      event_wait_list,

				                          cl_event *            event) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReleaseGLObjects(cl_command_queue      command_queue,

				                          cl_uint               num_objects,

				                          const cl_mem *        mem_objects,

				                          cl_uint               num_events_in_wait_list,

				                          const cl_event *      event_wait_list,

				                          cl_event *            event) CL_API_SUFFIX__VERSION_1_0;

				/* Deprecated OpenCL 1.1 APIs */

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL

				clCreateFromGLTexture2D(cl_context      /* context */,

				                        cl_mem_flags    /* flags */,

				                        cl_GLenum       /* target */,

				                        cl_GLint        /* miplevel */,

				                        cl_GLuint       /* texture */,

				                        cl_int *        /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				clCreateFromGLTexture2D(cl_context      context,

				                        cl_mem_flags    flags,

				                        cl_GLenum       target,

				                        cl_GLint        miplevel,

				                        cl_GLuint       texture,

				                        cl_int *        errcode_ret) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL

				clCreateFromGLTexture3D(cl_context      /* context */,

				                        cl_mem_flags    /* flags */,

				                        cl_GLenum       /* target */,

				                        cl_GLint        /* miplevel */,

				                        cl_GLuint       /* texture */,

				                        cl_int *        /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				clCreateFromGLTexture3D(cl_context      context,

				                        cl_mem_flags    flags,

				                        cl_GLenum       target,

				                        cl_GLint        miplevel,

				                        cl_GLuint       texture,

				                        cl_int *        errcode_ret) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				/* cl_khr_gl_sharing extension  */

				#define cl_khr_gl_sharing 1

				typedef cl_uint     cl_gl_context_info;

				/* Additional Error Codes  */

				#define CL_INVALID_GL_SHAREGROUP_REFERENCE_KHR  -1000

				/* cl_gl_context_info  */

				#define CL_CURRENT_DEVICE_FOR_GL_CONTEXT_KHR    0x2006

				#define CL_DEVICES_FOR_GL_CONTEXT_KHR           0x2007

				/* Additional cl_context_properties  */

				#define CL_GL_CONTEXT_KHR                       0x2008

				#define CL_EGL_DISPLAY_KHR                      0x2009

				#define CL_GLX_DISPLAY_KHR                      0x200A

				#define CL_WGL_HDC_KHR                          0x200B

				#define CL_CGL_SHAREGROUP_KHR                   0x200C

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetGLContextInfoKHR(const cl_context_properties * /* properties */,

				                      cl_gl_context_info            /* param_name */,

				                      size_t                        /* param_value_size */,

				                      void *                        /* param_value */,

				                      size_t *                      /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;

				clGetGLContextInfoKHR(const cl_context_properties * properties,

				                      cl_gl_context_info            param_name,

				                      size_t                        param_value_size,

				                      void *                        param_value,

				                      size_t *                      param_value_size_ret) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetGLContextInfoKHR_fn)(

				    const cl_context_properties * properties,

				    cl_gl_context_info            param_name,

									
										39

include/CL/cl_gl_ext.h
									
												
				@@ -1,5 +1,5 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 * Copyright (c) 2008-2019 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				@@ -21,11 +26,6 @@

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 **********************************************************************************/

				/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */

				/* cl_gl_ext.h contains vendor (non-KHR) OpenCL extensions which have           */

				/* OpenGL dependencies.                                                         */

				#ifndef __OPENCL_CL_GL_EXT_H

				#define __OPENCL_CL_GL_EXT_H

				@@ -33,34 +33,17 @@

				extern "C" {

				#endif

				#ifdef __APPLE__

				    #include <OpenCL/cl_gl.h>

				#else

				    #include <CL/cl_gl.h>

				#endif

				/*

				 * For each extension, follow this template

				 *  cl_VEN_extname extension  */

				/* #define cl_VEN_extname 1

				 * ... define new types, if any

				 * ... define new tokens, if any

				 * ... define new APIs, if any

				 *

				 *  If you need GLtypes here, mirror them with a cl_GLtype, rather than including a GL header

				 *  This allows us to avoid having to decide whether to include GL headers or GLES here.

				 */

				#include <CL/cl_gl.h>

				/* 

				 *  cl_khr_gl_event  extension

				 *  See section 9.9 in the OpenCL 1.1 spec for more information

				 *  cl_khr_gl_event extension

				 */

				#define CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR     0x200D

				extern CL_API_ENTRY cl_event CL_API_CALL

				clCreateEventFromGLsyncKHR(cl_context           /* context */,

				                           cl_GLsync            /* cl_GLsync */,

				                           cl_int *             /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1;

				clCreateEventFromGLsyncKHR(cl_context context,

				                           cl_GLsync  cl_GLsync,

				                           cl_int *   errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;

				#ifdef __cplusplus

				}

611

include/CL/cl_platform.h


File diff suppressed because it is too large

									
										172

include/CL/cl_va_api_media_sharing_intel.h
									
										Normal file
									
												
				@@ -0,0 +1,172 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2019 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 **********************************************************************************/

				/*****************************************************************************\

				Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.

				THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS

				"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT

				LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR

				A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS

				CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,

				EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,

				PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR

				PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY

				OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING

				NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE

				MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

				File Name: cl_va_api_media_sharing_intel.h

				Abstract:

				Notes:

				\*****************************************************************************/

				#ifndef __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H

				#define __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H

				#include <CL/cl.h>

				#include <CL/cl_platform.h>

				#include <va/va.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/******************************************

				* cl_intel_va_api_media_sharing extension *

				*******************************************/

				#define cl_intel_va_api_media_sharing 1

				/* error codes */

				#define CL_INVALID_VA_API_MEDIA_ADAPTER_INTEL               -1098

				#define CL_INVALID_VA_API_MEDIA_SURFACE_INTEL               -1099

				#define CL_VA_API_MEDIA_SURFACE_ALREADY_ACQUIRED_INTEL      -1100

				#define CL_VA_API_MEDIA_SURFACE_NOT_ACQUIRED_INTEL          -1101

				/* cl_va_api_device_source_intel */

				#define CL_VA_API_DISPLAY_INTEL                             0x4094

				/* cl_va_api_device_set_intel */

				#define CL_PREFERRED_DEVICES_FOR_VA_API_INTEL               0x4095

				#define CL_ALL_DEVICES_FOR_VA_API_INTEL                     0x4096

				/* cl_context_info */

				#define CL_CONTEXT_VA_API_DISPLAY_INTEL                     0x4097

				/* cl_mem_info */

				#define CL_MEM_VA_API_MEDIA_SURFACE_INTEL                   0x4098

				/* cl_image_info */

				#define CL_IMAGE_VA_API_PLANE_INTEL                         0x4099

				/* cl_command_type */

				#define CL_COMMAND_ACQUIRE_VA_API_MEDIA_SURFACES_INTEL      0x409A

				#define CL_COMMAND_RELEASE_VA_API_MEDIA_SURFACES_INTEL      0x409B

				typedef cl_uint cl_va_api_device_source_intel;

				typedef cl_uint cl_va_api_device_set_intel;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetDeviceIDsFromVA_APIMediaAdapterINTEL(

				    cl_platform_id                platform,

				    cl_va_api_device_source_intel media_adapter_type,

				    void*                         media_adapter,

				    cl_va_api_device_set_intel    media_adapter_set,

				    cl_uint                       num_entries,

				    cl_device_id*                 devices,

				    cl_uint*                      num_devices) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL * clGetDeviceIDsFromVA_APIMediaAdapterINTEL_fn)(

				    cl_platform_id                platform,

				    cl_va_api_device_source_intel media_adapter_type,

				    void*                         media_adapter,

				    cl_va_api_device_set_intel    media_adapter_set,

				    cl_uint                       num_entries,

				    cl_device_id*                 devices,

				    cl_uint*                      num_devices) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateFromVA_APIMediaSurfaceINTEL(

				    cl_context                    context,

				    cl_mem_flags                  flags,

				    VASurfaceID*                  surface,

				    cl_uint                       plane,

				    cl_int*                       errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL * clCreateFromVA_APIMediaSurfaceINTEL_fn)(

				    cl_context                    context,

				    cl_mem_flags                  flags,

				    VASurfaceID*                  surface,

				    cl_uint                       plane,

				    cl_int*                       errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueAcquireVA_APIMediaSurfacesINTEL(

				    cl_command_queue              command_queue,

				    cl_uint                       num_objects,

				    const cl_mem*                 mem_objects,

				    cl_uint                       num_events_in_wait_list,

				    const cl_event*               event_wait_list,

				    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireVA_APIMediaSurfacesINTEL_fn)(

				    cl_command_queue              command_queue,

				    cl_uint                       num_objects,

				    const cl_mem*                 mem_objects,

				    cl_uint                       num_events_in_wait_list,

				    const cl_event*               event_wait_list,

				    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReleaseVA_APIMediaSurfacesINTEL(

				    cl_command_queue              command_queue,

				    cl_uint                       num_objects,

				    const cl_mem*                 mem_objects,

				    cl_uint                       num_events_in_wait_list,

				    const cl_event*               event_wait_list,

				    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseVA_APIMediaSurfacesINTEL_fn)(

				    cl_command_queue              command_queue,

				    cl_uint                       num_objects,

				    const cl_mem*                 mem_objects,

				    cl_uint                       num_events_in_wait_list,

				    const cl_event*               event_wait_list,

				    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;

				#ifdef __cplusplus

				}

				#endif

				#endif  /* __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H */

									
										86

include/CL/cl_version.h
									
										Normal file
									
												
				@@ -0,0 +1,86 @@

				/*******************************************************************************

				 * Copyright (c) 2018 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 ******************************************************************************/

				#ifndef __CL_VERSION_H

				#define __CL_VERSION_H

				/* Detect which version to target */

				#if !defined(CL_TARGET_OPENCL_VERSION)

				#pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)")

				#define CL_TARGET_OPENCL_VERSION 220

				#endif

				#if CL_TARGET_OPENCL_VERSION != 100 && \

				    CL_TARGET_OPENCL_VERSION != 110 && \

				    CL_TARGET_OPENCL_VERSION != 120 && \

				    CL_TARGET_OPENCL_VERSION != 200 && \

				    CL_TARGET_OPENCL_VERSION != 210 && \

				    CL_TARGET_OPENCL_VERSION != 220

				#pragma message("cl_version: CL_TARGET_OPENCL_VERSION is not a valid value (100, 110, 120, 200, 210, 220). Defaulting to 220 (OpenCL 2.2)")

				#undef CL_TARGET_OPENCL_VERSION

				#define CL_TARGET_OPENCL_VERSION 220

				#endif

				/* OpenCL Version */

				#if CL_TARGET_OPENCL_VERSION >= 220 && !defined(CL_VERSION_2_2)

				#define CL_VERSION_2_2  1

				#endif

				#if CL_TARGET_OPENCL_VERSION >= 210 && !defined(CL_VERSION_2_1)

				#define CL_VERSION_2_1  1

				#endif

				#if CL_TARGET_OPENCL_VERSION >= 200 && !defined(CL_VERSION_2_0)

				#define CL_VERSION_2_0  1

				#endif

				#if CL_TARGET_OPENCL_VERSION >= 120 && !defined(CL_VERSION_1_2)

				#define CL_VERSION_1_2  1

				#endif

				#if CL_TARGET_OPENCL_VERSION >= 110 && !defined(CL_VERSION_1_1)

				#define CL_VERSION_1_1  1

				#endif

				#if CL_TARGET_OPENCL_VERSION >= 100 && !defined(CL_VERSION_1_0)

				#define CL_VERSION_1_0  1

				#endif

				/* Allow deprecated APIs for older OpenCL versions. */

				#if CL_TARGET_OPENCL_VERSION <= 210 && !defined(CL_USE_DEPRECATED_OPENCL_2_1_APIS)

				#define CL_USE_DEPRECATED_OPENCL_2_1_APIS

				#endif

				#if CL_TARGET_OPENCL_VERSION <= 200 && !defined(CL_USE_DEPRECATED_OPENCL_2_0_APIS)

				#define CL_USE_DEPRECATED_OPENCL_2_0_APIS

				#endif

				#if CL_TARGET_OPENCL_VERSION <= 120 && !defined(CL_USE_DEPRECATED_OPENCL_1_2_APIS)

				#define CL_USE_DEPRECATED_OPENCL_1_2_APIS

				#endif

				#if CL_TARGET_OPENCL_VERSION <= 110 && !defined(CL_USE_DEPRECATED_OPENCL_1_1_APIS)

				#define CL_USE_DEPRECATED_OPENCL_1_1_APIS

				#endif

				#if CL_TARGET_OPENCL_VERSION <= 100 && !defined(CL_USE_DEPRECATED_OPENCL_1_0_APIS)

				#define CL_USE_DEPRECATED_OPENCL_1_0_APIS

				#endif

				#endif  /* __CL_VERSION_H */
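
The version selection in `cl_version.h` above can be exercised as in the following minimal sketch. It assumes the application wants the OpenCL 1.2 API surface; the helper `opencl_2_0_enabled` is a hypothetical name used only for illustration, and the `__has_include` guard is there only so the sketch still compiles where the OpenCL headers are not installed.

```c
/* Pick the target before any CL header is seen; otherwise cl_version.h
 * emits a #pragma message and defaults to 220 (OpenCL 2.2). */
#define CL_TARGET_OPENCL_VERSION 120

/* Include the header only when it is available (assumption: the compiler
 * supports __has_include, as recent GCC and Clang do). */
#if defined(__has_include)
#  if __has_include(<CL/cl_version.h>)
#    include <CL/cl_version.h>
#  endif
#endif

/* Hypothetical helper: returns 1 if the OpenCL 2.0 API was enabled.
 * With a target of 120, cl_version.h defines CL_VERSION_1_0 through
 * CL_VERSION_1_2 only, so this is expected to return 0. */
static int opencl_2_0_enabled(void)
{
#ifdef CL_VERSION_2_0
    return 1;
#else
    return 0;
#endif
}
```

Passing `-DCL_TARGET_OPENCL_VERSION=120` on the compiler command line should have the same effect as the `#define` and keeps the choice out of the source files.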

									
										19

include/CL/opencl.h
									
												
				@@ -1,5 +1,5 @@

				/*******************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 * Copyright (c) 2008-2015 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -12,6 +12,11 @@

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS

				 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS

				 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT

				 *    https://www.khronos.org/registry/

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				@@ -30,25 +35,13 @@

				extern "C" {

				#endif

				#ifdef __APPLE__

				#include <OpenCL/cl.h>

				#include <OpenCL/cl_gl.h>

				#include <OpenCL/cl_gl_ext.h>

				#include <OpenCL/cl_ext.h>

				#else

				#include <CL/cl.h>

				#include <CL/cl_gl.h>

				#include <CL/cl_gl_ext.h>

				#include <CL/cl_ext.h>

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif  /* __OPENCL_H   */

									
										3

include/D3D9/d3d9caps.h
									
												
				@@ -26,6 +26,9 @@

				#include "d3d9types.h"

				/* Caps flags */

				#define D3DCAPS_OVERLAY       0x00000800

				#define D3DCAPS_READ_SCANLINE 0x00020000

				#define D3DCAPS2_FULLSCREENGAMMA   0x00020000

				#define D3DCAPS2_CANCALIBRATEGAMMA 0x00100000

				#define D3DCAPS2_RESERVED          0x02000000

									
										9

include/GL/internal/dri_interface.h
									
												
				@@ -48,6 +48,8 @@ typedef unsigned int drm_drawable_t;

				typedef struct drm_clip_rect drm_clip_rect_t;

				#endif

				#include <GL/gl.h>

				#include <stdint.h>

				/**

				@@ -1290,6 +1292,7 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_FORMAT_XBGR2101010  0x1010

				#define __DRI_IMAGE_FORMAT_ABGR2101010  0x1011

				#define __DRI_IMAGE_FORMAT_SABGR8       0x1012

				#define __DRI_IMAGE_FORMAT_UYVY         0x1013

				#define __DRI_IMAGE_USE_SHARE		0x0001

				#define __DRI_IMAGE_USE_SCANOUT		0x0002

				@@ -1345,6 +1348,7 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_FOURCC_YUYV		0x56595559

				#define __DRI_IMAGE_FOURCC_UYVY		0x59565955

				#define __DRI_IMAGE_FOURCC_AYUV		0x56555941

				#define __DRI_IMAGE_FOURCC_XYUV8888	0x56555958

				#define __DRI_IMAGE_FOURCC_YVU410	0x39555659

				#define __DRI_IMAGE_FOURCC_YVU411	0x31315659

				@@ -1352,6 +1356,10 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_FOURCC_YVU422	0x36315659

				#define __DRI_IMAGE_FOURCC_YVU444	0x34325659

				#define __DRI_IMAGE_FOURCC_P010		0x30313050

				#define __DRI_IMAGE_FOURCC_P012		0x32313050

				#define __DRI_IMAGE_FOURCC_P016		0x36313050

				/**

				 * Queryable on images created by createImageFromNames.

				 *

				@@ -1372,6 +1380,7 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_COMPONENTS_Y_XUXV	0x3005

				#define __DRI_IMAGE_COMPONENTS_Y_UXVX	0x3008

				#define __DRI_IMAGE_COMPONENTS_AYUV	0x3009

				#define __DRI_IMAGE_COMPONENTS_XYUV	0x300A

				#define __DRI_IMAGE_COMPONENTS_R	0x3006

				#define __DRI_IMAGE_COMPONENTS_RG	0x3007

									
										2

include/c99_compat.h
									
												
				@@ -96,7 +96,7 @@

				 * - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html

				 */

				#ifndef restrict

				#  if (__STDC_VERSION__ >= 199901L)

				#  if (__STDC_VERSION__ >= 199901L) && !defined(__cplusplus)

				     /* C99 */

				#  elif defined(__GNUC__)

				#    define restrict __restrict__

									
										7

include/d3dadapter/drm.h
									
												
				@@ -29,11 +29,14 @@

				#define D3DADAPTER9DRM_NAME "drm"

				/* current version */

				#define D3DADAPTER9DRM_MAJOR 0

				#define D3DADAPTER9DRM_MINOR 1

				#define D3DADAPTER9DRM_MINOR 2

				/* version 0.0: Initial release

				 *         0.1: All IDirect3D objects can be assumed to have a pointer to the

				 *              internal vtable in second position of the structure */

				 *              internal vtable in second position of the structure

				 *         0.2: IDirect3DDevice9_SetCursorPosition always calls

				 *              ID3DPresent_SetCursorPos for hardware cursors

				 */

				struct D3DAdapter9DRM

				{

12

include/drm-uapi/README


@@ -1,6 +1,6 @@
 This directory contains a copy of the installed kernel headers
 required by the anv & i965 drivers to communicate with the kernel.
 Whenever either of those driver needs new definitions for new kernel
 required by several drivers to communicate with the kernel.
 Whenever one of those drivers needs new definitions for new kernel
 APIs, these files should be updated.
 These files in master should only be updated once the changes have landed
@@ -13,9 +13,9 @@ $ make headers_install INSTALL_HDR_PATH=/path/to/install
 The last update was done at the following kernel commit :
 commit 78230c46ec0a91dd4256c9e54934b3c7095a7ee3
 Merge: b65bd4031156 037f03155b7d
 commit a5f2fafece141ef3509e686cea576366d55cabb6
 Merge: 71f4e45a4ed3 860433ed2a55
 Author: Dave Airlie <airlied@redhat.com>
 Date:   Wed Mar 21 14:07:03 2018 +1000
 Date:   Wed Feb 20 12:16:30 2019 +1000
     Merge tag 'omapdrm-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-next
     Merge https://gitlab.freedesktop.org/drm/msm into drm-next

									
										16

include/drm-uapi/drm.h
									
												
				@@ -674,6 +674,22 @@ struct drm_get_cap {

				 */

				#define DRM_CLIENT_CAP_ATOMIC	3

				/**

				 * DRM_CLIENT_CAP_ASPECT_RATIO

				 *

				 * If set to 1, the DRM core will provide aspect ratio information in modes.

				 */

				#define DRM_CLIENT_CAP_ASPECT_RATIO    4

				/**

				 * DRM_CLIENT_CAP_WRITEBACK_CONNECTORS

				 *

				 * If set to 1, the DRM core will expose special connectors to be used for

				 * writing back to memory the scene setup in the commit. Depends on client

				 * also supporting DRM_CLIENT_CAP_ATOMIC

				 */

				#define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS	5

				/** DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */

				struct drm_set_client_cap {

					__u64 capability;

									
										211

include/drm-uapi/drm_fourcc.h
									
												
				@@ -30,11 +30,50 @@

				extern "C" {

				#endif

				/**

				 * DOC: overview

				 *

				 * In the DRM subsystem, framebuffer pixel formats are described using the

				 * fourcc codes defined in `include/uapi/drm/drm_fourcc.h`. In addition to the

				 * fourcc code, a Format Modifier may optionally be provided, in order to

				 * further describe the buffer's format - for example tiling or compression.

				 *

				 * Format Modifiers

				 * ----------------

				 *

				 * Format modifiers are used in conjunction with a fourcc code, forming a

				 * unique fourcc:modifier pair. This format:modifier pair must fully define the

				 * format and data layout of the buffer, and should be the only way to describe

				 * that particular buffer.

				 *

				 * Having multiple fourcc:modifier pairs which describe the same layout should

				 * be avoided, as such aliases run the risk of different drivers exposing

				 * different names for the same data format, forcing userspace to understand

				 * that they are aliases.

				 *

				 * Format modifiers may change any property of the buffer, including the number

				 * of planes and/or the required allocation size. Format modifiers are

				 * vendor-namespaced, and as such the relationship between a fourcc code and a

				 * modifier is specific to the modifier being used. For example, some modifiers

				 * may preserve meaning - such as number of planes - from the fourcc code,

				 * whereas others may not.

				 *

				 * Vendors should document their modifier usage in as much detail as

				 * possible, to ensure maximum compatibility across devices, drivers and

				 * applications.

				 *

				 * The authoritative list of format modifier codes is found in

				 * `include/uapi/drm/drm_fourcc.h`

				 */

				#define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \

								 ((__u32)(c) << 16) | ((__u32)(d) << 24))

				#define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of little endian */

				/* Reserve 0 for the invalid format specifier */

				#define DRM_FORMAT_INVALID	0

				/* color index */

				#define DRM_FORMAT_C8		fourcc_code('C', '8', ' ', ' ') /* [7:0] C */
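
As a rough illustration of the `fourcc_code` packing above, the sketch below rebuilds the macro with `uint32_t` standing in for the kernel's `__u32` (an assumption made so it compiles outside the kernel tree) and adds a hypothetical decoder, `fourcc_to_string`. It also shows why the `__DRI_IMAGE_FOURCC_*` constants in `dri_interface.h` line up with the DRM codes: `'P','0','1','0'` packs to `0x30313050`, the value given for `__DRI_IMAGE_FOURCC_P010`.

```c
#include <stdint.h>

/* Same packing as the header: first character in the lowest byte. */
#define fourcc_code(a, b, c, d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
                                 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical helper: writes the four characters of `code`, lowest byte
 * first, followed by a NUL terminator, into `out`. */
static void fourcc_to_string(uint32_t code, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((code >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

Decoding `0x56555958` (`__DRI_IMAGE_FOURCC_XYUV8888`) this way yields the string `XYUV`, matching `DRM_FORMAT_XYUV8888`.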

				@@ -112,6 +151,21 @@ extern "C" {

				#define DRM_FORMAT_VYUY		fourcc_code('V', 'Y', 'U', 'Y') /* [31:0] Y1:Cb0:Y0:Cr0 8:8:8:8 little endian */

				#define DRM_FORMAT_AYUV		fourcc_code('A', 'Y', 'U', 'V') /* [31:0] A:Y:Cb:Cr 8:8:8:8 little endian */

				#define DRM_FORMAT_XYUV8888		fourcc_code('X', 'Y', 'U', 'V') /* [31:0] X:Y:Cb:Cr 8:8:8:8 little endian */

				/*

				 * packed YCbCr420 2x2 tiled formats

				 * first 64 bits will contain Y,Cb,Cr components for a 2x2 tile

				 */

				/* [63:0]   A3:A2:Y3:0:Cr0:0:Y2:0:A1:A0:Y1:0:Cb0:0:Y0:0  1:1:8:2:8:2:8:2:1:1:8:2:8:2:8:2 little endian */

				#define DRM_FORMAT_Y0L0		fourcc_code('Y', '0', 'L', '0')

				/* [63:0]   X3:X2:Y3:0:Cr0:0:Y2:0:X1:X0:Y1:0:Cb0:0:Y0:0  1:1:8:2:8:2:8:2:1:1:8:2:8:2:8:2 little endian */

				#define DRM_FORMAT_X0L0		fourcc_code('X', '0', 'L', '0')

				/* [63:0]   A3:A2:Y3:Cr0:Y2:A1:A0:Y1:Cb0:Y0  1:1:10:10:10:1:1:10:10:10 little endian */

				#define DRM_FORMAT_Y0L2		fourcc_code('Y', '0', 'L', '2')

				/* [63:0]   X3:X2:Y3:Cr0:Y2:X1:X0:Y1:Cb0:Y0  1:1:10:10:10:1:1:10:10:10 little endian */

				#define DRM_FORMAT_X0L2		fourcc_code('X', '0', 'L', '2')

				/*

				 * 2 plane RGB + A

				@@ -141,6 +195,27 @@ extern "C" {

				#define DRM_FORMAT_NV24		fourcc_code('N', 'V', '2', '4') /* non-subsampled Cr:Cb plane */

				#define DRM_FORMAT_NV42		fourcc_code('N', 'V', '4', '2') /* non-subsampled Cb:Cr plane */

				/*

				 * 2 plane YCbCr MSB aligned

				 * index 0 = Y plane, [15:0] Y:x [10:6] little endian

				 * index 1 = Cr:Cb plane, [31:0] Cr:x:Cb:x [10:6:10:6] little endian

				 */

				#define DRM_FORMAT_P010		fourcc_code('P', '0', '1', '0') /* 2x2 subsampled Cr:Cb plane 10 bits per channel */

				/*

				 * 2 plane YCbCr MSB aligned

				 * index 0 = Y plane, [15:0] Y:x [12:4] little endian

				 * index 1 = Cr:Cb plane, [31:0] Cr:x:Cb:x [12:4:12:4] little endian

				 */

				#define DRM_FORMAT_P012		fourcc_code('P', '0', '1', '2') /* 2x2 subsampled Cr:Cb plane 12 bits per channel */

				/*

				 * 2 plane YCbCr MSB aligned

				 * index 0 = Y plane, [15:0] Y little endian

				 * index 1 = Cr:Cb plane, [31:0] Cr:Cb [16:16] little endian

				 */

				#define DRM_FORMAT_P016		fourcc_code('P', '0', '1', '6') /* 2x2 subsampled Cr:Cb plane 16 bits per channel */
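
The "MSB aligned" wording in the P010/P012/P016 comments above means each sample occupies the top bits of a 16-bit word, with the remaining low bits as padding. A minimal sketch of that layout follows; `p01x_pack` and `p01x_unpack` are hypothetical names, and zero-filling the padding bits is an assumption, since the comments only mark them as `x`.

```c
#include <stdint.h>

/* Store a sample with `bits` significant bits (10, 12 or 16) in the
 * most significant bits of a 16-bit word; padding bits are set to 0. */
static uint16_t p01x_pack(uint16_t sample, unsigned bits)
{
    unsigned shift = 16u - bits;
    uint16_t mask = (uint16_t)(0xffffu >> shift);
    return (uint16_t)((sample & mask) << shift);
}

/* Recover the sample by dropping the low padding bits. */
static uint16_t p01x_unpack(uint16_t word, unsigned bits)
{
    return (uint16_t)(word >> (16u - bits));
}
```

For P010 the shift is 6, so the largest 10-bit value `0x3ff` is stored as `0xffc0`; for P016 the shift is 0 and the word is the sample itself.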

				/*

				 * 3 plane YCbCr

				 * index 0: Y plane, [7:0] Y

				@@ -183,6 +258,9 @@ extern "C" {

				#define DRM_FORMAT_MOD_VENDOR_QCOM    0x05

				#define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06

				#define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07

				#define DRM_FORMAT_MOD_VENDOR_ARM     0x08

				#define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09

				/* add more to the end as needed */

				#define DRM_FORMAT_RESERVED	      ((1ULL << 56) - 1)

				@@ -298,6 +376,15 @@ extern "C" {

				 */

				#define DRM_FORMAT_MOD_SAMSUNG_64_32_TILE	fourcc_mod_code(SAMSUNG, 1)

				/*

				 * Tiled, 16 (pixels) x 16 (lines) - sized macroblocks

				 *

				 * This is a simple tiled layout using tiles of 16x16 pixels in a row-major

				 * layout. For YCbCr formats Cb/Cr components are taken in such a way that

				 * they correspond to their 16x16 luma block.

				 */

				#define DRM_FORMAT_MOD_SAMSUNG_16_16_TILE	fourcc_mod_code(SAMSUNG, 2)

				/*

				 * Qualcomm Compressed Format

				 *

				@@ -309,7 +396,7 @@ extern "C" {

				 * Pixel data height is aligned with macrotile height.

				 * Entire pixel data buffer is aligned with 4k(bytes).

				 */

				#define DRM_FORMAT_MOD_QCOM_COMPRESSED  fourcc_mod_code(QCOM, 1)

				#define DRM_FORMAT_MOD_QCOM_COMPRESSED	fourcc_mod_code(QCOM, 1)

				/* Vivante framebuffer modifiers */

				@@ -498,6 +585,128 @@ extern "C" {

				 */

				#define DRM_FORMAT_MOD_BROADCOM_UIF fourcc_mod_code(BROADCOM, 6)

				/*

				 * Arm Framebuffer Compression (AFBC) modifiers

				 *

				 * AFBC is a proprietary lossless image compression protocol and format.

				 * It provides fine-grained random access and minimizes the amount of data

				 * transferred between IP blocks.

				 *

				 * AFBC has several features which may be supported and/or used, which are

				 * represented using bits in the modifier. Not all combinations are valid,

				 * and different devices or use-cases may support different combinations.

				 *

				 * Further information on the use of AFBC modifiers can be found in

				 * Documentation/gpu/afbc.rst

				 */

				#define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode)	fourcc_mod_code(ARM, __afbc_mode)

				/*

				 * AFBC superblock size

				 *

				 * Indicates the superblock size(s) used for the AFBC buffer. The buffer

				 * size (in pixels) must be aligned to a multiple of the superblock size.

				 * Four lowest significant bits(LSBs) are reserved for block size.

				 *

				 * Where one superblock size is specified, it applies to all planes of the

				 * buffer (e.g. 16x16, 32x8). When multiple superblock sizes are specified,

				 * the first applies to the Luma plane and the second applies to the Chroma

				 * plane(s). e.g. (32x8_64x4 means 32x8 Luma, with 64x4 Chroma).

				 * Multiple superblock sizes are only valid for multi-plane YCbCr formats.

				 */

				#define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK      0xf

				#define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16     (1ULL)

				#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8      (2ULL)

				#define AFBC_FORMAT_MOD_BLOCK_SIZE_64x4      (3ULL)

				#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8_64x4 (4ULL)

				/*

				 * AFBC lossless colorspace transform

				 *

				 * Indicates that the buffer makes use of the AFBC lossless colorspace

				 * transform.

				 */

				#define AFBC_FORMAT_MOD_YTR     (1ULL <<  4)

				/*

				 * AFBC block-split

				 *

				 * Indicates that the payload of each superblock is split. The second

				 * half of the payload is positioned at a predefined offset from the start

				 * of the superblock payload.

				 */

				#define AFBC_FORMAT_MOD_SPLIT   (1ULL <<  5)

				/*

				 * AFBC sparse layout

				 *

				 * This flag indicates that the payload of each superblock must be stored at a

				 * predefined position relative to the other superblocks in the same AFBC

				 * buffer. This order is the same order used by the header buffer. In this mode

				 * each superblock is given the same amount of space as an uncompressed

				 * superblock of the particular format would require, rounding up to the next

				 * multiple of 128 bytes in size.

				 */

				#define AFBC_FORMAT_MOD_SPARSE  (1ULL <<  6)

				/*

				 * AFBC copy-block restrict

				 *

				 * Buffers with this flag must obey the copy-block restriction. The restriction

				 * is such that there are no copy-blocks referring across the border of 8x8

				 * blocks. For the subsampled data the 8x8 limitation is also subsampled.

				 */

				#define AFBC_FORMAT_MOD_CBR     (1ULL <<  7)

				/*

				 * AFBC tiled layout

				 *

				 * The tiled layout groups superblocks in 8x8 or 4x4 tiles, where all

				 * superblocks inside a tile are stored together in memory. 8x8 tiles are used

				 * for pixel formats up to and including 32 bpp while 4x4 tiles are used for

				 * larger bpp formats. The order between the tiles is scan line.

				 * When the tiled layout is used, the buffer size (in pixels) must be aligned

				 * to the tile size.

				 */

				#define AFBC_FORMAT_MOD_TILED   (1ULL <<  8)

				/*

				 * AFBC solid color blocks

				 *

				 * Indicates that the buffer makes use of solid-color blocks, whereby bandwidth

				 * can be reduced if a whole superblock is a single color.

				 */

				#define AFBC_FORMAT_MOD_SC      (1ULL <<  9)

				/*

				 * AFBC double-buffer

				 *

				 * Indicates that the buffer is allocated in a layout safe for front-buffer

				 * rendering.

				 */

				#define AFBC_FORMAT_MOD_DB      (1ULL << 10)

				/*

				 * AFBC buffer content hints

				 *

				 * Indicates that the buffer includes per-superblock content hints.

				 */

				#define AFBC_FORMAT_MOD_BCH     (1ULL << 11)

				/*

				 * Allwinner tiled modifier

				 *

				 * This tiling mode is implemented by the VPU found on all Allwinner platforms,

				 * codenamed sunxi. It is associated with a YUV format that uses either 2 or 3

				 * planes.

				 *

				 * With this tiling, the luminance samples are disposed in tiles representing

				 * 32x32 pixels and the chrominance samples in tiles representing 32x64 pixels.

				 * The pixel order in each tile is linear and the tiles are disposed linearly,

				 * both in row-major order.

				 */

				#define DRM_FORMAT_MOD_ALLWINNER_TILED fourcc_mod_code(ALLWINNER, 1)

				#if defined(__cplusplus)

				}

				#endif
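The AFBC comments above describe a modifier whose low four bits select the superblock size and whose higher bits are feature flags, all wrapped in the ARM vendor code. A minimal sketch of how such a modifier could be composed and taken apart follows. It redefines the relevant macros locally so that it compiles on its own; the `fourcc_mod_code` definition is assumed to match the one in the header (vendor in the top 8 bits, value in the low 56), and the helper names `afbc_example_modifier`, `modifier_vendor`, `afbc_block_size` and `afbc_has` are hypothetical, not part of the uapi.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the header's definitions (assumed to match drm_fourcc.h). */
#define DRM_FORMAT_MOD_VENDOR_ARM 0x08
#define fourcc_mod_code(vendor, val) \
	((((uint64_t)DRM_FORMAT_MOD_VENDOR_##vendor) << 56) | \
	 ((val) & 0x00ffffffffffffffULL))
#define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode) fourcc_mod_code(ARM, __afbc_mode)

#define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK  0xf
#define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 (1ULL)
#define AFBC_FORMAT_MOD_YTR              (1ULL << 4)
#define AFBC_FORMAT_MOD_SPLIT            (1ULL << 5)
#define AFBC_FORMAT_MOD_SPARSE           (1ULL << 6)

/* Hypothetical helper: 16x16 superblocks, colorspace transform, sparse layout. */
static inline uint64_t afbc_example_modifier(void)
{
	uint64_t mode = AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 |
			AFBC_FORMAT_MOD_YTR |
			AFBC_FORMAT_MOD_SPARSE;

	/* The mode must fit in the 56 bits below the vendor byte. */
	assert((mode >> 56) == 0);
	return DRM_FORMAT_MOD_ARM_AFBC(mode);
}

/* Hypothetical helper: the vendor code lives in the top 8 bits. */
static inline unsigned modifier_vendor(uint64_t modifier)
{
	return (unsigned)(modifier >> 56);
}

/* Hypothetical helper: the superblock size lives in the four lowest bits. */
static inline unsigned afbc_block_size(uint64_t modifier)
{
	return (unsigned)(modifier & AFBC_FORMAT_MOD_BLOCK_SIZE_MASK);
}

/* Hypothetical helper: test a single AFBC feature flag. */
static inline int afbc_has(uint64_t modifier, uint64_t flag)
{
	return (modifier & flag) != 0;
}
```

Whether a given combination of flags is actually valid depends on the device, as the header comment notes; the sketch only shows the bit layout.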

36 include/drm-uapi/drm_mode.h

				@@ -93,6 +93,15 @@ extern "C" {

				#define DRM_MODE_PICTURE_ASPECT_NONE		0

				#define DRM_MODE_PICTURE_ASPECT_4_3		1

				#define DRM_MODE_PICTURE_ASPECT_16_9		2

				#define DRM_MODE_PICTURE_ASPECT_64_27		3

				#define DRM_MODE_PICTURE_ASPECT_256_135		4

				/* Content type options */

				#define DRM_MODE_CONTENT_TYPE_NO_DATA		0

				#define DRM_MODE_CONTENT_TYPE_GRAPHICS		1

				#define DRM_MODE_CONTENT_TYPE_PHOTO		2

				#define DRM_MODE_CONTENT_TYPE_CINEMA		3

				#define DRM_MODE_CONTENT_TYPE_GAME		4

				/* Aspect ratio flag bitmask (4 bits 22:19) */

				#define DRM_MODE_FLAG_PIC_AR_MASK		(0x0F<<19)

				@@ -102,6 +111,10 @@ extern "C" {

							(DRM_MODE_PICTURE_ASPECT_4_3<<19)

				#define  DRM_MODE_FLAG_PIC_AR_16_9 \

							(DRM_MODE_PICTURE_ASPECT_16_9<<19)

				#define  DRM_MODE_FLAG_PIC_AR_64_27 \

							(DRM_MODE_PICTURE_ASPECT_64_27<<19)

				#define  DRM_MODE_FLAG_PIC_AR_256_135 \

							(DRM_MODE_PICTURE_ASPECT_256_135<<19)

				#define  DRM_MODE_FLAG_ALL	(DRM_MODE_FLAG_PHSYNC |		\

								 DRM_MODE_FLAG_NHSYNC |		\

				@@ -173,8 +186,9 @@ extern "C" {

				/*

				 * DRM_MODE_REFLECT_<axis>

				 *

				 * Signals that the contents of a drm plane is reflected in the <axis> axis,

				 * Signals that the contents of a drm plane is reflected along the <axis> axis,

				 * in the same way as mirroring.

				 * See kerneldoc chapter "Plane Composition Properties" for more details.

				 *

				 * This define is provided as a convenience, looking up the property id

				 * using the name->prop id lookup is the preferred method.

				@@ -338,6 +352,7 @@ enum drm_mode_subconnector {

				#define DRM_MODE_CONNECTOR_VIRTUAL      15

				#define DRM_MODE_CONNECTOR_DSI		16

				#define DRM_MODE_CONNECTOR_DPI		17

				#define DRM_MODE_CONNECTOR_WRITEBACK	18

				struct drm_mode_get_connector {

				@@ -873,6 +888,25 @@ struct drm_mode_revoke_lease {

					__u32 lessee_id;

				};

				/**

				 * struct drm_mode_rect - Two dimensional rectangle.

				 * @x1: Horizontal starting coordinate (inclusive).

				 * @y1: Vertical starting coordinate (inclusive).

				 * @x2: Horizontal ending coordinate (exclusive).

				 * @y2: Vertical ending coordinate (exclusive).

				 *

				 * With drm subsystem using struct drm_rect to manage rectangular area this

				 * export it to user-space.

				 *

				 * Currently used by drm_mode_atomic blob property FB_DAMAGE_CLIPS.

				 */

				struct drm_mode_rect {

					__s32 x1;

					__s32 y1;

					__s32 x2;

					__s32 y2;

				};

				#if defined(__cplusplus)

				}

				#endif
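Two of the additions above lend themselves to a short illustration: the picture aspect ratio is stored in bits 22:19 of the mode flags, and `struct drm_mode_rect` uses inclusive start and exclusive end coordinates. The sketch below assumes nothing beyond the values shown in the diff. It redefines the needed macros locally and mirrors the struct with `<stdint.h>` types so that it compiles without kernel headers; `struct mode_rect` and the helper functions are hypothetical names, not part of the uapi.

```c
#include <stdint.h>

/* Local copies of the values shown in the diff. */
#define DRM_MODE_PICTURE_ASPECT_NONE  0
#define DRM_MODE_PICTURE_ASPECT_64_27 3
#define DRM_MODE_FLAG_PIC_AR_MASK     (0x0F << 19)

/* Hypothetical mirror of struct drm_mode_rect using stdint types. */
struct mode_rect {
	int32_t x1; /* inclusive */
	int32_t y1; /* inclusive */
	int32_t x2; /* exclusive */
	int32_t y2; /* exclusive */
};

/* Extract the DRM_MODE_PICTURE_ASPECT_* value from bits 22:19 of the flags. */
static inline uint32_t mode_flags_aspect(uint32_t flags)
{
	return (flags & DRM_MODE_FLAG_PIC_AR_MASK) >> 19;
}

/* Replace the aspect ratio field and leave all other flag bits untouched. */
static inline uint32_t mode_flags_set_aspect(uint32_t flags, uint32_t aspect)
{
	return (flags & ~(uint32_t)DRM_MODE_FLAG_PIC_AR_MASK) |
	       ((aspect << 19) & DRM_MODE_FLAG_PIC_AR_MASK);
}

/* With an exclusive end coordinate, the extent is a plain difference. */
static inline int32_t rect_width(const struct mode_rect *r)
{
	return r->x2 - r->x1;
}

static inline int32_t rect_height(const struct mode_rect *r)
{
	return r->y2 - r->y1;
}

/* A point is inside if it is >= the start and strictly < the end. */
static inline int rect_contains(const struct mode_rect *r, int32_t x, int32_t y)
{
	return x >= r->x1 && x < r->x2 && y >= r->y1 && y < r->y2;
}
```

The half-open convention means a damage clip covering a 16x8 area at the origin is written as `{0, 0, 16, 8}`, and adjacent clips can share an edge coordinate without overlapping.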

Compare commits

4012 Commits mesa-19.0. ... 19.1

52 .gitignore vendored
264 .gitlab-ci.yml Normal file
181 .gitlab-ci/debian-install.sh Normal file
3 .mailmap
862 .travis.yml
3 Android.common.mk
15 Android.mk
92 Makefile.am
19 README.rst
8 REVIEWERS
2 SConstruct
2 VERSION
14 autogen.sh
44 bin/.cherry-ignore Normal file
9 bin/.gitignore vendored
6 bin/get-pick-list.sh
15 bin/install_megadrivers.py
63 bin/meson-options.py Executable file
12 common.py
3378 configure.ac
2 docs/application-issues.html
270 docs/autoconf.html
6 docs/bugs.html
1 docs/codingstyle.html
29 docs/contents.html
8 docs/debugging.html
4 docs/devinfo.html
2 docs/dispatch.html
20 docs/download.html
41 docs/egl.html
3 docs/envvars.html
28 docs/faq.html
88 docs/features.txt
2 docs/helpwanted.html
53 docs/index.html
33 docs/install.html
10 docs/llvmpipe.html
37 docs/mangling.html
45 docs/mesa.css
225 docs/meson.html
4 docs/opengles.html
11 docs/osmesa.html
2 docs/precompiled.html
63 docs/release-calendar.html
122 docs/releasing.html
8 docs/relnotes.html
6 docs/relnotes/10.2.html
2 docs/relnotes/10.3.html
2 docs/relnotes/11.0.0.html
4 docs/relnotes/11.1.0.html
5 docs/relnotes/17.3.5.html
2 docs/relnotes/18.1.1.html
2 docs/relnotes/18.1.2.html
4 docs/relnotes/18.3.0.html
208 docs/relnotes/18.3.3.html Normal file
180 docs/relnotes/18.3.4.html Normal file
271 docs/relnotes/18.3.5.html Normal file
169 docs/relnotes/18.3.6.html Normal file
2403 docs/relnotes/19.0.0.html
159 docs/relnotes/19.0.1.html Normal file
122 docs/relnotes/19.0.2.html Normal file
148 docs/relnotes/19.0.3.html Normal file
4610 docs/relnotes/19.1.0.html Normal file
154 docs/relnotes/19.1.1.html Normal file
194 docs/relnotes/19.1.2.html Normal file
191 docs/relnotes/19.1.3.html Normal file
227 docs/relnotes/19.1.4.html Normal file
119 docs/relnotes/19.1.5.html Normal file
132 docs/relnotes/19.1.6.html Normal file
157 docs/relnotes/19.1.7.html Normal file
267 docs/relnotes/19.1.8.html Normal file
3 docs/shading.html
2 docs/sourcetree.html
49 docs/submittingpatches.html
6 docs/versions.html
16 docs/vmware-guest.html
1772 include/CL/cl.h
11745 include/CL/cl.hpp


9690 include/CL/cl2.hpp Normal file